immersive-web / model-element

Repository for the <model> tag. Feature leads: Marcos Cáceres and Laszlo Gombos
https://immersive-web.github.io/model-element/

Accessibility of model #50

Open marcoscaceres opened 2 years ago

marcoscaceres commented 2 years ago

We need to figure out how to make <model> accessible on a number of different fronts:

Usually, this would be provided by the embedded format... however, it appears that both glTF and USDZ are quite limited when it comes to accessibility.

As such, it may be that we need to leverage what we can from HTML + ARIA to overcome the shortcomings of these formats. We have quite a bit of precedent (e.g., from the humble, yet limited, alt attribute, to how <canvas> can be made accessible, to the potential inclusion of <track> elements, and so on).
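For the simplest of those precedents, the translation would be roughly this (a sketch only; nothing here is specified for <model> yet, the file name is made up, and aria-label would be the nearest stopgap today):

```html
<!-- Hypothetical: the img "alt" precedent applied directly to <model>.
     Nothing here is specified; it just shows the shape of the idea. -->
<model src="teapot.usdz" alt="A ceramic teapot with a blue floral pattern"></model>
```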

klausw commented 2 years ago

Side note, the list above is of course not exhaustive. There are additional accessibility considerations apart from adding descriptions, for example support for alternate input methods such as keyboard navigation or voice-activated controls. Also, some accessibility tools on touchscreen devices change how interactions work, for example requiring two- or three-finger gestures for rotation or zooming.

marcoscaceres commented 2 years ago

Oh yes, totally. No doubt there is a ton of stuff missing. We need to go over the papers from the Inclusive XR workshop and bring in the IDIW folks.

cabanier commented 2 years ago

> Usually, this would be provided by the embedded format... however, it appears that both glTF and USDZ are quite limited when it comes to accessibility.

That is very unfortunate :-\ Can we file issues on both formats so this can be added?

> As such, it may be that we need to leverage what we can from HTML + ARIA to overcome the shortcomings of these formats. We have quite a bit of precedent (e.g., from the humble, yet limited, alt attribute, to how <canvas> can be made accessible, to the potential inclusion of <track> elements, and so on).

I agree that that's the right approach. There needs to be a way to translate a model into an accessibility DOM so it can be exposed to screen readers.
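One rough way to picture that translation (purely a sketch; none of this is specified): treat the element's light-DOM children as the accessible structure, much like <canvas> fallback content, with one entry per interesting node in the scene graph:

```html
<!-- Sketch: author-provided descriptive DOM that screen readers could walk.
     The part names, structure, and file name are illustrative, not a proposed API. -->
<model src="engine.glb" aria-label="Cut-away model of a four-cylinder engine">
  <ul>
    <li>Cylinder block with four pistons</li>
    <li>Crankshaft and connecting rods</li>
    <li>Camshaft and timing belt</li>
  </ul>
</model>
```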

donmccurdy commented 2 years ago

The nearest feature I'm aware of in glTF is support for XMP packets in JSON-LD format, including Dublin Core, attached to individual nodes within the model's scene graph. These may include titles, descriptions, and other semantic metadata about that particular part of the model, comparable to "alt" text.

It is not obvious to me that we want every model format inventing its own accessibility standards. I wonder if this could be done at a higher level, through a W3C specification, an XMP namespace, or something else. This would probably have value even outside of the <model/> proposal, as I'm not aware of any standard 3D file formats or authoring tools working on this today. There has been some (very early) work to add namespaces to XMP relevant to 3D models, for things like alignment to vertical or horizontal planes in AR.
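If I'm reading the KHR_xmp_json_ld extension right, a node-level packet ends up looking roughly like this (heavily trimmed, with plain Dublin Core literals standing in for richer metadata):

```json
{
  "extensionsUsed": ["KHR_xmp_json_ld"],
  "extensions": {
    "KHR_xmp_json_ld": {
      "packets": [
        {
          "@context": { "dc": "http://purl.org/dc/elements/1.1/" },
          "dc:title": "Front left wheel",
          "dc:description": "Alloy wheel with a low-profile tyre, attached to the front axle"
        }
      ]
    }
  },
  "nodes": [
    {
      "name": "wheel_front_left",
      "extensions": { "KHR_xmp_json_ld": { "packet": 0 } }
    }
  ]
}
```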

joelamyman commented 1 week ago

Hi all, this is my first time contributing to any W3C work, so I hope that my thoughts below are helpful/appropriate. Please feel free to let me know/guide me if not!

In terms of the needs for describing 3D objects, I know that there is a wide range of different contexts of use for a model element. In general, the user need would be that:

With this in mind, it makes sense to have a general description, followed by a way of getting more specific information about the object at different points designated by the author. A single description might suffice in some cases, but I can imagine that in the majority of uses there will need to be multiple descriptions that accurately describe individual sections of a model.

Scott Vinkle has previously suggested a set of alt attributes for a simple model viewer, which I think is nice and straightforward, and easy to understand from an author’s perspective.

However, for more complex models, there may be multiple areas in which an author wants to provide an accessible description. In this case, maybe using an approach similar to the map and area elements might work well? I fully appreciate that it’s more intensive and would require more from an author, but being able to specify a range, or location, in which a text alternative should be provided, might make a model element more flexible, and accessible for different types of use.

An example of this, where such a technique might work well, would be when providing a 3D model of a building with different zones or rooms marked out. An author could specify the different regions which, when focused or brought into view, would be announced by a screen reader.
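Purely as a strawman (the element and attribute names below are invented to show the shape of the idea, not proposed syntax), a map/area-style approach might look something like:

```html
<!-- Strawman: "region" children with author-defined bounds and text alternatives.
     None of these elements or attributes exist; they are illustrative only. -->
<model src="museum.glb" alt="3D model of the museum's ground floor">
  <region name="entrance-hall"
          bounds="0 0 0 4 3 6"
          description="Entrance hall with the ticket desk and cloakroom"></region>
  <region name="sculpture-gallery"
          bounds="4 0 0 12 3 6"
          description="Sculpture gallery containing the marble collection"></region>
</model>
```

The description values could then be surfaced in much the same way that area alt text is today.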

Given that the model can be animated, this might work well, and these descriptions could be used as announcements when the relevant part of the model appears in front of the virtual camera. This would help to keep the descriptions of parts of the model consistent, and help people to better understand the composition of the model and how different sections are related and positioned.

Alongside this, for providing a stopgap for the semantics of this component, a role of application makes sense (as suggested by Scott). However, moving forwards, a specific role would be great, and would allow people to have confidence in how to interact with the element. If standardised controls for interacting with/manipulating the element were made available, this could also make it much easier to interact with the element using voice control software.

In terms of considerations for people with moving disabilities, alongside making sure that it works with a keyboard, there would need to be controls that allow people to interact with the model without pinching, swiping, or dragging gestures. People might be using alternative input methods, and/or not be able to use gestures such as those I described. Having a way to interact with the element using a single pointer, without needing to drag or hold anything, will help more people use the element.

The user need for this would be:

In the rough demonstration that I implemented a while back, I created a way for the model to automatically rotate/animate, so that people could access all of the model’s content without needing to move it themselves. Alongside this, there are controls to navigate to specific regions, which is one of the improvements that I think benefits multiple people, as it allows you to view the important pieces of the model, and also have them described to you, without needing to drag or hold down/press an arrow key multiple times.
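Something in that spirit, sketched as plain markup (the button names are examples, and the wiring to the model is left out since the element's scripting API isn't settled yet):

```html
<!-- Sketch: single-activation controls that avoid drag/pinch gestures entirely,
     and that work with keyboard, switch access, and voice control ("Click Rotate right"). -->
<div role="group" aria-label="Model controls">
  <button type="button">Rotate left</button>
  <button type="button">Rotate right</button>
  <button type="button">Zoom in</button>
  <button type="button">Zoom out</button>
  <button type="button">Next region</button>
  <button type="button" aria-pressed="false">Auto-rotate</button>
</div>
```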

Another consideration is one that mainly applies to people with thinking disabilities. I’ve seen implementations where a model viewer plays an animation every few seconds to convey that it is interactive. The examples I’ve seen have been a dragging animation in which an icon of a finger appears over the model, or the first few seconds of the 3D object’s animation playing. If a similar approach is taken, then there must be a way to prevent this, so as not to distract people when the model is displayed alongside other content. This would be a similar approach to not auto-playing audio or videos.

The user need for this would be:

> As someone with a thinking disability, I want to be able to prevent animated content from playing so that I can focus on other content on the page/screen.

Again, I hope that this type of information is useful, and apologies if not! If there is any more detail that would be useful, or if this helps to create any more questions, please let me know. I’d really like to help contribute to this work if useful.

klausw commented 1 week ago

> As someone with a thinking disability, I want to be able to prevent animated content from playing so that I can focus on other content on the page/screen.

Anecdotally, I think many users would appreciate an option to turn off animations or to limit them to only animating once instead of continuously. It's one of my pet peeves in modern chat apps. It's worse than the <blink> tag was...
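FWIW, for scripted viewers the existing prefers-reduced-motion media feature already gives authors a hook for this; a minimal sketch (the .hint-animation class name is made up):

```html
<!-- Sketch: hide the "this is interactive" hint animation for users who have asked
     the OS/browser for reduced motion. The class name is illustrative only. -->
<style>
  @media (prefers-reduced-motion: reduce) {
    .hint-animation {
      animation: none;
      display: none;
    }
  }
</style>
```

A native <model> could presumably honour the same setting by default.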