w3c / aria

Accessible Rich Internet Applications (WAI-ARIA)
https://w3c.github.io/aria/

Requesting new feature for label synonyms #1038

Open cookiecrook opened 5 years ago

cookiecrook commented 5 years ago

Most accessibility labels focus on text-to-speech (TTS) usage: the name that should be spoken to a blind screen reader user. For example, a "Get Mail" label on an image button in a web mail app, or a full-context label "August 22" on a button whose visible text is "22" in a web calendar app.

In addition to the text-to-speech (TTS) usage, there's a need for speech-to-text (STT) label synonyms or alternatives. For example, a sighted user of voice recognition software may prefer or expect to say the less-verbose label, "22" instead of the longer "August 22." Likewise, for image buttons, the sighted STT user may guess a different button name (e.g. "Tap 'Check Mail'" vs "Tap 'Get Mail'"). These users usually have a way to display the specific label in cases of ambiguity, but we'd also like it to work without forcing users to memorize specific button names in every web app.

I'd expect the primary consumers of this new Web API would be speech recognition software or a full keyboard access "typeahead" behavior.

Here's the admittedly under-documented accessibilityUserInputLabels API in iOS 13 developer docs. It's an optional array that accepts one or more synonyms to the element's primary accessibility label.

I'd like the ARIA group to consider something similar as a new Web API. IMO, it does not need to be a declarative content attribute, as the relevant string splitter may be ambiguous or problematic for a content attribute.

It might be most appropriate as a non-reflected array property on the DOM element (perhaps Element.accessibilityLabelSynonyms), but I wanted to give the ARIA WG the first opportunity to suggest what they think is best. ARIA seems like the most-likely place for this to be specified, even if it does not end up being a content attribute.

Thanks.

cookiecrook commented 5 years ago

An example of the scenario I mentioned above. The screen shot shows the VoiceOver cursor on a button with a visually minimal label "23", which the sighted user reads as "Friday, August 23" based on its visual proximity to the "F" above and the "August" section heading. The default accessibility label (e.g. for VoiceOver) is "Friday, August 23" but the label synonym (e.g. for Voice Control) is "23." In this scenario, a user could speak the logical "Tap 23" instead of the more tedious voice path: "Show names", then "Tap Friday, August 23."

Screen shot of iOS calendar showing visually minimal buttons to select a different date (e.g. 'F 23' instead of 'Friday, August 23')

jnurthen commented 5 years ago

For things that have text children it would seem that the voice label would normally be just the text content. This is certainly what DNS (Dragon NaturallySpeaking) has done for years, where it allows a user to speak either the child text content OR the calculated accessible name (with some implementation bugs, to be sure, as they don't use the a11y APIs for this). Allowing extra labels over and above this seems potentially useful.

TPAC conversation?

cookiecrook commented 5 years ago

For things that have text children it would seem that the voice label would normally be just the text content.

It's not always apparent how to pronounce some text labels, especially those with punctuation, emoji, or glyphs. For example, an interface with ⬇ in the rendered text label may be spoken as "down" or "down arrow", not the specific Unicode string "downwards pointing arrow." Bonus points to any sighted reader who knows how the following Emoji symbol would be spoken. 🔂 It's not "repeat 1", which is what I'd call it.

This is certainly what DNS has done for years where it allows a user to speak either the child text content OR the calculated accessible name (with some implementation bugs to be sure as they don't use the a11y APIs for this).

Sure. And substrings too, but… I'm sure we can come up with another example that doesn't match either of these directly.

Allowing extra labels over and above this seems potentially useful.

"Potentially" makes you seem doubtful. Is that really the case?

TPAC conversation?

I'll be there.

cookiecrook commented 5 years ago

I'm sure we can come up with another example that doesn't match either of these directly.

In the above example, "Tap Friday" would only work in the accessibility label substring context. "Tap 23rd" or "Tap twenty-third" would not work with any voice control software I'm aware of.
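
To make the matching cases in this thread concrete, here is a minimal TypeScript sketch of how a voice control tool might test a spoken phrase against visible text, an accessible-name substring, and then synonyms. The VoiceTarget shape, its field names, and matchesSpokenLabel are hypothetical illustrations, not an existing DOM, ARIA, or assistive technology API. The sketch assumes the caller has already stripped the command verb (e.g. "Tap").

```typescript
// Hypothetical data shape for illustration only; not a real DOM or ARIA interface.
interface VoiceTarget {
  textContent: string;     // visible text, e.g. "23"
  accessibleName: string;  // computed accessible name, e.g. "Friday, August 23"
  labelSynonyms: string[]; // proposed extra labels, e.g. ["23rd", "twenty-third"]
}

// Case-insensitive comparison with surrounding whitespace removed.
const normalize = (s: string): string => s.trim().toLowerCase();

// Returns true if the spoken phrase could refer to the target.
function matchesSpokenLabel(target: VoiceTarget, spoken: string): boolean {
  const phrase = normalize(spoken);
  if (phrase.length === 0) return false;

  // 1. Exact match on the visible text ("23").
  if (normalize(target.textContent) === phrase) return true;

  // 2. Naive substring match on the accessible name ("Friday").
  if (normalize(target.accessibleName).indexOf(phrase) !== -1) return true;

  // 3. Exact match on any author-supplied synonym ("23rd", "twenty-third").
  return target.labelSynonyms.some((syn) => normalize(syn) === phrase);
}

const calendarButton: VoiceTarget = {
  textContent: "23",
  accessibleName: "Friday, August 23",
  labelSynonyms: ["23rd", "twenty-third"],
};

console.log(matchesSpokenLabel(calendarButton, "Friday"));       // true
console.log(matchesSpokenLabel(calendarButton, "twenty-third")); // true
console.log(matchesSpokenLabel(calendarButton, "Saturday"));     // false
```

The substring step is deliberately naive (it would also accept a fragment such as "2"); a real recognizer would likely match on word boundaries and rank competing targets.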

jnurthen commented 5 years ago

Wish GitHub had a "did you really mean to close this issue?" feature :)

jnurthen commented 5 years ago

It's not always apparent how to pronounce some text labels, especially those with punctuation, emoji, or glyphs. For example, an interface with ⬇ in the rendered text label may be spoken as "down" or "down arrow", not the specific Unicode string "downwards pointing arrow." Bonus points to any sighted reader who knows how the following Emoji symbol would be spoken. 🔂 It's not "repeat 1", which is what I'd call it.

In cases where emoji/glyphs are unclear, it is probably best to add a better label for TTS users too, in which case a simple aria-label on a button could alleviate the issue.

Sure. And substrings too, but… I'm sure we can come up with another example that doesn't match either of these directly.

I'm sure we could come up with something.

"Potentially" makes you seem doubtful. Is that really the case?

It seems like there is the possibility that this could be a useful feature. To make this happen many things have to come together including commitment to implement from AT which would use this feature.

Aside: In my wishlist for speech input, a "what can I say" feature, which would show what a user should say in order to press a graphical button, ranks higher than the ability for a developer to guess what a user would want to say.

I'll be there.

Awesome - see you in a few weeks.

cookiecrook commented 5 years ago

@jnurthen wrote:

To make this happen many things have to come together including commitment to implement from AT which would use this feature.

I understand the need to have at least 2 AT and 2 UA implementations before final Rec.

That said, see the intro. iOS 13 implements this already to support AT including Voice Control, so you have commitment in the form of native API. The proposed Web API has broader implications than voice synonyms (keyboard typeahead, for example), and other platforms could take advantage, too.

Some of them may already do so. Android has some similar voice features, so may have label synonyms as API or SPI.

ZoeBijl commented 5 years ago

Am a fan of this feature and look forward to your proposal!

cookiecrook commented 4 years ago

I think a comma-delimited string is probably sufficient. It could handle the single- and multi-word cases, including most punctuation unescaped.

aria-label="Mail" aria-labelsynonyms="Check Mail, Get Mail, Envelope, Another multi-word label with commas\, and yes\, this example is probably excessive."
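
As a rough illustration of the comma-delimited idea, the following TypeScript sketch splits such a value on unescaped commas and treats "\," as a literal comma. The function name parseLabelSynonyms is hypothetical, and aria-labelsynonyms together with this escaping rule are assumptions taken from the example above, not part of any published ARIA specification.

```typescript
// Hypothetical helper: splits a comma-delimited synonym list.
// Assumes "\," encodes a literal comma, as in the example in this thread.
function parseLabelSynonyms(value: string): string[] {
  const synonyms: string[] = [];
  let current = "";

  for (let i = 0; i < value.length; i++) {
    const ch = value.charAt(i);
    if (ch === "\\" && value.charAt(i + 1) === ",") {
      current += ","; // escaped comma stays inside the current synonym
      i++;            // skip the comma that was just consumed
    } else if (ch === ",") {
      synonyms.push(current.trim()); // unescaped comma ends the synonym
      current = "";
    } else {
      current += ch;
    }
  }
  synonyms.push(current.trim());

  // Drop empty entries produced by stray or trailing commas.
  return synonyms.filter((s) => s.length > 0);
}

const parsed = parseLabelSynonyms(
  "Check Mail, Get Mail, Envelope, Another multi-word label with commas\\, and yes\\, this example is probably excessive."
);
console.log(parsed.length); // 4
console.log(parsed[3]);
```

Note the doubled backslashes are only TypeScript string-literal escaping; the attribute value itself would contain a single backslash before each literal comma.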

jnurthen commented 3 years ago

@cookiecrook still in 1.3 scope?

DanielGoransson commented 2 years ago

Big fan of this feature request! Especially now with WCAG 2.1 and Success Criterion 2.5.3: Label in Name. Another example is when the interface only provides a visible icon, like a "+" for adding an item, an "X" for closing a modal, or a pen for edit or create new. In these cases a screen reader user would want something like "Add task", "Close dialog", and "Edit account" announced, while a voice control user may want to say "Press plus", "Press x", and "Press pen".

I also support the idea of a comma-separated list of synonyms, much like <meta name="keywords" content="a,b,c"/>.

cookiecrook commented 2 years ago

We had an education partner ask for this recently, so adding a draft PR #1794 to get the ball rolling again.