w3c / aria

Accessible Rich Internet Applications (WAI-ARIA)
https://w3c.github.io/aria/

Role parity: consider "native-" or "host-" role prefix for host language elements (including audio/video, but not limited to media) where custom ARIA implementations cannot (yet) match the native implementation's accessibility. #529

Closed cookiecrook closed 1 year ago

cookiecrook commented 7 years ago

Consider a "native-" or "host-" role prefix for elements like audio/video that would require complex API support out of scope for ARIA. Authors SHOULD NOT use these in content, but test tools, like the WebKit inspector, could return these values where no specific ARIA role matched.

<button> -> button (concrete)
<input type="range"> -> slider (concrete)
<video> -> native-video, host-video,

[Update: Another option from comment below.]

videoElement.computedRole; // "html:video"
mfracElement.computedRole; // "math:mfrac"

[Update March 2023: or html-video] (like abstract roles, authors should not use in role attr)

cookiecrook commented 7 years ago

At a later date, when it'd be possible to support a fully functioning video role, the user agents could expose it instead.

cookiecrook commented 6 years ago

@joanmarie have you considered role parity for the HTML-embedded host languages like MathML and SVG? In theory, there could be role parity on those as well, but I'm not sure why anyone would want to build a semantic copy of MathML or SVG and then have to render it themselves.

In theory, ARIA could have role parity with MathML and then WebKit+VoiceOver would still be able to expose Nemeth Braille to the user in some MathML-like ARIA structure, though this seems like a lot of work for very little gain. It'd be easier to just map those to reserved roles like native-mfrac or host-mfrac that should not be used explicitly by authors.

asurkov commented 6 years ago

I think I'm missing the whole idea of 'native-' roles. How will a 'native-' prefix help make video/audio controls accessible later, and why can't ARIA reserve a 'video'/'audio' role instead?

joanmarie commented 6 years ago

@cookiecrook re MathML parity, see w3c/aria#660. I'll leave it to @AmeliaBR to comment.

Having said that, the ARIA Working Group committed to (read: "promised other groups within the W3C") that we would achieve role parity with HTML for ARIA 1.2 and do so in a timely fashion. And as you may be aware, some folks within the W3C are paying much closer attention to achieving stated milestones. Which brings me to the following:

Working on MathML and SVG strikes me as desirable -- for, say, 1.3. :smile:

cookiecrook commented 5 years ago

@asurkov among other uses, this could be used to return a standardized computed role for the WebDriver proposal element.computedRole. Even in cases where role parity was not achieved, browser implementations could agree on the returned role value, for the sake of testing the web forward.

cookiecrook commented 5 years ago

Copying @alice in because she asked about this one. She also may have suggested an interesting name-spaced syntax I had not previously considered.

videoElement.computedRole; // "html:video"
mfracElement.computedRole; // "math:mfrac"

AmeliaBR commented 5 years ago

For SVG, the ARIA graphics roles cover basic role parity with native features. The only roles we are currently special-casing in the SVG-AAM are text semantics, equivalent to <p> and named/interactive spans.

In the more general case:

I'm inclined to agree with @asurkov: why not just make these regular roles? A video is a common compound widget that is used all over the web. It is an implementation detail that the end user shouldn't need to deal with whether it is created by a single HTML <video controls> element (and the resulting shadow tree), or by a <video> grouped with author-supplied controls, or by a <canvas> or <svg> animation.

css-meeting-bot commented 5 years ago

The ARIA Working Group just discussed audio/video role parity.

The full IRC log of that discussion:
<melanierichards> Topic: audio/video role parity
<melanierichards> jamesn: we'd like the rationale behind your strong opposition to audio/video roles
<melanierichards> jcraig: I believe I filed this under the premise of reserved roles, like native-*
<melanierichards> jamesn: we're wondering why we can't have an audio or video role
<melanierichards> jamesn: what's the difference b/t that and everything else in ARIA
<melanierichards> jcraig: each of those comes with API support. When you have a video element, it can be controlled and queried programmatically. Standardized JS APIs for these different interfaces, which can't be queried with a div that has video role on it. Until we get the point where we provide the author a way to say, here I've got these delegates that can respond to different requests for audio/video controller...play key on Mac will bind to play/pause...
<melanierichards> jcraig: ...in native element, but not in div with role on it
<melanierichards> jamesn: sure, but does a11y API have the ability to do that, or just the standard JS APIs? I don't understand how it's different from any other role in that you'd have to write it yourself as a JS developer
<melanierichards> jcraig: trying to solve similar problem with sliders. No way for touch screen to interact with the slider control. All of these other APIs like audio and video have significantly more complexities than supporting a slider.
<melanierichards> jamesn: essentially on iOS, if you have a video element, there are gestures that the a11y API can send to the control (play, pause, increase volume)
<melanierichards> jcraig: not limited to a11y API
<melanierichards> jcraig: however the user wants to control it, that's not something that ARIA can copy
<melanierichards> jamesn: if somebody does this today and creates a video out of something else, which they do, they don't get any of that functionality anyway, and they aren't able to say it's a video. I don't want to encourage it, but if we don't have a role, we can't expose details. If we did have one, we could
<melanierichards> jcraig: when we pass things through the native accessibility role, there are expectations that the user has. A user of a video would expect that the video would respond to their prefs for autoplay, for caption settings, etc
<melanierichards> jcraig: by calling it video, user isn't gaining anything, we're potentially confusing them more
<melanierichards> joanie: aren't we already in the danger of that happening if the author puts aria-roledescription="video" on this hypothetical custom video control?
<melanierichards> jcraig: sounds like a good enough workaround
<melanierichards> jcraig: preferable to an ARIA API that doesn't actually work
<melanierichards> joanie: I was meaning that would already impart confusion on the user, told something is a video but it's not
<melanierichards> joanie: I put in issue #517, a possible other way forward
<melanierichards> joanie: I want to complete role parity to best of our ability. If a screen reader knows that something says it's an audio or video player, they might want to adjust their behavior. A non-parseable roledescription is not going to work
<melanierichards> joanie: we can get partial parity via the group role. I don't think a player is like a div/span, but it is a group with UI components inside. Second part would be not an ARIA role...
<melanierichards> jcraig: reads in issue about aria-playsmedia
<melanierichards> https://github.com/w3c/aria/issues/517
<melanierichards> jcraig: I filed a similar issue a couple years ago. If that were the case it should go on the button instead of the video itself
<melanierichards> jcraig: one of the reasons for 1:1 role parity is to use WPT to ensure browser compat [paraphrased]
<melanierichards> jcraig: if we don't think video is something that an author should use as a role, but we do agree that the native video element should return the same thing from various implementations, return some like native-audio, native-video
<melanierichards> joanie: would not do anything for custom components, right?
<melanierichards> jcraig: for custom components, opening up video API to apply to custom components. If that happens, fine to open up video role
<melanierichards> mck: so you're saying you don't want more of the slider problem
<melanierichards> jcraig: yes
<melanierichards> jcraig: most web devs can make a very accessible video player without using a video role
<melanierichards> jcraig: I've worked on a web player...the native player in WebKit is all rendered in JS
<melanierichards> jamesn: but this applies to anything we're doing with role parity, you can use HTML to use the same experience
<melanierichards> jcraig: main thing that's different about these, these are not implementable
<melanierichards> jamesn: is there any advantage to doing anything with these?
<melanierichards> joanie: aside from letting SRs know what these are, no
<melanierichards> jcraig: with native-*, we can all agree how web platforms should work
<jcraig> s/should work/should be testable with WPT/
<melanierichards> mck: instead of using something like native-* role name to do that, would it be fine for us to give audio/video generic role, and then give it a specific attribute? I don't think namespace sits well with everybody. mediatype attribute, with values like audio or video. or hasmedia, with values audio or video.
<melanierichards> jcraig: people are opposed to defining a native role that would be in the spec. I would expect that the spec would define it when the host language has something that can't be implemented [paraphrased]
<melanierichards> jamesn: maybe in HTML-AAM, say no equivalent in ARIA
<melanierichards> jcraig: probably want to standardize the pattern in ARIA, certainly not the specific roles
<melanierichards> joanie: I don't think it's ARIA's place to define how other languages return roles for automated testing and WPT
<melanierichards> joanie: I think it's a fine approach but not our place to define
<jcraig> q+ to explore the namespaced idea some more
<melanierichards> joanie: the one thing I think we have consensus on is we're not going to do audio, video roles until the API opens up
<melanierichards> (joanie types into issue)
<jcraig> q+ videoElement.computedRole; // "html:video"
<jcraig> ack me
<Zakim> jcraig, you wanted to explore the namespaced idea some more
<melanierichards> mck: I like what Matt threw out, pseudo namespace. html:video, mathml:frac. I know there's been previous objections to namespaces in ARIA
<melanierichards> s/mck/jcraig
<melanierichards> jcraig: I'd be perfectly happy with that instead of native-* prefixing
<melanierichards> jcraig: not say role="html:video", just the response, the computed role
<melanierichards> mck: are you still thinking this falls within the user agent portion of the ARIA spec? If you were going to put something in the ARIA spec, it's not part of IDL, where would it go?
<melanierichards> joanie: it's not going to be in ARIA
<melanierichards> jcraig: talking about this as part of AOM
<melanierichards> jcraig: pretty much every implementer is in agreement that we shouldn't open up computed role to author, too hard to extract
<melanierichards> jcraig: pretty much every implementer agrees would be useful for test context (Web Driver, WPT)
<melanierichards> jcraig: we could have a centralized repository of JS-based tests that would allow ARIA and other spec devs to determine that things are computed identically between browsers
<melanierichards> jcraig: wouldn't be available to the web page to check
<melanierichards> jcraig: where it lands....Web Driver, maybe AOM, maybe ARIA. Hard to say. Role parity is a precursor to that ability, I believe. We have to have parity on what the roles should be before the right tests can compare browsers
<melanierichards> mck: for these elements, if ARIA says there will be no corresponding role, what AOM gets in response to computed role from the browser, what spec would specify what it gets back from the request?
<melanierichards> mck: for us that would end up being nothing to do with audio and video for ARIA 1.2
<melanierichards> mck: thinking about where it should go is important, but some people here don't want to put brainpower behind thinking where it goes
<melanierichards> joanie: if it's not going to be something that ARIA controls or maintains, I'm happy to participate if people want to brainstorm...I don't mean it dismissively, but it's not our domain
<melanierichards> joanie: should be standardized somewhere, I don't know where, but it's not ARIA
<melanierichards> jcraig: a decision is probably required to achieve what we have been calling role parity
<melanierichards> mck: decision could be no corresponding role
<melanierichards> joanie: or group
<melanierichards> mck: but you can't tell the difference from other groups
<melanierichards> jamesn: we don't want to tell people it's a group when it's a video. I'd prefer mapping to nothing
<melanierichards> joanie: I can live with that
<melanierichards> jamesn: testing can deal with what they return if there's no mapping
<joanie> q+
<melanierichards> jcraig: I want to avoid some web authors saying "we made our player accessible because we used the video role" and actually it's a pile of junk. Making this all accessible is non-trivial. And then there's all these APIs like play and pause that are reserved for video controls.
<joanie> ack me
<joanie> https://github.com/w3c/aria/issues/517
<joanie> I believe we're blocked until the native APIs for video and audio are something which authors and ATs can utilize so that custom players behave in exactly the fashion as native ones.
<melanierichards> joanie: added new comment
<melanierichards> joanie: does that capture your concerns?
<melanierichards> jcraig: yes
<jcraig> q?
<jcraig> ack vi
<jcraig> ack //
<jcraig> ack h
<jcraig> ack html
<melanierichards> mck: if an author is making their own player for another reason, the roledescription workaround...they can do that, it won't work like a native video, but it also won't be reported as a native video when you ask for the computed role. So that seems fine to me.
<harris> ack html:video
<melanierichards> mck: we avoid slider problem
<harris> shrug indeed
<jcraig> ack "
<melanierichards> mck: in some spec, probably not ARIA, say what gets returned for the computed role. If we ever put a custom video player in the authoring practices, which I don't think we'll bother to do (though it's in the backlog), we'd use roledescription
<melanierichards> (group agrees)
<melanierichards> jamesn: could close as won't fix, and if something changes we can re-open
<jcraig> I had forgotten that Alice had also suggested the namespaced approach. Evidence: https://github.com/w3c/aria/issues/529#issuecomment-437093886
<jcraig> Github: https://github.com/w3c/aria/issues/529#issuecomment-437093886
<MichaelC_travel> rrsagent, make minutes
<RRSAgent> I have made the request to generate https://www.w3.org/2019/05/02-aria-minutes.html MichaelC_travel
<jcraig> This comment from Nov 2018 is also relevant: "among other uses, this could be used to return a standardized computed role for the WebDriver proposal element.computedRole. Even in cases where role parity was not achieved, browser implementations could agree on the returned role value, for the sake of testing the web forward." https://github.com/w3c/aria/issues/529#issuecomment-437092747
<bgaraventa1979> <a id="test" href="#">$100<span class="dot" aria-label="." ></span>00</a>
<harris> <button><div role="img" aria-labelledby="foo"></div><div role="img" aria-labelledby="bar"></div></button>
<harris> <div id="foo">.</div>
<harris> <div id="bar">.</div>
<harris> https://www.irccloud.com/pastebin/WEFtVvVR/
<melanierichards> here's a fiddle for that https://jsfiddle.net/buLy7vc5/embedded/result
<harris> the above snippet would read "hello ARIA WG world"
<harris> in safari + VO (osx)
<melanierichards> Chromium: "hello ARIA WG world", Firefox on Win: "hello world"

joanmarie commented 4 years ago

I still think we don't want to do this. Unfortunately, we don't seem to have consensus on this yet. So pushing back to 1.3 for now.

cookiecrook commented 4 years ago

Of note, WebDriver is progressing with adding computedRole in Issue 1439 and Pull 1444, so if we're able to reach consensus on this, we should be able to start testing implementations against each other soon.
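As a rough illustration of what testing implementations against each other could look like, the sketch below compares computed role strings collected from several browsers and reports where they disagree. The input shape, the function name `findRoleDisagreements`, and the sample values are assumptions for illustration only (the values are loosely based on examples mentioned in this thread, not on a real test run). How the strings are collected is out of scope here.

```javascript
// Hypothetical input: computed role strings per browser, keyed by a test label.
// Values are illustrative, loosely based on examples mentioned in this thread.
const results = {
  chromium: { "<button>": "button", "<pre>": "Pre" },
  gecko: { "<button>": "button", "<pre>": "generic" },
};

// Returns one entry per test label whose computed role differs between browsers.
function findRoleDisagreements(resultsByBrowser) {
  const browsers = Object.keys(resultsByBrowser);
  const labels = new Set(
    browsers.flatMap((browser) => Object.keys(resultsByBrowser[browser]))
  );
  const disagreements = [];
  for (const label of labels) {
    const roles = {};
    for (const browser of browsers) {
      // A missing result stays undefined, so it also counts as a mismatch.
      roles[browser] = resultsByBrowser[browser][label];
    }
    if (new Set(Object.values(roles)).size > 1) {
      disagreements.push({ label, roles });
    }
  }
  return disagreements;
}

console.log(findRoleDisagreements(results));
```

A report like this only becomes meaningful once there is agreement on what the expected string is for each element, which is the gap this issue is about.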

cookiecrook commented 4 years ago

Clarifying: In either case, we should be able to test implementations for any defined ARIA role and any native element matching one of the parity roles defined in ARIA 1.2. This issue only prevents us from comparative testing of roles where WG consensus has yet to be reached, particularly in native elements like <video>

JAWS-test commented 4 years ago

I think the role prefix (native or host) is not a good idea. At least for video and audio there should be the corresponding ARIA roles video and audio (without prefix).

cookiecrook commented 4 years ago

IMO, it would be inappropriate to add audio and video roles to ARIA until such a time when the respective native media interfaces can also be mimicked with ARIA or another API. For example, the native play/pause or autoplay behaviors cannot be made accessible with ARIA. Likewise, auto-selection of caption tracks or audio description tracks does not work with ARIA-driven methods. As such, it may mislead web developers into thinking that ARIA video and audio could be made as accessible as native audio and video. We've had similar problems with ARIA sliders and scrollbars for a decade; I don't believe we should pile on until the first problem is solved.

The native-* prefix proposal is an attempt at a future-compatible pattern for testability and interoperability between implementations.

JAWS-test commented 4 years ago

Hi @cookiecrook

I don't really understand your concerns.

cookiecrook commented 4 years ago

@JAWS-test wrote:

Why should the operation of the HTML video element not be able to be reproduced by JS? On the PC I control the HTML video element e.g. with the space bar and the arrow keys.

As an example, the <video> element provides support for:

None of this can be "faked" by a web site using JavaScript, because there is no web API defined for such things.

@JAWS-test wrote:

In my opinion, ARIA is there to transmit role and status of elements for assistive technology. A video is a video, whether or not it is a HTML video element.

IMO, conveying a role to a user is both a technical and social contract. We are effectively telling the user, ~"this is the type of control, and therefore you know how to operate it." With most simple controls (like buttons or checkboxes), web author implementation is trivial to fulfill that social contract and meet the user's expectation.

However, no web author is able to fulfill this obligation with even slightly more complex controls, like slider. To this day, there is no way to make an ARIA slider accessible using touch screen controls. Conveying the role leaves the user confused as to why it's not working, and leaves the author confused as to how to attempt making it work. Once you get to very complex controls like tree grid, expectations are even more difficult to fulfill.

My opinion is that for very complex controls like video, we should not compound this problem by telling authors it's "accessible" to use the video role on a custom element that cannot support expectations of video accessibility. Those roles should be reserved until the time when implementations and web standards provide the ability for web authors to make them accessible.

cookiecrook commented 4 years ago

But I think we digress. This issue is not about whether a specific role should exist. It's a pattern proposal that would allow browser implementations to test against each other. Implementation testability leads to implementation consistency, and a path for role parity testability is what this proposal is really about.

JAWS-test commented 4 years ago

I don't think we digress: if ARIA had the role video, we wouldn't need a role native-video.

If I understand you correctly, some ARIA roles should not exist (slider, treegrid etc.) because they are not operable on touch devices. I also see this as problematic. However, I rather think that here the manufacturers of assistive technology have a duty to implement the support. I.e. when I navigate to a slider with my screen reader on a mobile device, VoiceOver, TalkBack etc. must offer me a swiping gesture, which is internally converted into the key event handler for the arrow keys, because the arrow keys are used to operate the slider. As long as this has not been done, the ARIA APG could offer patterns with support for touch devices (e.g. with the slider, by displaying two buttons to increase and decrease the value or an input field to enter the value).
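To make the two-button alternative concrete, here is a minimal sketch of the state logic such a pattern might use. It is not an APG pattern; the name `createSliderModel` and its option names are hypothetical, and wiring the buttons and the slider element to this model is left out.

```javascript
// Hypothetical model for a slider operated by "decrease"/"increase" buttons.
function createSliderModel({ min, max, step, value }) {
  // Keep the value inside the [min, max] range.
  const clamp = (v) => Math.min(max, Math.max(min, v));
  let current = clamp(value);
  return {
    increase() {
      current = clamp(current + step);
      return current;
    },
    decrease() {
      current = clamp(current - step);
      return current;
    },
    // Values an author would mirror onto the element with role="slider".
    ariaAttributes() {
      return {
        "aria-valuemin": String(min),
        "aria-valuemax": String(max),
        "aria-valuenow": String(current),
      };
    },
  };
}

const volume = createSliderModel({ min: 0, max: 10, step: 5, value: 8 });
volume.increase(); // 8 + 5 exceeds the maximum, so the value clamps to 10
console.log(volume.ariaAttributes()["aria-valuenow"]);
```

Because the buttons are ordinary buttons, they can be activated by touch without any gesture-to-key translation by the assistive technology.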

For videos, I don't see the lack of support by assistive technology as a problem, because the non-native video players can emulate all these functions (e.g. via buttons). This is already implemented correctly in many video players.

cookiecrook commented 4 years ago

@JAWS-test wrote:

I don't think we digress: if ARIA had the role video, we wouldn't need a role native-video.

There will always be some element that does not yet have an equivalent ARIA role yet would benefit from automated comparative testing across browser implementations. The issue proposal is not specific to <video>. It was just used as an example.

If I understand you correctly, some ARIA roles should not exist (slider, treegrid etc.)

I am not suggesting we remove the existing roles, but by today's standards, we should not add to the list until we can prove implementability.

I also see [support of some mobile interfaces] as problematic. However, I rather think that here the manufacturers of assistive technology have a duty to implement the support. I.e. when I navigate to a slider with my screen reader on a mobile device, VoiceOver, TalkBack etc. must offer me a swiping gesture, which is internally converted into the key event handler for the arrow keys, because the arrow keys are used to operate the slider.

No spec defines this, so there is nothing to implement. The proposal you've listed here (AT simulating key events) has been opposed a number of times.

JAWS-test commented 4 years ago

The prefix-roles could possibly help to end the current wrong output in AT (see for example https://github.com/FreedomScientific/VFO-standards-support/issues/357, https://github.com/nvaccess/nvda/issues/10708). However, I am still in favor of finding roles for native and non-native elements.

cookiecrook commented 2 years ago

Based on a call comment by @jnurthen, I've clarified the title of this issue. IMO, the issue includes but is not limited to media elements like audio/video.

scottaohara commented 2 years ago

should we move this to 1.4? the embed/object one was slated for 1.4, and i didn't get the impression we're itching to work on this for 1.3?

pkra commented 2 years ago

should we move this to 1.4?

This issue was already moved to the 1.4 milestone.

scottaohara commented 2 years ago

Thanks @pkra i must have misread something to think it was still 1.3 like the other.

cookiecrook commented 1 year ago

Now that we're building out computedrole tests in WPT, this may block portions of HTML-AAM testing. Raising it again as an agenda topic related to w3c/aria#1879, w3c/aria#1887, and w3c/core-aam#166.

cookiecrook commented 1 year ago

From @jnurthen or @scottaohara in the ARIA meeting this morning: ~"@spectranaut is working on a PR for Core-AAM, so presumably HTML-AAM could define a similar column for computedrole, and define these in those extension specs"

For extension specs like graphics-aria, that could be graphics-image or for reserved elements in a host language AAM like HTML-AAM, that could be html-video until if/when there was the ability to define a core ARIA role for it.

cookiecrook commented 1 year ago

Also https://github.com/w3c/html-aam/issues/464

cookiecrook commented 1 year ago

Chromium is already returning non-standard computed roles, so a decision around interop would be useful.

<pre>: the pre element returns computedrole "Pre" in Chromium.

<header> outside a direct <body> context: the header element returns computedrole "HeaderAsNonLandmark" in Chromium.

spectranaut commented 1 year ago

Yeah I was expecting to see "HeaderAsNonLandmark" come up -- this is where it becomes clear that in testing the "computed role", what we are testing is an implementation detail.

Ultimately, what is important is whether chrome/safari/firefox expose the correct thing in the platform accessibility API, and the "computed role" is just an intermediary step. Browsers might actually want/need different values for "computed role", in some cases like this.

Header is expected to be exposed differently in UIA and ATK/ATSPI in the two contexts, but the same in both contexts for IA2 and AXAPI:

Since WebKit only exposes AXAPI, they only need the computed role "header" in both cases. But Chrome uses the computed role concepts "header" and "headerAsNonLandmark" to get to the correct mappings for all the APIs they support.

What exactly would be a proposed solution to this? IMO it's not testable, unless WebKit wants to add extra logic to calculate something they don't need, or Chrome is going to... majorly refactor their code or....?

cookiecrook commented 1 year ago

FWIW, I think it’s solvable in WebKit to not expose the header role outside body context. I just think that scenario was overlooked.

I also think it’s solvable that Chromium could hang on to the internal role ENUM they need for platform mappings while still exposing a standard role name to computedrole.

jcsteh commented 1 year ago

I'm currently implementing standardised role names for computedRole in Gecko. Like Chrome, Gecko has roles which don't exist in ARIA. My current implementation standardises these where possible, returning the empty string when there's no reasonable mapping. For example, <pre> and <header> when not a direct body child both return "generic" (since they're effectively generic text containers), but <iframe> returns "" (since ARIA doesn't have a concept which fits iframe).

<audio controls> and <video controls> are an interesting case. In Gecko, they currently return "group" because our controls are contained within a group. Even though that's a standard role, the controls are going to be implemented differently in different browsers.
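One way to picture the normalization described in the comment above is a lookup from engine-internal roles to standard role names, with the empty string as the fallback. This is only a sketch under assumptions: the internal role names and the `toComputedRole` function are hypothetical and do not reflect Gecko's or Chromium's actual code.

```javascript
// Hypothetical internal role names; real engines use their own enums.
const STANDARD_ROLE_FOR_INTERNAL = new Map([
  ["Button", "button"],
  ["Pre", "generic"], // <pre>: effectively a generic text container
  ["HeaderAsNonLandmark", "generic"], // <header> that is not a direct body child
  ["Iframe", ""], // no ARIA concept fits, so expose the empty string
]);

// Maps an internal role to the string reported as the computed role.
function toComputedRole(internalRole) {
  // Internal roles without an entry also fall back to the empty string.
  return STANDARD_ROLE_FOR_INTERNAL.get(internalRole) ?? "";
}

console.log(toComputedRole("HeaderAsNonLandmark"));
```

The engine can keep its internal role for platform API mappings and consult a table like this only when reporting the computed role.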

cookiecrook commented 1 year ago

Consensus from the ARIA F2F last week was to use "html-[tagname]" (html-video) in the case of HTML-AAM ambiguous or reserved host-language elements. There needs to be something in the ARIA spec defining that authors should not use host-language prefix+tagname roles (html-video, svg-polygon, etc) in content, unless a relevant AAM defines those as concrete roles. The same note should probably mention dpub-chapter is okay in content because 1) DPUB is not a host language, and 2) those are defined as concrete roles in DPUB-AAM.
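A checker could flag author-supplied role values that use the reserved host-language prefixes. The sketch below assumes the reserved prefixes are "html-" and "svg-" (taken from the examples above) and uses a hypothetical function name. It ignores the exception for roles that a relevant AAM defines as concrete.

```javascript
// Prefixes reserved for host-language computed roles, per the examples in this thread.
const RESERVED_HOST_PREFIXES = ["html-", "svg-"];

// Returns the tokens of an author-supplied role attribute value that use a reserved prefix.
function findReservedRoleTokens(roleAttributeValue) {
  return roleAttributeValue
    .trim()
    .split(/\s+/)
    .filter((token) => token !== "")
    .map((token) => token.toLowerCase())
    .filter((token) =>
      RESERVED_HOST_PREFIXES.some((prefix) => token.startsWith(prefix))
    );
}

console.log(findReservedRoleTokens("html-video group")); // flags "html-video"
console.log(findReservedRoleTokens("dpub-chapter")); // nothing flagged
```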

jcsteh commented 1 year ago

Do we plan to have a defined list of these html-*, etc. roles? It doesn't seem to me like just prefixing the tag name with "html-" when the role is unknown is particularly useful from an interop standpoint. That doesn't prove anything except the browser's ability to concatenate strings together. :) Am I missing anything?

What should we do for things like <input type="date">? html-input isn't quite specific enough there. html-input-date?

cookiecrook commented 1 year ago

@jcsteh It would not be that many. All those inputs are 'textbox' or similar. From @scottaohara's HTML-AAM PR, the list is: audio, canvas, cite, embed, iframe, kbd, label, legend, map, object, summary, var, video, and wbr

cookiecrook commented 1 year ago

@jcsteh wrote:

It doesn't seem to me like just prefixing the tag name with "html-" when the role is unknown is particularly useful from an interop standpoint. That doesn't prove anything except the browser's ability to concatenate strings together. :) Am I missing anything?

The intention is not about unknown/unimplemented roles. It's about interop for known browser internal roles that don't map to a particular concrete role, either because they can't or haven't yet. For example: the HTML <popup> proposal should return generic (not html-popup) until that element is implemented by the engine.

jcsteh commented 1 year ago

All those inputs are 'textbox' or similar.

This doesn't seem right to me for <input type="date">, which is exposed as a group containing three separate spin buttons and a button in both Firefox and Chrome. Similar for <input type="time">.

The intention is not about unknown/unimplemented roles. It's about interop for known browser internal roles that don't map to a particular concrete role

I thought as much. I think that (plus your comment above) answers my question: this is a predefined list of constants, rather than string concatenation.
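The difference between a predefined list and plain string concatenation can be sketched as follows. The element names come from the list quoted earlier in this thread; the function name and the `null` return for unlisted elements are assumptions for illustration.

```javascript
// Element names listed in this thread as lacking a concrete ARIA role mapping.
const RESERVED_HTML_ELEMENTS = new Set([
  "audio", "canvas", "cite", "embed", "iframe", "kbd", "label",
  "legend", "map", "object", "summary", "var", "video", "wbr",
]);

// Returns the reserved "html-*" computed role, or null when the element is
// not on the list and its role has to be determined some other way.
function reservedComputedRole(tagName) {
  const tag = tagName.toLowerCase();
  return RESERVED_HTML_ELEMENTS.has(tag) ? `html-${tag}` : null;
}

console.log(reservedComputedRole("video"));
console.log(reservedComputedRole("popup"));
```

An unimplemented or unknown element never gets an "html-" role from this lookup, which is what separates it from concatenating the prefix onto any tag name.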

cookiecrook commented 1 year ago

@jcsteh wrote

<input type="date">, which is exposed as a group containing three separate spin buttons and a button

Yeah, looking back, the non-textfield input types are still "not mapped" in HTML-AAM, and not yet addressed by the new PR. The computedrole format you proposed seems reasonable to me: html-input-date… @scottaohara, @aleventhal, @benbeaudry what do you think?

scottaohara commented 1 year ago

Seems reasonable as a stop gap. I’d personally like to think about defining these a bit more, but that doesn’t need to be done now.

cookiecrook commented 1 year ago

Taking assignment to follow up with some language about this one now that @spectranaut's new computedrole section landed with https://github.com/w3c/core-aam/pull/167

cookiecrook commented 1 year ago

Actually, I'm already tracking that in #1887 (and related), so I think this one can be closed now. Please reopen if there is more to do that I missed.