Closed cookiecrook closed 1 year ago
At a later date, when it'd be possible to support a fully functioning video
role, the user agents could expose it instead.
@joanmarie have you considered role parity for the HTML-embedded host languages like MathML and SVG? In theory, there could be role parity on those as well, but I'm not sure why anyone would want to build a semantic copy of MathML or SVG and then have to render it themselves.
In theory, ARIA could have role parity with MathML and then WebKit+VoiceOver would still be able to expose Nemeth Braille to the user in some MathML-like ARIA structure, though this seems like a lot of work for very little gain. It'd be easier to just map those to reserved roles like `native-mfrac` or `host-mfrac` that should not be used explicitly by authors.
I think I'm missing the whole idea of 'native-' roles. How will a 'native' prefix help to make video/audio controls accessible later, and why can't ARIA reserve a 'video'/'audio' role instead?
@cookiecrook re MathML parity, see w3c/aria#660. I'll leave it to @AmeliaBR to comment.
Having said that, the ARIA Working Group committed to (read: "promised other groups within the W3C") that we would achieve role parity with HTML for ARIA 1.2 and do so in a timely fashion. And as you may be aware, some folks within the W3C are paying much closer attention to achieving stated milestones. Which brings me to the following:
Working on MathML and SVG and
@asurkov among other uses, this could be used to return a standardized computed role for the WebDriver proposal element.computedRole. Even in cases where role parity was not achieved, browser implementations could agree on the returned role value, for the sake of testing the web forward.
Copying @alice in because she asked about this one. She also may have suggested an interesting name-spaced syntax I had not previously considered.
videoElement.computedRole; // "html:video"
mfracElement.computedRole; // "math:mfrac"
For SVG, the ARIA graphics roles cover basic role parity with native features. The only roles we are currently special-casing in the SVG-AAM are text semantics, equivalent to `<p>` and named/interactive spans.
In the more general case:
I'm inclined to agree with @asurkov: why not just make these regular roles? A video is a common compound widget that is used all over the web. It is an implementation detail that the end user shouldn't need to deal with whether it is created by a single HTML `<video controls>` element (and the resulting shadow tree), or by a `<video>` grouped with author-supplied controls, or by a `<canvas>` or `<svg>` animation.
The ARIA Working Group just discussed audio/video role parity.
I still think we don't want to do this. Unfortunately, we don't seem to have consensus on this yet. So pushing back to 1.3 for now.
Of note, WebDriver is progressing with adding computedRole in Issue 1439 and Pull 1444, so if we're able to reach consensus on this, we should be able to start testing implementations against each other soon.
Clarifying: In either case, we should be able to test implementations for any defined ARIA role and any native element matching one of the parity roles defined in ARIA 1.2. This issue only prevents us from comparative testing of roles where WG consensus has yet to be reached, particularly in native elements like `<video>`.
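As a rough illustration of the comparative testing described here, the sketch below takes the computed role each engine reported for a set of elements and lists the ones where engines disagree. The `findRoleMismatches` helper, the input shape, the engine names, and the sample values are all hypothetical; none of them come from WebDriver or WPT.

```javascript
// Hypothetical helper, not part of WebDriver or WPT.
// `results` maps a test label to the computed role each engine reported.
function findRoleMismatches(results) {
  const mismatches = [];
  for (const [label, rolesByEngine] of Object.entries(results)) {
    // More than one distinct value means the engines disagree.
    const distinct = new Set(Object.values(rolesByEngine));
    if (distinct.size > 1) {
      mismatches.push({ label, rolesByEngine });
    }
  }
  return mismatches;
}

// Illustrative sample data only; these are not measured results.
const sampleResults = {
  "<button>": { engineA: "button", engineB: "button" },
  "<video>": { engineA: "group", engineB: "" },
};

// Only the <video> entry is reported, because the engines disagree on it.
console.log(findRoleMismatches(sampleResults).map((m) => m.label));
```

An element only becomes a useful interop test once the engines have an agreed value to converge on, which is the gap this issue is about.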
I think the role prefix (native or host) is not a good idea. At least for video and audio there should be the corresponding ARIA roles video and audio (without prefix).
IMO, it would be inappropriate to add `audio` and `video` roles to ARIA until such a time when the respective native media interfaces can also be mimicked with ARIA or another API. For example, the native play/pause or autoplay behaviors cannot be made accessible with ARIA. Likewise, auto-selection of caption tracks or audio description tracks does not work with ARIA-driven methods. As such, it may mislead web developers into thinking that ARIA video and audio could be made as accessible as native audio and video. We've had similar problems with ARIA sliders and scrollbars for a decade; I don't believe we should pile on until the first problem is solved.
The `native-*` prefix proposal is an attempt at a future-compatible pattern for testability and interoperability between implementations.
Hi @cookiecrook
I don't really understand your concerns.
`role=video`, the operation of the video is no better. But with the correct ARIA role, at least the purpose of the element is correctly perceptible, and that is valuable enough.
@JAWS-test wrote:
Why should the operation of the HTML video element not be able to be reproduced by JS? On the PC I control the HTML video element e.g. with the space bar and the arrow keys.
As an example, the `<video>` element provides support for:
None of this can be "faked" by a web site using JavaScript, because there is no web API defined for such things.
@JAWS-test wrote:
In my opinion, ARIA is there to transmit role and status of elements for assistive technology. A video is a video, whether or not it is a HTML video element.
IMO, conveying a role to a user is both a technical and social contract. We are effectively telling the user, ~"this is the type of control, and therefore you know how to operate it." With most simple controls (like buttons or checkboxes), web author implementation is trivial to fulfill that social contract and meet the user's expectation.
However, no web author is able to fulfill this obligation with even slightly more complex controls, like slider. To this day, there is no way to make an ARIA slider accessible using touch screen controls. Conveying the role leaves the user confused as to why it's not working, and leaves the author confused as to how to attempt making it work. Once you get to very complex controls like tree grid, expectations are even more difficult to fulfill.
My opinion is that for very complex controls like video, we should not compound this problem by telling authors it's "accessible" to use the video role on a custom element that cannot support expectations of video accessibility. Those roles should be reserved until the time when implementations and web standards provide the ability for web authors to make them accessible.
But I think we digress. This issue is not about whether a specific role should exist. It's a pattern proposal that would allow browser implementations to test against each other. Implementation testability leads to implementation consistency, and a path for role parity testability is what this proposal is really about.
I don't think we digress: if ARIA had the role `video`, we wouldn't need a role `native-video`.
If I understand you correctly, some ARIA roles should not exist (slider, treegrid etc.) because they are not operable on touch devices. I also see this as problematic. However, I rather think that here the manufacturers of assistive technology have a duty to implement the support. I.e. when I navigate to a slider with my screen reader on a mobile device, VoiceOver, TalkBack etc. must offer me a swiping gesture, which is internally converted into the key event handler for the arrow keys, because the arrow keys are used to operate the slider. As long as this has not been done, the ARIA APG could offer patterns with support for touch devices (e.g. with the slider, by displaying two buttons to increase and decrease the value or an input field to enter the value).
For videos, I don't see the lack of support by assistive technology as a problem, because the non-native video players can emulate all these functions (e.g. via buttons). This is already implemented correctly in many video players.
@JAWS-test wrote:
I don't think we digress: if ARIA had the role `video`, we wouldn't need a role `native-video`.
There will always be some element that does not yet have an equivalent ARIA role yet would benefit from automated comparative testing across browser implementations. The issue proposal is not specific to `<video>`. It was just used as an example.
If I understand you correctly, some ARIA roles should not exist (slider, treegrid etc.)
I am not suggesting we remove the existing roles, but by today's standards, we should not add to the list until we can prove implementability.
I also see [support of some mobile interfaces] as problematic. However, I rather think that here the manufacturers of assistive technology have a duty to implement the support. I.e. when I navigate to a slider with my screen reader on a mobile device, VoiceOver, TalkBack etc. must offer me a swiping gesture, which is internally converted into the key event handler for the arrow keys, because the arrow keys are used to operate the slider.
No spec defines this, so there is nothing to implement. The proposal you've listed here (AT simulating key events) has been opposed a number of times.
The prefix-roles could possibly help to end the current wrong output in AT (see for example https://github.com/FreedomScientific/VFO-standards-support/issues/357, https://github.com/nvaccess/nvda/issues/10708). However, I am still in favor of finding roles for native and non-native elements.
Based on a call comment by @jnurthen, I've clarified the title of this issue. IMO, the issue includes but is not limited to media elements like `audio`/`video`.
should we move this to 1.4? the embed/object one was slated for 1.4, and i didn't get the impression we're itching to work on this for 1.3?
should we move this to 1.4?
This issue was already moved to the 1.4 milestone.
Thanks @pkra i must have misread something to think it was still 1.3 like the other.
Now that we're building out `computedrole` tests in WPT, this may block portions of HTML-AAM testing. Raising it again as an agenda topic related to w3c/aria#1879, w3c/aria#1887, and w3c/core-aam#166.
From @jnurthen or @scottaohara in the ARIA meeting this morning: ~"@spectranaut is working on a PR for Core-AAM, so presumably HTML-AAM could define a similar column for computedrole, and define these in those extension specs"
For extension specs like graphics-aria, that could be `graphics-image` or for reserved elements in a host language AAM like HTML-AAM, that could be `html-video` until if/when there was the ability to define a core ARIA role for it.
Chromium is already returning non-standard computed roles, so a decision around interop would be useful.
`<pre>`
`<header>` outside a direct `<body>` context
Yeah I was expecting to see "HeaderAsNonLandmark" come up -- this is where it becomes clear that in testing the "computed role", what we are testing is an implementation detail.
Ultimately, what is important is whether chrome/safari/firefox expose the correct thing in the platform accessibility API, and the "computed role" is just an intermediary step. Browsers might actually want/need different values for "computed role" in some cases like this.
Header is expected to be exposed differently in UIA and ATK/ATSPI in the two contexts, but the same in both contexts for IA2 and AXAPI:
Since WebKit only exposes AXAPI, they only need the computed role "header" in both cases. But Chrome uses the computed role concepts "header" and "headerAsNonLandmark" to get to the correct mappings for all the APIs they support.
What exactly would be a proposed solution to this? IMO it's not testable, unless WebKit wants to add extra logic to calculate something they don't need, or Chrome is going to... majorly refactor their code or....?
FWIW, I think it’s solvable in WebKit to not expose the header role outside body context. I just think that scenario was overlooked.
I also think it’s solvable that Chromium could hang on to the internal role ENUM they need for platform mappings while still exposing a standard role name to `computedrole`.
I'm currently implementing standardised role names for computedRole in Gecko. Like Chrome, Gecko has roles which don't exist in ARIA. My current implementation standardises these where possible, returning the empty string when there's no reasonable mapping. For example, `<pre>` and `<header>` when not a direct body child both return "generic" (since they're effectively generic text containers), but `<iframe>` returns "" (since ARIA doesn't have a concept which fits iframe).
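The approach described here could be sketched roughly as a lookup from an engine's internal role to a standard ARIA role name, with the empty string as the fallback. The internal role names and the `standardComputedRole` function below are hypothetical placeholders, not Gecko's or Chromium's actual identifiers.

```javascript
// Hypothetical internal role names; real engines use their own enums.
const STANDARD_ROLE_FOR_INTERNAL = new Map([
  ["button", "button"],
  ["preformattedText", "generic"],    // e.g. <pre>
  ["headerAsNonLandmark", "generic"], // e.g. <header> that is not a direct body child
]);

// Returns a standard role name, or "" when no reasonable mapping exists.
function standardComputedRole(internalRole) {
  return STANDARD_ROLE_FOR_INTERNAL.get(internalRole) ?? "";
}

console.log(standardComputedRole("preformattedText")); // "generic"
console.log(standardComputedRole("internalFrame"));    // "" (no ARIA concept fits)
```

Keeping the table separate from the platform mappings means the engine can retain whatever internal distinctions it needs while still reporting a standard name.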
`<audio controls>` and `<video controls>` are an interesting case. In Gecko, they currently return "group" because our controls are contained within a group. Even though that's a standard role, the controls are going to be implemented differently in different browsers.
Consensus from the ARIA F2F last week was to use "html-[tagname]" (`html-video`) in the case of HTML-AAM ambiguous or reserved host-language elements. There needs to be something in the ARIA spec defining that authors should not use host-language prefix+tagname roles (html-video, svg-polygon, etc) in content, unless a relevant AAM defines those as concrete roles. The same note should probably mention dpub-chapter is okay in content because 1) DPUB is not a host language, and 2) those are defined as concrete roles in DPUB-AAM.
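A checker for the authoring rule described here might look something like the sketch below: it flags `role` tokens that start with a host-language prefix unless they appear in a set of roles that some AAM defines as concrete. The `findReservedRoleTokens` name, the prefix list, and the `concreteRoles` parameter are assumptions made for illustration; no spec defines such a function.

```javascript
// Assumed host-language prefixes, taken from the examples in this thread.
const HOST_LANGUAGE_PREFIXES = ["html-", "svg-"];

// Returns the tokens in a role attribute value that authors should not use.
// `concreteRoles` is a hypothetical set of prefixed roles that a relevant
// AAM has defined as concrete, and which are therefore allowed.
function findReservedRoleTokens(roleAttr, concreteRoles = new Set()) {
  return roleAttr
    .trim()
    .split(/\s+/)
    .filter((token) => token !== "")
    .filter(
      (token) =>
        HOST_LANGUAGE_PREFIXES.some((prefix) => token.startsWith(prefix)) &&
        !concreteRoles.has(token)
    );
}

console.log(findReservedRoleTokens("html-video group")); // [ 'html-video' ]
console.log(findReservedRoleTokens("dpub-chapter"));     // []
```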
Do we plan to have a defined list of these html-*, etc. roles? It doesn't seem to me like just prefixing the tag name with "html-" when the role is unknown is particularly useful from an interop standpoint. That doesn't prove anything except the browser's ability to concatenate strings together. :) Am I missing anything?
What should we do for things like `<input type="date">`? html-input isn't quite specific enough there. html-input-date?
@jcsteh It would not be that many. All those inputs are 'textbox' or similar. From @scottaohara's HTML-AAM PR, the list is: audio, canvas, cite, embed, iframe, kbd, label, legend, map, object, summary, var, video, and wbr
@jcsteh wrote:
It doesn't seem to me like just prefixing the tag name with "html-" when the role is unknown is particularly useful from an interop standpoint. That doesn't prove anything except the browser's ability to concatenate strings together. :) Am I missing anything?
The intention is not about unknown/unimplemented roles. It's about interop for known browser internal roles that don't map to a particular concrete role, either because they can't or haven't yet. For example: the HTML `<popup>` proposal should return `generic` (not `html-popup`) until that element is implemented by the engine.
All those inputs are 'textbox' or similar.
This doesn't seem right to me for `<input type="date">`, which is exposed as a group containing three separate spin buttons and a button in both Firefox and Chrome. Similar for `<input type="time">`.
The intention is not about unknown/unimplemented roles. It's about interop for known browser internal roles that don't map to a particular concrete role
I thought as much. I think that (plus your comment above) answers my question: this is a predefined list of constants, rather than string concatenation.
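To make the "predefined list of constants" reading concrete, here is a minimal sketch. The tag list is the one quoted above from the HTML-AAM PR; the `reservedHtmlRole` function name and the choice to return `null` for tags not on the list are assumptions made for illustration.

```javascript
// Tag names quoted earlier in this thread from the HTML-AAM PR.
const RESERVED_HTML_TAGS = new Set([
  "audio", "canvas", "cite", "embed", "iframe", "kbd", "label",
  "legend", "map", "object", "summary", "var", "video", "wbr",
]);

// Hypothetical helper: returns the reserved "html-*" role for a listed tag,
// or null so the caller can fall back to its normal role mapping.
function reservedHtmlRole(tagName) {
  const tag = tagName.toLowerCase();
  return RESERVED_HTML_TAGS.has(tag) ? `html-${tag}` : null;
}

console.log(reservedHtmlRole("VIDEO")); // "html-video"
console.log(reservedHtmlRole("popup")); // null (not on the list, so no html-popup)
```

Because membership in the list gates the result, an unimplemented or unknown element never produces an `html-*` value by concatenation alone.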
@jcsteh wrote
`<input type="date">`, which is exposed as a group containing three separate spin buttons and a button
Yeah, looking back, the non-textfield input types are still "not mapped" in HTML-AAM, and not yet addressed by the new PR. The `computedrole` format you proposed seems reasonable to me: `html-input-date`… @scottaohara, @aleventhal, @benbeaudry what do you think?
Seems reasonable as a stop gap. I’d personally like to think about defining these a bit more, but that doesn’t need to be done now.
Taking assignment to follow up with some language about this one now that @spectranaut's new `computedrole` section landed with https://github.com/w3c/core-aam/pull/167
Actually, I'm already tracking that in #1887 (and related), so I think this one can be closed now. Please reopen if there is more to do that I missed.
Consider a "native-" or "host-" role prefix for elements like audio/video that would require complex API support out of scope for ARIA. Authors SHOULD NOT use these in content, but test tools like the WebKit inspector could return these values where no specific ARIA role matched.
`<button>` -> `button` (concrete)
`<input type="range">` -> `slider` (concrete)
`<video>` -> `native-video`, `host-video`, [Update: Another option from comment below.] [Update March 2023: or `html-video`] (like abstract roles, authors should not use in `role` attr)