w3c / aria

Accessible Rich Internet Applications (WAI-ARIA)
https://w3c.github.io/aria/

Does the math role, as written, make sense? #940

Open joanmarie opened 5 years ago

joanmarie commented 5 years ago

The current definition of the math role states the following:

Content with the role math is intended to be marked up in an accessible format such as MathML [MathML3], or with another type of textual representation such as TeX or LaTeX, which can be converted to an accessible format by native browser implementations or a polyfill library.

Why is an ARIA role required for polyfills to work? Unless I'm missing something, it's not. Nor is it, IMHO, our job to provide a means for others to hack around lack of MathML support in user agents.

The rest of our roles (with the exception of presentation/none) do a couple of things:

I think the math role should do the above two things and only the above two things.

Over time, we may want to consider adding other math-related roles (e.g. to achieve parity with MathML or the "refresh" in progress of MathML). HOWEVER, that work, if agreed upon, would fall into the 1.4 cycle. Plus, APA is currently working on knowledge domain accessibility. They may come up with some solution which we'll want to consider.

In the meantime, here is my proposal for the 1.2 cycle:

  1. We remove from the spec what is currently Example 6. It's a large MathML example without any reference to ARIA other than the comment 'The math element has an implicit role="math".'
  2. We replace it with some simpler example which actually uses ARIA and demonstrates how some other host language than MathML (it could be HTML, SVG, whatever) combined with the math role would make sense for authors to consider. This example should have child elements.
  3. Assuming we accomplish 2, we can keep what is currently example 7, namely an img element with the math role and some alt text with a textual representation of the equation.
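
A rough sketch of what items 2 and 3 could look like, using plain HTML as the host language. The equation and the image file name are illustrative assumptions, not text taken from the spec:

```html
<!-- Item 2: math role on HTML content that has child elements -->
<span role="math">
  a<sup>2</sup> + b<sup>2</sup> = c<sup>2</sup>
</span>

<!-- Item 3: math role on an image, with a textual form of the equation in alt -->
<img role="math" src="pythagorean.png" alt="a² + b² = c²">
```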

Thoughts?

joanmarie commented 5 years ago

Something I discovered as commented-out text in the ARIA spec is:

Editor's note: Might need an RFC-2119 "should" requirement here to encourage AT to speak math approximations with high punctuation verbosity. Otherwise ambiguous characters like a forward slash (/) may not be spoken even when intended to be used interchangeably with the division sign character (÷)

That strikes me as text which we might want in our to-be-revised definition of the math role.

AmeliaBR commented 5 years ago

The only way the ARIA 1 version of math (with children presentational) makes sense, to me, is as a container for an embedded object or foreign content. The content needs to be processed and presented separately from any ARIA rules. ARIA doesn't specify how that processing occurs or how the markup is translated into an accessible format. From the perspective of ARIA and the accessibility tree, it behaves something like an embedded Flash player or PDF file or something like that.

But if we're moving towards exposing MathML as a structured document, then this approach falls apart. In that case, we need a way to represent the full content in ARIA, and it may make more sense to think of role="math" as being equivalent to an HTML <code> element, establishing a context for interpreting the content (and therefore changing how punctuation might be read).

But it sounds like we're a ways away from being able to fully specify an accessible representation of MathML, so maybe it's necessary to still keep the "embedded object" interpretation for now.

For the example of an img with role="math" and a textual version of the equation as an alt, this interpretation requires a fallback accessibility behavior: If the browser doesn't have any way of parsing the content in a more advanced way, the plain text name or text content of the element is treated as the "embedded math object". That same textual fallback would also apply if the author put a math role on a span of plain text or CSS-styled HTML or SVG markup: the math role just becomes essentially a group with a special role description for whatever content would be exposed anyway.

With that approach in mind, I'd add the following clarifications:

This should handle three of the most common cases as best as possible:

Polyfills for converting MathML or LaTeX would still be expected to generate one of these representations. If a required polyfill doesn't run, the AT experience wouldn't be any worse than the visual experience, reading out the plain text representation from the markup.

joanmarie commented 5 years ago

it may make more sense to think of role="math" as being equivalent to an HTML <code> element, establishing a context for interpreting the content (and therefore changing how punctuation might be read).

This is exactly what I'm thinking.

But it sounds like we're a ways away from being able to fully specify an accessible representation of MathML, so maybe it's necessary to still keep the "embedded object" interpretation for now.

I personally don't think the status of MathML has anything to do with ARIA. Regardless, are you aware of authors or user agents that are relying upon the ARIA math role as you describe (the "embedded object" interpretation)? Or to put it a different way: Let's say during the 1.2 cycle we make the changes I propose and make math akin to the code element in functionality. What specifically would break?

AmeliaBR commented 5 years ago

Let's say during the 1.2 cycle we make the changes I propose and make math akin to the code element in functionality. What specifically would break?

Well, MathML would break unless you specifically allow user agents to treat MathML specially.

But for any element with role="math" that isn't MathML (or theoretically, another type of math markup that the user agent has special rules for), I think we agree: treat it as a group with a math-related role description, plus an interpretation hint which should at least require ATs to preserve punctuation (but could also trigger other math-related heuristics in pronunciation or Braille display, e.g. assuming that a hyphen-minus character is a minus instead of a hyphen).

(Sorry if my original comment was unclear. I rewrote it many times, changing my conclusions as I wrote them out.)

joanmarie commented 5 years ago

Well, MathML would break unless you specifically allow user agents to treat MathML specially.

How would MathML break? MathML doesn't depend on there being an ARIA math role. In fact, if the ARIA math role didn't exist, MathML would still render (or not). And it would be as accessible as it currently is.

AmeliaBR commented 5 years ago

In fact, if the ARIA math role didn't exist, MathML would still render (or not).

So, you're suggesting that (for now, anyway), <math> would not map to the math role, but would have its own host-language mapping that is independent of the ARIA role? That way, the ARIA role wouldn't have to handle the split between the two types of math content — it would be handled by the element role mappings?

That could probably work, so long as there was a note that browsers that do support MathML should not support any ARIA role on <math>, so that someone doesn't accidentally break the MathML accessibility if, as part of their fallback/polyfill strategy, they are using an explicit <math role="math">.

joanmarie commented 5 years ago

In fact, if the ARIA math role didn't exist, MathML would still render (or not).

So, you're suggesting that (for now, anyway), <math> would not map to the math role, but would have its own host-language mapping that is independent of the ARIA role?

I am most definitely not suggesting that.

Look at the mappings right now in the Core-AAM for math in the current Rec. Here's what both a math element and the math role currently cause to happen:

In other words, the mappings for both the HTML element and the ARIA role boil down to "It's something mathy, ATs. Now you figure out what to do with it."

So ATs already have to look at the contents and decide if it's native MathML, or an image with some alt text (as per the existing ARIA spec example), or some divs and spans CSSed into looking like math, or an SVG, or....

So if we leave HTML-AAM as it is, and Core-AAM as it is, and make the ARIA math role work more like the code element, I'm afraid I have to ask again: What breaks?

That could probably work, so long as there was a note that browsers that do support MathML should not support any ARIA role on <math>, so that someone doesn't accidentally break the MathML accessibility if, as part of their fallback/polyfill strategy, they are using an explicit <math role="math">.

I think that duplicating the role like that is already frowned upon in some spec somewhere. I forget where now. And an author probably could come along and do something like <math role="img"> and totally break the accessibility of the native MathML. But what authors can and cannot do with respect to applying ARIA to native host languages, I believe, typically falls under the jurisdiction of those native host languages.
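
To make the cases concrete, here is a sketch of the three variants under discussion. The equation is an arbitrary example, and the comments restate the expectations voiced in this thread rather than tested browser behavior:

```html
<!-- Native MathML: the math element has an implicit role="math" -->
<math>
  <mi>y</mi><mo>=</mo><msup><mi>x</mi><mn>2</mn></msup>
</math>

<!-- Redundant explicit role, e.g. as part of a fallback/polyfill strategy -->
<math role="math">
  <mi>y</mi><mo>=</mo><msup><mi>x</mi><mn>2</mn></msup>
</math>

<!-- Overriding role: the case described above as breaking
     the accessibility of the native MathML -->
<math role="img">
  <mi>y</mi><mo>=</mo><msup><mi>x</mi><mn>2</mn></msup>
</math>
```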

AmeliaBR commented 5 years ago

@joanmarie Ok, let's see if I can update my understanding again:

The proposal is to treat the element with a math role as a container and expose whatever is inside it according to the normal rules for the markup language.

For MathML, a <math> element is a container, "whatever is inside it" is the markup that defines the actual equation. So the math role still applies to the outer container. ARIA wouldn't define or change how that inner MathML content is exposed.

AmeliaBR commented 5 years ago

Going over some of @cookiecrook's concerns from #425, I see one remaining issue:

Should an aria-label or aria-labelledby on an element with role="math" replace the child content (like it would for role="button" or other children-presentation elements)?

For backwards compatibility with the ARIA 1 definition it should. But that's not consistent with other "container" roles like group or figure. And browsers are inconsistent today — whichever way you spec it, there are two implementations.

I made a demo using an SVG equation where the text content doesn't make sense without the drawing, but an aria-label provides a plain-text equivalent. James gave a similar example in https://github.com/w3c/aria/issues/425#issuecomment-280259927, where the displayed equation used a pre-formatted ASCII rendering and the aria-label gives a linearized version. With the ARIA 1 definition, the child markup is presentational, so it should be ignored anyway: only the accessible name is used. Switching to a "container" model, like figure or group, would mean that the accessible name is used as an extra label in addition to the child content. So that would be a breaking change, exposing the presentational markup as text content.

But, it's already broken today in Chrome and Firefox. They both expose the text nodes in the SVG in addition to the label on the container. In contrast, EdgeHTML and (I presume) WebKit follow the ARIA 1 spec and treat the child content as presentational.
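
For readers without the demo at hand, a minimal sketch of that kind of SVG case follows. The equation, coordinates, and label text are assumptions for illustration and are not copied from the demo:

```html
<!-- The text nodes alone read as "x = a b"; the fraction bar is only drawn.
     ARIA 1 model (children presentational): only the aria-label is exposed.
     Container model: the label is exposed in addition to the text nodes. -->
<svg role="math" aria-label="x equals a over b"
     viewBox="0 0 60 40" width="120" height="80">
  <text x="0" y="25">x =</text>
  <text x="36" y="15">a</text>
  <line x1="30" y1="20" x2="50" y2="20" stroke="currentColor"/>
  <text x="36" y="36">b</text>
</svg>
```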

The demo also includes an img as a heading, because I wanted to confirm that all browsers correctly handle a container role applied to an img with alt attribute. They do, exposing the image alternative text as the heading text. So, treating a math as a container role shouldn't change the accessibility of equations represented as image elements with plain text alternatives in the alt.

The final test in that demo is a div with role="math" and the equation (y = x squared) represented as HTML content. I can't test WebKit, but EdgeHTML doesn't expose any of the content, so it just becomes an unnamed, empty math equation. Which is broken and against author expectations; changing to a container model would at least expose it as "y=x2" (and would match Chromium and Firefox). Ideally, of course, the superscript nature of the "2" would also be exposed, and an assistive tool could use the parent math role as a hint on how to interpret and expose it.
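
The last two tests might be sketched roughly as below. The file name, heading level, and alt text are assumed for illustration; only the equation (y = x squared) comes from the description above:

```html
<!-- A heading role applied to an img: the alt text is exposed as the heading text -->
<img role="heading" aria-level="2" src="heading.png" alt="Quadratic functions">

<!-- The equation as HTML content inside a math role.
     Children presentational, no label: nothing is exposed.
     Container model: the text content "y=x2" is exposed. -->
<div role="math">y=x<sup>2</sup></div>
```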

So, in conclusion: I think switching to a simple container model for math would be a net improvement, although it's a breaking change for IE/Edge and WebKit. The worst case is that it exposes garbage presentational markup in addition to the text representation intended by the author. The best case is that it actually exposes the content the author intended.

joanmarie commented 5 years ago

@joanmarie Ok, let's see if I can update my understanding again:

The proposal is to treat the element with a math role as a container and expose whatever is inside it according to the normal rules for the markup language.

Yup! Apologies for not being clearer.

For MathML, a <math> element is a container, "whatever is inside it" is the markup that defines the actual equation. So the math role still applies to the outer container. ARIA wouldn't define or change how that inner MathML content is exposed.

Exactly.

joanmarie commented 5 years ago

Should an aria-label or aria-labelledby on an element with role="math" replace the child content (like it would for role="button" or other children-presentation elements)?

In my opinion no. And if the group agrees to make children-presentational=false for the math role, I believe it would obsolete this question (right?).

joanmarie commented 5 years ago

For now I'm going to assume that we have general consensus regarding moving the math role to a function serving as a container of mathematical content. I think one thing we'll need to define is what all constitutes valid content/use cases.

My thoughts: The purpose of the math role will be to display "mathematical expressions". By doing so, ATs could benefit from this role by:

Is there any reason why we shouldn't limit our revised math role to mathematical expressions?

AmeliaBR commented 5 years ago

Is there any reason why we shouldn't limit our revised math role to mathematical expressions?

I'm assuming that as part of HTML role parity, ARIA 1.2 will also have a dedicated "code" role? (as tracked in #874) If so, I think the "math" role can & should continue to be defined specifically for mathematical expressions.

Not treating the text as if it were literary

This advice might need to be worded carefully, to support existing content that uses aria-label or alt with textual translations of an equation, like "y equals x squared" (while also correctly reading something like y=ab+c). But at a certain point this comes down to hints and heuristics, not testable requirements.

AmeliaBR commented 5 years ago

(Cross-posting from https://github.com/w3c/aria/issues/425#issuecomment-482177610)

It may be worth having an example of using aria-hidden on presentational markup within an equation, with a warning about the change in definition from ARIA 1.

E.g., working from the pre-formatted example:

<div id="eq-1" role="math" aria-labelledby="eq-1-linear">
<p id="eq-1-linear" hidden>x=⟮−b±√⟮b²−4ac⟯⟯÷2a</p>
<pre aria-hidden="true">
      −b±√⟮b²−4ac⟯
x =  -------------
          2a
</pre>
</div>

joanmarie commented 5 years ago

Something else that occurs to me: Do we want to limit the mathematical expressions to static content? I'm thinking we do.

MathJax's built-in explorer uses the application role, which I think is a good thing: It tells screen readers that they should NOT control the user interaction. This hint would be lost if the math role were applied instead. In addition, when using the explorer mode, MathJax is providing the text to speak, thus we wouldn't want screen readers changing their punctuation verbosity settings.

So the revised math role would be for mathematical expressions in which the assistive technology is expected to provide the user interaction (i.e. keyboard navigation) and presentation (speech and/or braille, in the case of a screen reader). Authors SHOULD NOT use the math role when they are managing focus.
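
A sketch of the distinction, using a made-up widget rather than MathJax's actual output. The label, tabindex, and equation are illustrative assumptions:

```html
<!-- Static expression: the assistive technology provides the
     navigation and the speech/braille presentation -->
<span role="math">E = mc<sup>2</sup></span>

<!-- Interactive explorer (hypothetical): author script, not shown, manages
     focus and supplies the text to speak, so the math role is not used here -->
<div role="application" aria-label="Equation explorer" tabindex="0">
  E = mc<sup>2</sup>
</div>
```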

joanmarie commented 5 years ago

Something that was raised during the face-to-face is the creation of a property like aria-latex which would contain the LaTeX representation of the expression. This would be supported right now on the math role.