w3c / accname

Accessible Name and Description Computation
https://w3c.github.io/accname/

Interior whitespace questions in this AccName text node test #208

Closed cookiecrook closed 9 months ago

cookiecrook commented 9 months ago

Hi AccName editors and ARIA WG, I need feedback on the interior whitespace normalization questions in this WPT AccName text node test:

https://github.com/web-platform-tests/wpt/pull/42407

cookiecrook commented 9 months ago

For example:

The main questions are:

cookiecrook commented 9 months ago

@MelSumner @accdc @jnurthen @spectranaut @jcsteh @aleventhal

jcsteh commented 9 months ago

* `<button>button&nbsp;label</button>` returns a `computedlabel` value that includes a Unicode non-breaking space, not a regular space.

To make matters more interesting, both Gecko and Chromium compress runs of regular spaces to a single space, but they don't compress non-breaking spaces.

* Likewise `<button>button  label</button>` contains multiple interior space chars, roughly `"button   label"`

It's worth noting that in this case, the label comes from the text node, and the text node will have its space compressed during layout rendering (so it's visually compressed). If the browser's a11y engine uses the "rendered" text, it will get "button label", even if the a11y engine doesn't compress spaces itself. In contrast, this won't get compressed during rendering:

<pre><button>button  label</button></pre>

The net result is the same in Gecko and Chromium because they both seem to compress acc names before returning them.

* Is this interior whitespace (sometimes different between engines) something we care about in AccName implementations? IOW, can we think of scenarios where those interop diffs might break some real user experience?

I can't think of any. Spacing does matter a lot for text interfaces, since they might be used for editing or consumption of formatted text, but I don't think it matters for names/descriptions, which are already limited by being purely plain text.

* Should I normalize interior whitespace (greedy whitespace regex replaced with a single space) to get these WPT tests to pass? Of note, the WPT accessibility label helper methods already do this with leading/trailing whitespace.

This seems reasonable. In an ideal world, we'd be perfectly interoperable. Pragmatically, noodling over this doesn't feel like a great use of time - far bigger fish to fry and all - unless there's real user harm that I'm missing.
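
As a rough illustration of the "greedy whitespace regex replaced with a single space" idea, here is a minimal Python sketch. It is not the WPT helper code; `normalize_name` and the two sample strings are hypothetical names made up for this example. Note that Python's `\s` is Unicode-aware for `str` patterns, so this version also folds a non-breaking space into a regular space.

```python
import re


def normalize_name(name: str) -> str:
    """Collapse every run of whitespace to one space, then trim both ends.

    Hypothetical helper for illustration only. Because \\s is Unicode-aware
    for str patterns, U+00A0 (non-breaking space) is treated as whitespace.
    """
    return re.sub(r"\s+", " ", name).strip()


# Two hypothetical engine outputs that differ only in interior spacing.
engine_a = "button label"
engine_b = "button   label"

# After normalization the comparison no longer fails on spacing alone.
assert normalize_name(engine_a) == normalize_name(engine_b)
```

With a helper like this, a test would compare normalized strings rather than raw ones, so a spacing difference between engines would not show up as a FAIL.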

jcsteh commented 9 months ago

I guess uncompressed white space could annoy braille display users, especially those with smaller displays. In that case, maybe engines should compress all white space, including nbsp. That would probably require converting nbsp to regular space, though.

Could uncompressed space cause problems for speech recognition apps searching for a control? They already have to do partial string searches, so it's reasonable to think they might already handle/normalise white space weirdness themselves.

aleventhal commented 9 months ago

I'm not sure what unintended consequences there might be if we start compressing &nbsp; like other space. Certain apps may use it, e.g. Google Docs/Sheets. Although I suppose we could decide to not compress in an editable area.

MelSumner commented 9 months ago

"I'm not sure what unintended consequences there might be if we start compressing &nbsp; like other space."

This is what I'm thinking about too. I would think about the backwards-compatibility of this, perhaps.

cookiecrook commented 9 months ago

Thanks all. Closing this issue and I’ll work on some white space normalization helper methods for the WPT tests.

accdc commented 9 months ago

Apologies I'm coming in late on this, been out sick for a while.

"Should I normalize interior whitespace (greedy whitespace regex replaced with a single space) to get these WPT tests to pass?"

I recommend this for AccName, mainly because I remember when Chrome didn't use to do this and it caused all sorts of weird issues when trying to read individual controls like buttons and the like with a screen reader.

E.g., at the time, when newline chars were present within a button, it appeared as though there were multiple buttons when arrowing down the page with JAWS, because the role was appended to the announcement of each line of text, making it sound like one button was weirdly split into many.

In another example, due to source code formatting, another label was preceded by 20 tab chars, all of which appeared in the accessible name, which was problematic for word and character navigation. I can see the same issue possibly with the non-breaking space char as well in some edge cases.

I recommend keeping it simple and reducing whitespace to be as sensible as possible.

cookiecrook commented 9 months ago

Sounds like you're describing a real implementation problem in the browser or screen reader. If something still exists like that, please write up a new AccName "test needed" issue and I'll work on getting a WPT test written so we can at least rule out the implementations as the root of the problem.

The consensus in this issue was to normalize the testing in most cases (where the extra space doesn't matter)… It won't change any implementations... It will just not throw a FAIL message if one of the engines has more than one space between the two words in "button label" for example.

cookiecrook commented 9 months ago

PR is ready for review: https://github.com/web-platform-tests/wpt/pull/42407

accdc commented 9 months ago

Thanks, sounds good to me.

Thankfully the weird cases I mentioned were many years ago and were fixed back then so I haven't seen any lately. If it comes up though I'll work on a test case for it.

cookiecrook commented 9 months ago

Also of note is that @zcorpan reminded me the ARIA WG settled on HTML's definition of ASCII Whitespace, which does NOT include non-breaking space. So the PR is now updated to match.
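
To make the ASCII whitespace distinction concrete, here is a minimal Python sketch (not the code in the PR; `normalize_ascii_whitespace` is a hypothetical name). It assumes the ASCII whitespace set is tab, line feed, form feed, carriage return, and space, and uses an explicit character class instead of `\s`, because `\s` would also match a non-breaking space.

```python
import re

# ASCII whitespace: TAB, LF, FF, CR, SPACE. U+00A0 is deliberately absent.
ASCII_WHITESPACE = "\t\n\f\r "
_ASCII_WS_RUN = re.compile(r"[\t\n\f\r ]+")


def normalize_ascii_whitespace(name: str) -> str:
    """Collapse runs of ASCII whitespace to one space and trim both ends.

    Hypothetical helper for illustration only. Non-breaking spaces are
    left untouched, both in the interior and at the ends.
    """
    return _ASCII_WS_RUN.sub(" ", name).strip(ASCII_WHITESPACE)


# Regular whitespace runs collapse to a single space.
assert normalize_ascii_whitespace("button \n\t label") == "button label"
# A non-breaking space survives as-is.
assert normalize_ascii_whitespace("button\u00a0label") == "button\u00a0label"
```

Under this definition, `<button>button&nbsp;label</button>` would still be expected to yield a name containing U+00A0, so a test for that case would compare against the non-breaking space explicitly.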