act-rules / act-rules.github.io

Accessibility conformance testing rules for HTML
https://act-rules.github.io/

HTML page lang and xml:lang match - problems with assumptions and applicability [5b7ae0] #1921

Closed dd8 closed 1 year ago

dd8 commented 2 years ago

This is a follow-on issue for https://github.com/act-rules/act-rules.github.io/issues/1172

We've done a lot of testing around lang and xml:lang, and there seem to be problems with both the rule's assumptions and its applicability.

Historically there was a lot of spec churn over the use of lang vs xml:lang, which must have caused many implementation inconsistencies, but this was resolved in HTML 5 in 2014 and implementations are now very consistent. The inconsistencies between Chromium and other browsers for pages served as application/xhtml+xml were resolved in Chrome 88, and all browsers now behave identically. Notably, Chrome 88 was released in January 2021, after this rule was created.

I think the key piece of text in the rule is:

Since most assistive technologies will consistently use lang over xml:lang when both are used, violation of this rule may not necessarily be a violation of WCAG 2. Only when there are inconsistencies between assistive technologies as to which attribute is used to determine the language does this lead to a violation of SC 3.1.1.

Rule problems

1) We weren't able to find any inconsistencies in AT voicing for pages served as text/html, but did find inconsistencies for pages served as application/xhtml+xml. Unfortunately the rule only applies to text/html, so I think it never detects true positives and only flags false positives.

2) The rule assumes that fr, fr-FR and fr-CA are interchangeable and all voice as French. This isn't true for NVDA with the default OneCore synthesiser (see below).
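
As a rough illustration of assumption 2, the comparison can be sketched as a primary-subtag match, under which fr, fr-FR and fr-CA all count as the same language. This is a minimal sketch, not the rule's actual implementation; the function names `primary_subtag` and `lang_attributes_match` are hypothetical.

```python
def primary_subtag(tag: str) -> str:
    """Return the lowercased primary language subtag, e.g. 'fr' for 'fr-CA'."""
    return tag.strip().split("-", 1)[0].lower()


def lang_attributes_match(lang: str, xml_lang: str) -> bool:
    """Hypothetical helper: True when both values share a primary subtag.

    Region subtags are ignored, which is exactly the assumption
    questioned in point 2.
    """
    return primary_subtag(lang) == primary_subtag(xml_lang)


if __name__ == "__main__":
    print(lang_attributes_match("fr", "fr-CA"))  # same primary subtag
    print(lang_attributes_match("fr", "de"))     # different primary subtag
```

A check like this treats lang=fr with xml:lang=fr-CA as matching, even though the two values can be voiced differently by NVDA with OneCore.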

text/html documents

There are no inconsistencies in assistive technologies for documents served as text/html. Test results here: https://www.powermapper.com/tests/screen-readers/content/html-page-lang-with-xml-lang/

application/xhtml+xml documents

Test results here: https://www.powermapper.com/tests/screen-readers/content/xhtml-page-lang-with-xml-lang/

This means the language used for voicing may not match the CSS language, so content displayed using the :lang() selector can be read with the wrong voice.

This example displays 'Un, deux, trois' and voices it correctly as French when served as text/html, but displays 'Eins, zwei, drei' voiced as French when the page is served as application/xhtml+xml:

        <html xmlns="http://www.w3.org/1999/xhtml" lang="fr" xml:lang="de">
        <head> 
            <title>Test for mismatching lang and xml:lang</title>
            <meta charset="utf-8"/>
            <link rel="stylesheet" href="SR-content-lang.css"/>
            <style>
                div:lang(fr)::before { content: "Un, deux, trois"; } 
                div:lang(de)::before { content: "Eins, zwei, drei"; } 
            </style>
        </head>
        <body>
            <h1 lang="en">Following elements inherit page language - hover to view CSS :lang()</h1>

            <p>garage</p>
            <p>double</p>
            <p>dame</p>
            <div></div>
        </body>
        </html>
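
The difference between the two MIME types comes down to how each parser sees the xml:lang attribute. The sketch below uses Python's standard library as a stand-in for the two parsing modes (it is not a browser): an HTML parser sees a plain attribute literally named "xml:lang", while an XML parser puts it in the XML namespace, where it takes priority over lang. The helper names `language_as_text_html` and `language_as_xhtml` and the shortened `DOC` string are assumptions for illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"

# Shortened version of the example above (hypothetical test string).
DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml" lang="fr" xml:lang="de">'
    "<head><title>t</title></head><body><div></div></body></html>"
)


class _RootAttrs(HTMLParser):
    """Collects the attributes of the first <html> start tag."""

    def __init__(self):
        super().__init__()
        self.root_attrs = None

    def handle_starttag(self, tag, attrs):
        if tag == "html" and self.root_attrs is None:
            self.root_attrs = dict(attrs)


def language_as_text_html(doc: str):
    """HTML parsing: 'xml:lang' is an ordinary attribute name with no
    language semantics, so only 'lang' is consulted."""
    parser = _RootAttrs()
    parser.feed(doc)
    parser.close()
    return (parser.root_attrs or {}).get("lang")


def language_as_xhtml(doc: str):
    """XML parsing: xml:lang lands in the XML namespace and is preferred
    over the no-namespace lang attribute."""
    root = ET.fromstring(doc)
    return root.get(f"{{{XML_NS}}}lang") or root.get("lang")


if __name__ == "__main__":
    print(language_as_text_html(DOC))
    print(language_as_xhtml(DOC))
```

Under these assumptions the same markup resolves to fr when parsed as HTML and de when parsed as XML, which is the split that CSS :lang() shows in the example.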

Subtag matching

Most screen readers will voice lang=fr, lang=fr-FR and lang=fr-CA as French if a French voice is installed. The exception is NVDA using the default OneCore speech synthesiser, which voices lang=fr and lang=fr-FR as French if the French (France) language pack is installed, but voices lang=fr-CA as English unless the French (Canada) language pack is installed. The same thing happens with lang=de, lang=de-DE and lang=de-AT and the language packs for German (Germany) and German (Austria).

Test results here: https://www.powermapper.com/tests/screen-readers/content/html-lang-subtags/

If you have a French (France) language pack installed, you can hear this happening in NVDA on https://www.canada.ca/fr.html if you use dev tools to change lang=fr to lang=fr-CA.

The problem doesn't happen with the legacy NVDA eSpeak synthesiser which maps all fr- subtags to the same robotic French voice.
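
The two behaviours described in this section can be modelled as two lookup strategies. This is a toy model of the observed results only, not NVDA's, OneCore's or eSpeak's actual code; the function names, the `installed` list and the `en-US` default are all assumptions.

```python
def pick_voice_region_sensitive(lang, installed, default="en-US"):
    """Toy model of the OneCore results: a tag with a region needs an
    exact language pack; a bare tag such as 'fr' accepts any pack for
    that language. Otherwise the default voice is used."""
    tag = lang.lower()
    for voice in installed:
        if voice.lower() == tag:
            return voice
    if "-" not in tag:
        for voice in installed:
            if voice.lower().split("-")[0] == tag:
                return voice
    return default


def pick_voice_primary_only(lang, installed, default="en-US"):
    """Toy model of the eSpeak results: only the primary subtag matters,
    so every fr-* tag maps to the same French voice."""
    primary = lang.lower().split("-")[0]
    for voice in installed:
        if voice.lower().split("-")[0] == primary:
            return voice
    return default


if __name__ == "__main__":
    packs = ["fr-FR"]  # hypothetical: only French (France) installed
    for tag in ("fr", "fr-FR", "fr-CA"):
        print(tag,
              pick_voice_region_sensitive(tag, packs),
              pick_voice_primary_only(tag, packs))
```

With only a French (France) pack in the list, the region-sensitive model falls back to the default voice for fr-CA while the primary-only model still picks French, mirroring the difference reported above.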

Jym77 commented 2 years ago

I've been crunching some numbers on our customer data. For us:

Given the low number of failures this catches in real life, and the problems the rule seems to be consistently causing, I'm in favour of just deprecating it and not spending any more effort on it…

I'm curious to see whether other tool vendors have similar numbers or greatly different ones 🤔 But if we all agree that the rule catches a problem on a few hundredths of a percent of all pages, I think it is not worth a lot of effort to maintain…

dd8 commented 2 years ago

I'd agree with deprecation. I'm assuming the numbers above apply to text/html (where AT behaviour is now consistent).

If the applicability was changed to application/xhtml+xml, where there are still inconsistencies, then the number of failures would be tiny, because application/xhtml+xml only accounts for 0.05% of all page loads (and only a small number of those would have mismatching lang/xml:lang): https://commoncrawl.github.io/cc-crawl-statistics/plots/mimetypes
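
To put a rough scale on this, the 0.05% share can be multiplied by whatever fraction of those pages has mismatching attributes. The 0.05% figure is from the statistics linked above; the mismatch fraction is unknown, so the 1% used in the example below is purely a hypothetical input, and `mismatches_per_million` is an illustrative helper name.

```python
from fractions import Fraction

# Share of page loads served as application/xhtml+xml (0.05%, from the thread).
XHTML_SHARE = Fraction(5, 10_000)


def mismatches_per_million(mismatch_fraction: Fraction) -> Fraction:
    """Expected failing pages per million page loads, given the
    (hypothetical) fraction of XHTML pages with mismatching attributes."""
    return XHTML_SHARE * mismatch_fraction * 1_000_000


if __name__ == "__main__":
    # Upper bound: every XHTML page mismatches.
    print(mismatches_per_million(Fraction(1)))
    # Hypothetical: 1% of XHTML pages mismatch.
    print(mismatches_per_million(Fraction(1, 100)))
```

Even the upper bound, where every XHTML page fails, is capped at the 0.05% share itself.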

As a general point, rules that detect specific AT behaviours (e.g. inconsistencies between implementations) need constant re-testing because AT behaviours change over time (bugs are fixed, clarifications are added to specs, etc).

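Edit: This also poses problems for rule stability / reliability over time. For rules that detect AT behaviour:

  • either the rule changes as AT changes (so the rule isn't stable) or
  • the rule stays stable but starts producing false positives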

Jym77 commented 2 years ago

Edit: This also poses problems for rule stability / reliability over time. For rules that detect AT behaviour:

  • either the rule changes as AT changes (so the rule isn't stable) or
  • the rule stays stable but starts producing false positives

We touched on this during the last CG call. In short, the ACT rules TF reviews rules on a yearly basis and makes sure that they stay up to date with technologies. It probably makes their job easier if we (= rule writers) clearly mark the inconsistencies we found in the Accessibility Support section.

dd8 commented 2 years ago

Definitely agree that the inconsistencies should be documented, otherwise you run into the same problem that happened when trying to deprecate WCAG 4.1.1: https://github.com/w3c/wcag/issues/770

There was no documentation on which AT was affected by 4.1.1, so it's very hard to tell whether it still does anything useful 20 years later.

carlosapaduarte commented 2 years ago

Resolution from CG meeting: deprecate this rule