AG WG Review: 4.1.1 Parsing WCAG2ICT guidance

maryjom commented 9 months ago

Summary

Review of proposed WCAG2ICT guidance for 4.1.1 Parsing, as incorporated into the Editor's draft. This is a rewrite of the content that can be found in the 2013 WCAG2ICT document's guidance for 4.1.1 Parsing, as WCAG 2.2 made this obsolete and removed it. WCAG 2.0 and 2.1 made it always "satisfied" for XML and HTML. The TF has taken a similar approach for non-web documents and software in our updated section on 4.1.1 Parsing.

Maturity Level

Refining

Action Needed

Review content
Use the thumbs up and thumbs down icons to indicate agreement or disagreement with various comments. Do not add comments to simply support other comments.
Note any suggested edits for improvement and/or reasoning either in a comment on this issue or by creating a PR on the WCAG2ICT document. All SCs can be found in the comments-by-guideline-and-success-criteria.md file.

Background information

The WCAG2ICT Task Force concluded that 4.1.1 Parsing needs to include guidance for all WCAG 2 versions, as there is no other Working Group Note that covers WCAG 2.1 and this document will replace the 2013 Note. Such guidance is needed, as worldwide policies in the US and abroad will likely be using any of the three WCAG versions (2.0, 2.1, or 2.2) for some time. This means guidance is needed for all three. The only SC that is different is 4.1.1 Parsing, as you'll note in the content the TF approved for the Editor's draft. The title of the Note will be changed to "Guidance on Applying WCAG 2 to Non-Web Information and Communications Technology" and other appropriate edits will soon be made to the document to reflect this change.

Content for Review

Section Applying SC 4.1.1 Parsing to Non-Web Documents and Software.

Reference Material

In case you need them, links to various resources on 4.1.1 Parsing

WCAG 2.0 and 2.1 errata for Parsing can be found in the WCAG 2.0 editorial errata #13 and the WCAG 2.1 editorial errata - last bullet
WCAG 2.2 4.1.1 Parsing

mbgower commented 9 months ago

The only consideration I thought I'd flag is that your addition uses "success criterion" lower case, while the existing (not always consistently applied) style for WCAG is first letter capitalized ("Success Criterion"). If you are consistent with how you style this in all WCAG2ICT content, that is probably acceptable, but it's worth validating.

maryjom commented 9 months ago

@mbgower It was unclear from looking at WCAG if there was still a style being applied where "Success Criterion" was being capitalized, as it is not in a number of places in WCAG 2.2 and its supporting documents. Will confirm with AG WG chairs one way or the other. I already have an editor's task in Issue #255 with this as one of the listed items to check and potentially fix.

The thumbs up would go on the very first comment on this issue at the bottom. See Phil's and you simply click on that to add yours.

dbjorge commented 8 months ago

I 👎 ed based on the preamble sections of the document being a bit ambiguous about whether the intent of the document is to cover "strictly the current state as of WCAG 2.2" vs "some combination of the requirements from WCAG 2.1 and WCAG 2.2". The title suggests the former, but then the "Status of the Document" opens with:

This is a Technical Report on Applying WCAG 2.2 to Non-Web Information and Communications Technologies (WCAG2ICT). The intent of this work is to update the existing guidance based on new WCAG 2.1 and 2.2 success criteria.

If the intent is only to cover 2.2, I think this proposal is reasonable (but the language in the preamble needs to be clarified).

If the intent of the document is to cover both how to apply WCAG 2.1 and WCAG 2.2, I don't think it's reasonable to omit discussion of 4.1.1 completely; the language by which we de-facto dropped the requirement in WCAG 2.0/2.1 for HTML content was intentionally very specific to HTML content, and likely wouldn't be reusable as a means to drop support generally for all non-web technologies. You'd need to assess the implementation technology to understand whether the same exception logic applies, I think.

maryjom commented 8 months ago

We have PR #264 in progress (not yet merged into the document, as I need the time to do a good review before merging). This WCAG2ICT note will be renamed and will clearly state that it applies to all versions of WCAG 2. See Issue #260. So @JAWS-test and @dbjorge are you still a thumbs down since that will be merged in soon? This won't go out for publication again until these tasks are completed.

maryjom commented 8 months ago

@dbjorge and @JAWS-test there was already a long discussion about your concern about 4.1.1 Parsing not applying to non-web technologies in the Task Force work done to analyze and propose the content for this SC. See Issue #241 comment thread. We know of no non-web technologies that are developed in markup language where the AT parses through the markup for the accessibility information. I welcome any concrete examples of this. IMO, I've never in my experience needed to fail something for 4.1.1 Parsing that didn't already fail some other SC where an accessibility error can be reported. If you have examples of that, please provide us with them.

dbjorge commented 8 months ago

are you still a thumbs down since that will be merged in soon?

It sounds like the intent is for the document to include the coverage of how it applies to 2.0/2.1, so yes, I am still 👎 for the WCAG 2.0/2.1 guidance specifically.

there was already a long discussion about your concern about 4.1.1 Parsing not applying to non-web technologies in the Task Force work done to analyze and propose the content for this SC. See Issue https://github.com/w3c/wcag2ict/issues/241 comment thread.

Thanks for the pointer! I've read through this thread, but it looks like much of the discussion there was from before the decision was taken in #260 to include 2.0/2.1 guidance. Fundamentally, I'm repeating and agreeing with the point that @JAWS-test made in https://github.com/w3c/wcag2ict/issues/241#issuecomment-1771570960 - it looks like the responses to that comment at the time were based on expecting that 2.0/2.1 guidance would not appear in the final document, but it appears that that's changed, which leaves that comment still-unaddressed. Particularly, I do not think this concern has been addressed by either the current proposed text or https://github.com/w3c/wcag2ict/issues/241#issuecomment-1791209698.

I think @JAWS-test already worded my concern very well:

If I understand correctly, the W3C has decided to remove 4.1.1,

because the HTML specification now clearly states how syntax errors are to be corrected.
because the AT does not analyse the source code, but accesses the API.

As soon as one of the two conditions does not apply, 4.1.1 would continue to be relevant.

It sounds like the proposed guidance here is suggesting that "we don't know of any non-web examples" is enough for us to say "you can ignore the requirement"; I disagree with that interpretation, I think that's going too far towards wcag2ict inventing a substantive wcag exception.

I would be more comfortable with something that replaces this note from the WCAG 2.1 errata:

NOTE This Success Criterion should be considered as always satisfied for any content using HTML or XML.

...with a non-web version along these lines:

NOTE This Success Criterion should be considered as always satisfied for any content using markup languages that meet the following criteria:

The markup language's specification unambiguously defines how user agents and platforms must handle all of the types of errors covered by 4.1.1, AND
The markup language's specified error handling behavior is accessibility supported.

As a concrete example, MathML is a markup language which ATs such as MathPlayer directly parse the language of; it is built on top of XML, but it implements its own concept of "id" over what XML specifies, so I don't think it would be right to consider it to be "XML" for the purposes of the original web NOTE text. The MathML spec is pretty vague about how to handle some of the types of errors 4.1.1 would cover. But to be clear, I don't think discussion of any one specific example matters too much here, so I'd rather not get into the weeds of whether we think MathML specifically applies or not - my concerns are more that I don't think wcag2ict should be making an unqualified blanket exception that WCAG did not include.

maryjom commented 8 months ago

It is my understanding that MathML is another web-based technology. WCAG2ICT does not interpret anything for the web. Is that incorrect?

dbjorge commented 8 months ago

Like HTML, MathML is a technology whose primary goal is for use on the web, but which can also be used within non-web software. My understanding (based on, eg, the first note under applying 1.4.12) is that when a primarily-web language is used to implement non-web software, it does fall under WCAG2ICT's scope.

And, again:

But to be clear, I don't think discussion of any one specific example matters too much here, so I'd rather not get into the weeds of whether we think MathML specifically applies or not - my concerns are more that I don't think wcag2ict should be making an unqualified blanket exception that WCAG did not include.

GreggVan commented 8 months ago

I suggest the following to resolve this. This would be our response to 4.1.1

"In places where markup languages are used outside of the web, and where assistive technologies read the markup languages directly, then 4.1.1 would apply to software and non-web documents as written in 2.0 and 2.1, with content replaced with software and non-web documents. NOTE: 4.1.1 is no longer required for web content because assistive technologies no longer access the markup languages directly but rather use the browsers, which repair the content markup errors that caused problems for assistive technologies."

patrickhlauke commented 8 months ago

"In places where markup languages are used outside of the web, and where assistive technologies read the markup languages directly, then 4.1.1 would apply to software and non-web documents as written in 2.0 and 2.1, with content replaced with software and non-web documents. NOTE: 4.1.1 is no longer required for web content because assistive technologies no longer access the markup languages directly but rather use the browsers, which repair the content markup errors that caused problems for assistive technologies."

the vast majority of software generally just integrates a ready-made browser/rendering engine (either directly, or just relying on the platform's available webview components) like chromium or webkit, so that repair also generally happens in non-web ICT ... so the note would at least need tweaking. related, when doing an assessment of software, it will be difficult for an auditor to determine whether the software is doing its own reading/interpreting of a markup language or whether it's using one of those rendering engines, so this will likely cause issues. and assuming we're talking about markup languages that are packaged up (i.e. the software bundles them internally somehow), auditors may not even be able to determine what is and isn't markup-language-based, nor have access to that markup language to validate it.

what if software uses, say, xml based configuration files or similar. are these now covered by 4.1.1? a straight reading of that first part would suggest so, but the xml isn't exposed as such to users...

GreggVan commented 8 months ago

Ah you didn't read what I wrote carefully. that is covered. the language I propose only applies where AT reads the code directly --- if the applications use browser components and a DOM - and the AT reads the DOM -- then no problem and the language still stands (and the SC is met). If they don't - then the language requires that the content be correct in order to conform

patrickhlauke commented 8 months ago

i edited my comment above to expand/clarify further, so please revisit that...

patrickhlauke commented 8 months ago

@dbjorge i know you didn't specifically intend to discuss mathml, but in the scenario you mention: what is the subject of the wcag evaluation? the mathml player software? or the mathml document? it would likely only make sense in the latter case? and are you looking at it through the lens of "it's not being delivered over the web, but somehow accessed locally, so it's not web content"?

4.1.1 really only seems to make sense when the subject of the evaluation is a file that is read by an interpreter/player, unless i'm missing something, and THEN it's only non-web ICT when not delivered via / accessed from the web directly (perhaps it's been downloaded from the web first, and then the local file has been opened. or it came on a USB stick or something?)

GreggVan commented 8 months ago

you lost me. We are either talking about software or non-web docs right? (since this is WCAG2ICT).
1) so if you are using markup in a doc - it would be whether all the readers of the doc have a DOM and that is what all the AT use. If not - then you need to follow 4.1.1
2) if you are software and use markup in your software (dialogs or help or somewhere) -- then if AT access that markup directly - you need to follow 4.1.1

make sense?

patrickhlauke commented 8 months ago

if you are software and use markup in your software (dialogs or help or somewhere) -- then if AT access that markup directly - you need to follow 4.1.1

software exposes its interface, content, etc to the platform's accessibility API/layer (think MSAA, UI Automation, etc), and AT then interface with the software through that. what scenario is there in software today where markup is used inside the software, and AT somehow access that directly, rather than accessing whatever the software first interpreted/does with it and then mediates to the accessibility layer?

maryjom commented 8 months ago

@patrickhlauke Thanks. You explained my point much better than I had. While some support trying to show how this SC would need to be applied, IMO really it doesn't need to be applied.

In WCAG, the stated benefit of 4.1.1 Parsing is:

Ensuring that Web pages have complete start and end tags and are nested according to specification helps ensure that assistive technologies can parse the content accurately and without crashing.

Any problems in a non-web document or software markup would be revealed through errors in the accessibility information produced/exposed through the software accessibility API layer. Non-web technologies don't have the problem that Web had with ATs directly parsing the markup or accessing some DOM that was created using the markup. The potential for the AT to crash due to poorly formed markup would be nonexistent as well.

Instead, non-web documents and non-web software would fail 4.1.2 Name, Role, Value or 1.3.1 Info and Relationships because the accessibility structure, names, and attributes wouldn't be correctly exposed through the accessibility API. IMO, it is moot whether the cause is improper ID usage, duplicate attributes, or start/end tags, or the lack of correction to markup tagging by some user agent or reader software, or some intermediate layer that translates the information for the OS accessibility API. Can anyone provide evidence or examples of non-web documents or software failing 4.1.1 Parsing since the publication of the 2013 WCAG2ICT?

In our guidance, perhaps we should simply say that this SC should not be applied to non-web documents or software based on the logic above and leave it at that.

maryjom commented 8 months ago

@dbjorge @WilcoFiers @JAWS-test I'm still not quite understanding the need to apply this SC to non-web software and documents based on the two comments right before this one. (Patrick's and mine). I would really like to take this to an email conversation or a meeting to discuss this further and reach resolution to bring to the AG WG if at all possible, but I only have the email address of Wilco. If we cannot resolve via email, the topic will be brought up in the next AG WG meeting on 9 January.

dbjorge commented 8 months ago

Per meeting followup: I'd be more satisfied with Gregg's proposal than with the current text, but I don't think "does the AT parse markup language directly" is a condition that's very well-supported by the WCAG language. I think even better would be something along these lines, where the first sentence remains as-is (per the discussion clarifying that there are scenarios where non-web software might use HTML/XML), and the second sentence is combining the language from the first and second notes in the 2.0/2.1 WCAG errata, rather than creating new language:

This Success Criterion should be considered as always satisfied for any content using HTML or XML.

This Success Criterion should also be considered as always satisfied for content using any other markup language where the markup language's specifications contain specific requirements governing how user agents must handle incomplete tags, incorrect element nesting, duplicate attributes, and non-unique IDs.
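
For illustration only, here is a minimal Python sketch of what checking the four error types named above could look like for XML-based markup. It assumes the raw markup is available to the tester as a string, which may not hold for packaged non-web software, and it assumes that an attribute literally named "id" is the identifier to check (a given markup language may define IDs differently). `check_parsing` is a hypothetical helper name, not part of any WCAG tooling. The standard library's expat-based parser already rejects incomplete tags, incorrect nesting, and duplicate attributes as not well-formed, so only non-unique IDs need a separate pass.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from typing import List


def check_parsing(markup: str) -> List[str]:
    """Return 4.1.1-style problems found in an XML string (hypothetical helper)."""
    try:
        # expat raises ParseError for unclosed tags, mis-nested elements,
        # and duplicate attributes on one element.
        root = ET.fromstring(markup)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]

    # Well-formed XML can still repeat an "id" value, so count them separately.
    # Assumption: the attribute named "id" carries the identifier.
    ids = Counter(
        el.get("id") for el in root.iter() if el.get("id") is not None
    )
    return [f"duplicate id: {value}" for value, count in ids.items() if count > 1]


if __name__ == "__main__":
    print(check_parsing('<doc><p id="x"/><p id="x"/></doc>'))
    print(check_parsing('<doc><p></doc></p>'))
```

A check like this only reports what one parser does with the markup; it says nothing about how a given user agent or AT would repair or expose the same errors.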

mbgower commented 8 months ago

I guess my possibly naive response to everything I heard on today's call was: 'okay, let's say the AT does read the MathML directly, and it results in a user getting the wrong information. Does that not still get covered by 4.1.2 or 1.3.1 or another pertinent SC other than 4.1.1? Is there a known situation where MathML or some other technology in a non-web context is not parsed properly, does not provide correct information to the user AND cannot be failed under another criterion?' Until such time as we identify such a situation, it seems to me like we're possibly over-thinking this. (As well, as questioned earlier, how many would have the chops to carry out the technical assessments to make these calls?)

JAWS-test commented 8 months ago

@mbgower 4.1.1 was never about something being interpreted incorrectly and output incorrectly by AT due to incorrect syntax - because that could always have been assessed as a violation of SC 4.1.2, 1.3.1 or others, even in the days of WCAG 2.0 16 years ago. The issue with 4.1.1 has always been that incorrect syntax that does not lead to an error for me can lead to errors for someone else that I cannot detect in the test because I do not have the other person's browser and AT. Other people may have different browsers or AT or older or newer ones - and 4.1.1 should ensure that everyone gets the same correct output regardless of browser and AT. This is now guaranteed in HTML and SVG, as all browsers repair incorrect syntax in the same way. The question to be answered is therefore not whether such errors in other languages can be assessed under other SCs, but whether I can reliably find the errors in my specific environment in languages other than HTML and SVG.

maryjom commented 8 months ago

Apologies in advance for the length of this comment, but I went back through the meeting minutes on the discussion to draw in particular points and questions to respond to. I am not as quick at listening and responding in a meeting.

To test and thus fail this SC in a non-web context, the markup would need to be exposed in such a way that a test tool could directly access it (or its corresponding DOM) to test for correct markup. In a Web context, the DOM is open and easily available to access for testing purposes. I don't think the openness is as prevalent (if it exists at all) in a non-web context. The content is a black box that can only be tested by probing the accessibility API (if possible) or by using a screen reader. Any errors detected would fall into 1.3.1 Info and Relationships and 4.1.2 Name, Role, Value, as the tester would not know it originated from bad markup.

Windows: We heard in the meeting that Microsoft Windows does not expose the DOM created from the markup in Electron or Chromium-based apps, so no matter the screen reader, the programmatic information comes through the Accessibility API. In this use case, 4.1.1 would not apply (automatic pass).
iOS (and maybe also macOS): For the example given in the meeting of an application implemented with embedded MathML, Dan indicated there is a DOM that the VoiceOver AT accesses directly. @dbjorge, do you know if the DOM created by user agents rendering MathML in non-web applications is available to directly access and test? If so, do such test tools exist? Or does one need to gain access to the MathML content by itself (before it is integrated) to test it with syntax checker tooling? Apple's tech is typically more proprietary and not as open as other platforms, so I am curious about this.

In the AG WG meeting Dan raised the question:

What are the rules when web tech is used to view a non-web document, such as an electron app to view a PDF?

In this specific case, PDF documents are not implemented using a markup language. Andrew Kirkpatrick said 4.1.1 Parsing does not apply to PDF documents in his comment when the Task Force was analyzing this SC.

I'm not trying to give everyone a hard time about this, but what I do know is that WCAG2ICT guidance is getting a lot of scrutiny and questions from testers that have no idea how to test certain criteria in a non-web context. Developers and testers don't know how to determine whether a particular criterion truly applies.

In the AG WG meeting another question was raised by DJ:

dj: What about applications that embed pieces of HTML but aren't written in HTML (RSS reader for example)?

IMO (and others can chime in on this), you should treat the application as a single non-web software application unless you can specifically use web-based test tools on the pieces of the application that are Web-based. This problem has existed since hybrid mobile apps appeared, and we tell our test teams to treat such an app as non-web software when testing.

The reasoning behind that answer: When an application embeds markup-based content into non-web software, how does a test professional know this is the case? In today's integrated environment, they likely have no idea what parts are implemented using non-web software, web-based software, or markup. They simply test the UI and test with a screen reader and/or accessibility API probing tool to ensure accessibility information is available. When testing using this methodology, they would not report issues found against 4.1.1 Parsing.

If the only way to satisfy concerns is to have notes similar to Gregg's, I would like to indicate that cases where this SC would apply to non-web tech are rare with Gregg's caveat that one would need access to test the markup or resulting DOM to find errors attributable to this SC.

patrickhlauke commented 7 months ago

just want to heavily +1 @maryjom last comment above...

benja11y commented 7 months ago

As a concrete example, MathML is a markup language which ATs such as MathPlayer directly parse the language of; it is built on top of XML

I believe that this is due to the fact that it still uses IE (in IE8 mode) if I'm reading the documentation for MathPlayer correctly. Wouldn't that make this example a main WCAG issue rather than a WCAG2ICT issue?

MathML looks to be fully available in Chrome's accessibility tree and available to AT via UIA (confirmed using Microsoft inspect).

(Note to @dbjorge and chairs, my in-app attempt to "quote reply" resulted in me inadvertently editing @dbjorge's comment, don't think I/we should really have the permission to do that! I've attempted to revert the damage I did to it...!)

bruce-usab commented 7 months ago

Please no one worry hard about WCAG2ICT guidance around 4.1.1 for the sake of U.S. Section 508. The next edition of the ICT Testing Baseline for Web already has no testing necessary.

rscano commented 7 months ago

There is a typo in the text:

WCAG 2 has made this success criterion obsolete and removed it as a requirement in the standard. Therefore, the interpretation of this success criterion for non-web documents and software has been removed.

WCAG 2 should be changed to WCAG 2.2. At this point I think it is important to say 2.2 instead of the WCAG 2 usually used in the document.

mbgower commented 7 months ago

WCAG 2 should be changed to WCAG 2.2. At this point I think it is important to say 2.2 instead of the WCAG 2 usually used in the document.

Agreed, it is only marked obsolete in 2.2, so this is a valid change.

maryjom commented 7 months ago

The typo is fixed in PR #298

mitchellevan commented 7 months ago

As a concrete example, MathML is a markup language which ATs such as MathPlayer directly parse the language of; it is built on top of XML

I believe that this is due to the fact that it still uses IE (in IE8 mode) if I'm reading the documentation for MathPlayer correctly. Wouldn't that make this example a main WCAG issue rather than a WCAG2ICT issue?

MathML looks to be fully available in Chrome's accessibility tree and available to AT via UIA (confirmed using Microsoft inspect).

MathPlayer documentation is not wrong, but it would be clearer to say modern browsers don't require MathPlayer to expose math markup. However, NVDA currently requires an add-on like MathPlayer or MathCAT even with modern browsers (https://github.com/nvaccess/nvda/issues/16036, https://github.com/nvaccess/nvda/issues/15352), I assume because NVDA does not yet support the math parts of UIA.

Wouldn't that make this example a main WCAG issue rather than a WCAG2ICT issue?

Yes, if MathML is an issue at all for 4.1.1, then it's an issue for WCAG in general not just for WCAG2ICT. If WCAG 4.1.1 has something to say about MathML then WCAG2ICT should just say "see WCAG". And if WCAG 4.1.1 is silent regarding MathML, then WCAG2ICT should likewise just say "see WCAG".

pday1 commented 5 months ago

Copying content from current editor's draft, and also including Gregg's comment (in italics) from December, we would then have the following for SC 4.1.1.

Applying SC 4.1.1 Parsing to Non-Web Documents and Software

WCAG 2.2 Guidance:

NOTE 1 WCAG 2.2 has made this success criterion obsolete and removed it as a requirement in the standard. Therefore, the interpretation of this success criterion for non-web documents and software has been removed.

WCAG 2.0 and 2.1 Guidance:

WCAG 2.0 and 2.1 are incorporated, either directly or by reference, into other standards. Therefore, the application of 4.1.1 Parsing to non-web documents and software is to follow the new guidance provided in the WCAG 2.0 Editorial Errata and the WCAG 2.1 Editorial Errata which states the following:

This Success Criterion should be considered as always satisfied for any content using HTML or XML.

In places where markup languages are used outside of the web, and where assistive technologies read the markup languages directly, then 4.1.1 would apply to software and non-web documents as written in 2.0 and 2.1, with content replaced with software and non-web documents.

NOTE 2 As in Web content, 4.1.1 Parsing is not known to have any effect on the accessibility of non-web documents or software. There are no known examples of non-web documents or software that would have an issue such as those covered by 4.1.1 Parsing. Modern assistive technology does not parse document or software markup languages for accessibility information. User agents and platforms used to render non-web documents and software use platform accessibility APIs to present accessibility information to AT. Therefore, 4.1.1 Parsing would no longer be a requirement for accessibility.

NOTE 3 Where an existing standard requires 4.1.1 Parsing for non-web documents and software, this Success Criterion would be automatically satisfied.

maryjom commented 5 months ago

Per the discussion from the 22 March meeting, we are starting to work on the above language, but the notes seem to contradict the other content, so it needs more work.

maryjom commented 4 months ago

@dbjorge @WilcoFiers @JAWS-test

The WCAG2ICT Task Force recently agreed upon some changes to the guidance for 4.1.1 Parsing that we hope will be sufficient per your concerns and discussion with Chuck some time ago. Please review the changes proposed in PR 338. You can also read the updated guidance for 4.1.1 Parsing in-context in the built netlify document for the PR which can be found in the WCAG 2.0 and 2.1 Guidance part of Applying SC 4.1.1 to Non-Web Documents and Software.

If you have suggestions for further changes, it would be most helpful if you can suggest the exact changes that would be preferable. We are getting close to publishing another public draft for review and specific suggestions will help us prevent further delays to that publication.

maryjom commented 4 months ago

See #338 for proposed changes to address comments received. Email was received from both Wilco and Dan agreeing that this content change addresses their concerns. The changes were subsequently also approved by the TF on 16 May. I have opened Issue #364 for a 2nd review by the AG WG so the whole group can see what changes were made to this SC's guidance in WCAG2ICT.

Closing this issue.
