Open davidofyork opened 3 months ago
- speed limit is “not too fast”
- stairs should support “a lot of full sized adults on them at the same time”
We can't put things in as requirement if we don't have a pass/fail criterion that has a high inter-rater agreement when you are not at or very near a boundary condition.
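For readers who want a concrete handle on "inter-rater agreement": the thread does not name a metric, but one common choice is Cohen's kappa, which corrects raw agreement between two raters for the agreement expected by chance. The sketch below is illustrative only; the function name and the verdict lists are hypothetical and not drawn from any WCAG test data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' verdicts on the same list of items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty verdict lists")
    n = len(rater_a)
    # Share of items on which both raters gave the same verdict.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's verdict frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters gave one identical verdict throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical pass/fail verdicts from two testers on eight samples.
tester_1 = ["pass", "pass", "fail", "fail", "pass", "fail", "pass", "pass"]
tester_2 = ["pass", "pass", "fail", "pass", "pass", "fail", "pass", "fail"]
kappa = cohens_kappa(tester_1, tester_2)
```

Here the two testers agree on six of eight samples (75%), but because both pass most samples, a good share of that agreement would be expected by chance, so kappa comes out well below 0.75.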
Replacing numeric thresholds with comparatives in your examples is a false equivalence and a rhetorical distraction.
re @electronicwoft ???? false equivalence and a rhetorical distraction. ???
How is it a false equivalency? It was demonstrating quite clearly that regulations need to have measurable numbers, even if arbitrary. Why is 60 mph so much safer than 63 that we set it at 60? It isn't, but it is a definite threshold that everyone can basically agree on. However, as I also pointed out, when you get near the boundary (near 60), you will get lots of people measuring a car's speed differently. That is one reason they don't ENFORCE violations near 60: you must be going enough faster that everyone would agree you were "more than 60" by some margin before they enforce.
So how is it a false equivalency when I am saying they are not equivalent?
Also, what was rhetorical about it?
Please, let's stick to making points and counterpoints and not dismissing other people's comments as a distraction.
We make subjective judgements all the time and it's rarely a problem. It's required by the normative text of numerous WCAG success criteria, such as:
SC 1.1.1: "text alternative that serves the equivalent purpose". In many cases, we have no idea what the author's purpose was. Increasingly, the text alternative came from a digital asset management system, so there was no "purpose" when it was created. The tester has to decide what they think the purpose is.
SC 1.2.1, 1.2.2 and 1.2.3: "except when the audio or video is a media alternative for text and is clearly labeled as such". How would you define "clearly labelled" in objective terms that are measurable?
SC 1.2.5: "Audio description is provided for all prerecorded video content in synchronized media." In practice, it is rare for audio description to be provided for all content, even when using extended audio description. In many cases, it would be a really bad user experience. In some cases, it wouldn't even be possible to create audio description for all content. Instead, we create audio descriptions containing the important visual content, not all of it.
SC 3.1.3: "phrases used in an unusual or restricted way, including idioms and jargon".
Maybe the root of this argument is that some people advocate a de-skilled and automatable process that gives 100% repeatable results, while others advocate a skill-based process that is capable of producing better results but is subject to greater variability. The former sounds attractive, but you only have to look at the awful results produced by the Trusted Tester methodology to see how badly it can work. Removing subjective criteria hides the issue rather than solving it. In a fuzzy world, there isn't going to be a "right" answer and we're not all going to agree.
@GreggVan
Linguistic comparatives are not equivalent to numeric absolutes hence it being a false equivalency.
I thought that was self-evident, hence the rhetorical distraction - apologies.
Success criteria are not numerical expressions and most of them would be utterly worthless if they were formulated as such.
It's time to accept that a little bit of ambiguity is inevitable.
The question then becomes how best to minimise the effects of that ambiguity without forcing every success criterion to be logically consistent or formulaic even though their object cannot be effectively quantised or absolutised.
Not every success criterion can be 1.4.3 nor should they be.
Nor do success criteria require absolutes to be reliably, consistently, and successfully tested.
As stated above, the best way to minimise the effects of ambiguity is to provide definitions, preferably normative ones, to guide practitioners towards achieving outcomes as consistently as possible.
So let's get back to how best to disambiguate 2.x?
Related is #4036
When any topic reaches a certain number of comments, I find it becomes so weighted down with side comments that it becomes difficult to resolve. I believe this has occurred here.
In an effort to address (and possibly resolve?) this issue, I will attempt to list what I'm hoping are guiding facts, which everyone can agree on:
In other words, these changes were well scrutinized and supported.
While this is a more subjective statement, it's also worth pointing out that the wording changes in the PR were not large-scale nor did they introduce new concepts in the Understanding document.
Finally, I challenge the narrative there was a pre-existing accuracy/correctness factor to this SC that was substantively altered by this change and needs to be "restored". Such language clearly does not exist in the normative text. A careful scrutiny of the PR challenges the assertion that it existed within the understanding document. The phrase "clear and unambiguous" was removed in two of the benefit examples which I suspect is at the heart of this issue. Yet it is debatable whether this phrase is referring to the content of a label or its placement/affiliation. Hopefully it's obvious that sufficient techniques like G162: Positioning labels to maximize predictability of relationships and G167: Using an adjacent button to label the purpose of a field challenge the interpretation that this (or the word "identify" in the definition of label) is mainly or solely about semantics/language.
This issue is proposing altering the normative text of 3.3.2 to "make an explicit expectation of accuracy". Put simply, we cannot do this. The task force is prevented from proposing any normative changes that alter the requirements of the published standards.
Further, I believe it has been demonstrated that such a change is unnecessary. Any lack of accuracy or clarity in a label can be failed against 2.4.6. That is not new guidance. It is clearly stated in the last line of the pre-existing Intent section in the 3.3.2 Understanding document (as well as elsewhere), which I quoted in my draft working group response.
Beyond proposing this normative change, the issue argues that if there is no qualitative measure for 3.3.2, there is no point to it:
By removing any need for clarity or quality or accuracy in 3.3.2, then what is the benefit of conforming to this SC?
However, it is clearly illustrated in both language, examples and techniques that the presence of a label allows a number of other considerations to take place, namely:
A missing input label can clearly be failed against 3.3.2. A missing label cannot easily be failed against the language of either 2.4.6 (whose sole technique has a test that only applies "For each interface with a label") or 1.3.1 (which relies on a label existing in the presentation in order to enforce a programmatic relationship).
As in my draft response, I acknowledge that any task force work is constrained not just by its work statement, but by the reality of an imperfect standard made more imperfect by a lot of historical authors writing inconsistent non-normative guidance over more than a decade. In my opinion, these adopted changes to the 3.3.2 Understanding document lead to more consistent inter rater reliability: Inaccurate or poorly described labels are reported against 2.4.6; missing labels are reported against 3.3.2.
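The reporting split described above can be sketched as a small triage function. This is a minimal illustration, not an official test procedure: the `LabelFinding` type and its field names are hypothetical, and it only models visible labels (3.3.2 can also be met by instructions).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LabelFinding:
    """A tester's observations about one form control (hypothetical model)."""
    visible_label_present: bool
    # Human judgements; left as None when there is no label to assess.
    label_is_descriptive: Optional[bool] = None
    programmatically_associated: Optional[bool] = None


def criteria_to_report(finding: LabelFinding) -> List[str]:
    """Return the success criteria a finding would be reported against."""
    if not finding.visible_label_present:
        # A missing label is reported against 3.3.2; the other two
        # checks presuppose that a label exists.
        return ["3.3.2"]
    failures = []
    if finding.label_is_descriptive is False:
        failures.append("2.4.6")  # inaccurate or poorly described label
    if finding.programmatically_associated is False:
        failures.append("1.3.1")  # label not programmatically related
    return failures


missing = criteria_to_report(LabelFinding(visible_label_present=False))
vague = criteria_to_report(
    LabelFinding(True, label_is_descriptive=False,
                 programmatically_associated=True)
)
good = criteria_to_report(
    LabelFinding(True, label_is_descriptive=True,
                 programmatically_associated=True)
)
```

Under this model each kind of label problem lands on exactly one criterion, which is the consistency argument being made for the current guidance.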
If after taking in this response, @davidofyork or anyone else still feels a change is needed, I encourage them to create a PR, which can then enter into the same WCAG 2 process I outlined above.
Alternatively, as I've already noted, I strongly agree with the tactic of using accuracy as a measure in WCAG 3, and I also encourage you to become more involved in helping bring that to fruition.
Thanks for the comprehensive summary, @mbgower! I agree, with a discussion of this length, not to mention the numerous distractions, it’s easy to lose focus.
What you and others still seem to be overlooking is the specific role and intended purpose of SC 3.3.2 within the overall framework of WCAG.
As I’ve pointed out several times in this thread, 3.3.2 falls under:
SC 3.3.2 was designed with a purpose, an intention.
This was captured in the stated goal that "Users know what information to enter" and the intent "to have content authors present instructions or labels that identify the controls in a form so that users know what input data is expected." Presumably those words were well scrutinized and supported too?
Suggesting that the excised phrase “clear and unambiguous” or the word “identify” in the WCAG definition of a label referred not to the content of a label but to its placement/affiliation is a spurious claim that, again, is at odds with the stated goal and intention of 3.3.2.
SC 3.3.2 has a job to do – to help users avoid and correct mistakes within the context of an understandable interface.
It certainly wasn’t designed to be a mere prerequisite to other SCs.
As for what should be done about this, there are various options. In no particular order:
This issue doesn’t actually propose altering the normative text, it concerns revisions to the non-normative Understanding document. I understand the normative text is written in stone and cannot be adjusted.
This certainly is possible and #4056 proposes how it could be achieved. It would result in some overlap between 3.3.2 and 2.4.6, which I know some members are concerned about, but personally I don’t think it would be as confusing as people fear. A failure of 3.3.2 due to an inaccurate label would result in an automatic failure of 2.4.6.
The current definition of ‘label’ states: “text or other component with a text alternative that is presented to a user to identify a component within Web content.” As @jared-w-smith pointed out, the need to “identify a component” should be enough to make the case for accurate labels, but I’m aware some will claim that a component can still be identified from a meaningless ID number (or proximity to a label). So perhaps the definition could be tweaked to reflect a need for accuracy? Note: This is the approach taken with 1.2.5 Audio Description (Prerecorded), which concerns the existence of audio description but includes a linked definition of audio description that implies a need for quality.
This would be difficult to achieve within the constraint of the normative text. I did consider whether 2.4.6 could focus on more elaborate, descriptive headings and labels but that is challenged by the statement “Labels and headings do not need to be lengthy.”
Having heard members’ concerns about the subjectivity and value judgements required by 2.4.6, with one member implying that the SC was a mistake, I mooted whether 2.4.6 should be deprecated in the manner of 4.1.1. However, I understand this would be highly unlikely to gain traction and it's doubtful we'll see another 2.x version of WCAG.
This is a very subtle but achievable solution. A similar approach is taken in 1.2.2 Captions (Prerecorded), where the normative/non-normative text is concerned only with the existence of captions. However, one of its associated failures (F8) concerns the quality of captions, suggesting that the SC is also concerned with quality.
As I suggested in #4056, this could be something along the lines of: “While this Success Criterion requires that controls and inputs have labels or instructions, it is crucial that these labels or instructions are not only present but also accurate.”
This is the least disruptive approach and the easiest to achieve. It wouldn’t enforce accuracy but it would at least acknowledge and restate the goal and intent of 3.3.2.
I urge members to evaluate these options now rather than postponing the consideration of accuracy to future guideline versions. With WCAG 3 still far off, the issues we face today will continue to affect us for some time.
It certainly wasn’t designed to be a mere prerequisite to other SCs.
What's wrong with an SC being a prereq for another? That's how many work. We are all agreeing there is overlap. A decision was made to implement minor tweaks in guidance to reduce that overlap.
In my prior comment I outlined where a tester/assessor would record various failings of label, be they due to absence, language, or programmatic connection (and the understanding document already does that as well). What is the actual negative outcome of that current guidance on an end-user's experience? What is negative from a tester or reporting perspective?
You seem to be concerned that more things will be reported under 2.4.6 and less under 3.3.2, but in what way is that actually a problem? Several of your proposed options include just reversing this relative emphasis -- or even completely eliminating 3.3.2. How is that any better?
Suggesting that the excised phrase “clear and unambiguous” or the word “identify” in the WCAG definition of a label referred not to the content of a label but to its placement/affiliation is a spurious claim that, again, is at odds with the stated goal and intention of 3.3.2
It is not a spurious claim to say there is ambiguity; there are aspects of this SC concerned with visual identification and presentation. How else to explain sufficient techniques like Positioning labels to maximize predictability of relationships?
Thank you for the PR. I have associated it with this issue and added it to the TF process.
@mbgower What's wrong with an SC being a prereq for another? That's how many work.
Really? Which ones?
I'm not aware of any SCs that serve no other purpose than to tee up a different SC.
You seem to be concerned that more things will be reported under 2.4.6 and less under 3.3.2, but in what way is that actually a problem? Several of your proposed options include just reversing this relative emphasis -- or even completely eliminating 3.3.2. How is that any better?
No, my intention (and the motivation behind my proposed options) is to ensure that SC 3.3.2 honors its parent guideline, 3.3 Input Assistance, by helping users avoid and correct mistakes, and supports its parent principle, by ensuring that information and the operation of the user interface is understandable. By making 3.3.2 about the mere provision of labels, it supports neither the guideline nor the principle.
Honestly, why is this so objectionable? I assume the WCAG framework of principles and guidelines was created for a purpose and that SC were allocated accordingly. So why wouldn't you want to uphold this?
I have no concerns about things being reported under one SC more than the other and I certainly haven't suggested eliminating 3.3.2.
Suggesting that the excised phrase “clear and unambiguous” or the word “identify” in the WCAG definition of a label referred not to the content of a label but to its placement/affiliation is a spurious claim that, again, is at odds with the stated goal and intention of 3.3.2
It is not a spurious claim to say there is ambiguity; there are aspects of this SC concerned with visual identification and presentation. How else to explain sufficient techniques like Positioning labels to maximize predictability of relationships?
Oh, there's no doubt that sufficient techniques like Positioning labels to maximize predictability of relationships will support identification of form controls (although note that G162 is only sufficient when used with G131: Providing descriptive labels). But my point was that the excised phrase “clear and unambiguous” or the word “identify” is concerned more with the content of a label. A logically positioned label will not alone identify a form control.
Thank you for the PR. I have associated it with this issue and added it to the TF process.
As stated above, at the very least I think the Understanding SC 3.3.2 document should include a note emphasizing the need for accuracy. It wouldn’t enforce accuracy, it shouldn't impact your attempt at addressing the overlap with 2.4.6, but it would at least acknowledge and restate the goal and intent of 3.3.2.
As stated above, at the very least I think the Understanding SC 3.3.2 document should include a note emphasizing the need for accuracy. It wouldn’t enforce accuracy, it shouldn't impact your attempt at addressing the overlap with 2.4.6, but it would at least acknowledge and restate the goal and intent of 3.3.2.
Bravo!
I'm not aware of any SCs that serve no other purpose than to tee up a different SC.
Just because you keep saying 3.3.2 serves no other purpose doesn't make it true. I and others have shown what its purpose is and how it can be used with the guidance as it now reads, and why the existence of a label is a critical thing. 3.3.2 can serve a purpose and serve as a prereq to something else. They are not exclusive. Much of the A > AA > AAA structure is constructed on a prereq model.
I assume the WCAG framework of principles and guidelines was created for a purpose and that SC were allocated accordingly. So why wouldn't you want to uphold this?
I believe the POUR model was conceived to make WCAG a neater package. Unfortunately, the allocation of SCs into a poorly normalized POUR model (in the data modelling sense) is often flakey. 2.4.6, "Headings and labels describe topic or purpose," can serve as an immediate example. It's located under Operable! That's why the focus of discussion is on the normative text of the requirements (which is challenging enough).
I get the sense from a few comments that I'm viewed as some solitary recalcitrant blocker in the way of progress. Yes, I classified this as a response-only issue back when it appeared back in July. I offered a draft response explaining why, and then I stepped aside for over two months and let other participants offer more than 50 comments. There are people who support what you are saying, and there are people disagreeing with you. Now that we have a PR, folks can try to iterate to something that gets traction.
Both the Understanding documents are absent any occurrence of the word "accuracy". As I said before, I think it is a measurement that can garner inter rater reliability, and I've been heavily advocating for its use in WCAG 3. I get that there are notions of 'appropriateness' in the 3.3.2 understanding document. But ultimately, my position is that alterations in the Understanding document must be based on the normative language, and the basis just doesn't seem to be there to me.
One approach, which I assume you won't like, is to add accuracy to 2.4.6. For instance, the final paragraph of 3.3.2's Intent section could read:
While this Success Criterion requires that controls and inputs have labels or instructions, whether or not labels (if used) are accurate, sufficiently clear, or descriptive is covered separately by 2.4.6: Headings and Labels.
2.4.6 does have normative language that can support this addition, if one can be persuaded that "descriptive" includes such a notion. To me it is defensible to point to a sentence like this and argue it includes a notion of accuracy:
Conversely, it is also possible for content to pass Success Criterion 1.3.1 (with headings or labels correctly marked up or identified), while failing this Success Criterion (if those headings or labels are not sufficiently clear or descriptive).
It's very disappointing that changes to the normative component - even the glossary - are forbidden until WCAG 3.0 ... As I have experienced, much of WCAG 2.x has taken on a life of its own in practice, and its letter is given scant attention.
So I wonder whether changing a handful of words in the non-normative text of any success criterion really has any impact given the messaging about the value of the informative components of WCAG 2.x in determining conformance.
And, with this I say adieu to this issue ...
I'm not aware of any SCs that serve no other purpose than to tee up a different SC.
Just because you keep saying 3.3.2 serves no other purpose doesn't make it true.
Then, by all means, Mike, please show me!
I and others have shown what its purpose is and how it can be used with the guidance as it now reads, and why the existence of a label is a critical thing.
The only thing you have demonstrated is how (in your words) “the presence of a label allows a number of other considerations to take place”, namely SCs 2.4.6 and 1.3.1.
Gregg attempted to justify the mere existence of labels as benefiting people with cognitive disabilities, but as I pointed out, the label still needs to be accurate.
3.3.2 can serve a purpose and serve as a prereq to something else. They are not exclusive. Much of the A > AA > AAA structure is constructed on a prereq model.
Again, could you give me an example of an SC that exists purely as a prerequisite for another SC while serving no other purpose?
I assume the WCAG framework of principles and guidelines was created for a purpose and that SC were allocated accordingly. So why wouldn't you want to uphold this?
I believe the POUR model was conceived to make WCAG a neater package. Unfortunately, the allocation of SCs into a poorly normalized POUR model (in the data modelling sense) is often flakey. 2.4.6, "Headings and labels describe topic or purpose," can serve as an immediate example. It's located under Operable! That's why the focus of discussion is on the normative text of the requirements (which is challenging enough).
This is what I find so frustrating. On the one hand, WCAG is this untouchable monolith and we must work carefully within the constraints of its normative text. But on the other hand, we can be all loosey-goosey about what goes where, with principle/guidelines seemingly having no bearing on their associated SCs. The principles and guidelines are just as normative as the success criteria, and should be upheld just as rigorously.
(For the record, Principle 2: Operable states: “User interface components and navigation must be operable.” Within that, Guideline 2.4 Navigable states: “Provide ways to help users navigate, find content, and determine where they are.” And 2.4.6 concerns the provision of descriptive headings and labels to support navigation. Its categorization is not as illogical as you are implying. And I suspect that, given the relative importance of headings when it comes to navigation, labels did not figure as prominently in the formation of this SC. But that’s by the by.)
One approach, which I assume you won't like, is to add accuracy to 2.4.6.
Actually, I have no objection to adding accuracy to 2.4.6 - just not at the expense of adding it to 3.3.2. :smile:
Issue (and related PR) discussed on TF call today. Consensus on call is that this issue is best addressed by updating Understanding 2.4.6. @mbgower volunteered to review and propose something.
Then, by all means, Mike, please show me!
I run across a lack of labels frequently. Here's one I encountered while trying out a tool this week.
Two dropdowns without labels. Without 3.3.2, where would you fail this?
Again, could you give me an example of an SC that exists purely as a prerequisite for another SC while serving no other purpose?
I have now shown you an example of how 3.3.2 does serve a purpose. I have never said there are SCs that exist purely as a prereq and serve no other purpose. Those are your words.
Its categorization is not as illogical as you are implying.
I never stated it was illogical. I said "flakey". I stated the data model was not normalized, so that the categories overlap. Do you not think a strong case could be made to put 2.4.6 under the Understandable principle? It's not difficult to construct a narrative to put numerous SCs into different principles. One ends up with many:many relationships all over the place.
This is what I find so frustrating. On the one hand, WCAG is this untouchable monolith and we must work carefully within the constraints of its normative text. But on the other hand, we can be all loosey-goosey about what goes where, with principle/guidelines seemingly having no bearing on their associated SCs. The principles and guidelines are just as normative as the success criteria, and should be upheld just as rigorously.
You are not alone in being frustrated by the constraints under which anyone trying to improve WCAG 2 has to operate. I said as much in my draft response. But the task force has clear constraints, and my responses have attempted to make those explicit. It's also worth noting that over 200 issues have been addressed in less than a year. There are improvements being made.
In regard to your comments about the principle and guidelines, the reality is that those are not used for testing. (In the words of the technical document, "The guidelines are not testable, but provide the framework and overall objectives to help authors understand the success criteria and better implement the techniques.") The normative text of the success criteria forms the nucleus of conformance statements, and it is the SC text against which the ACT rules are applied. I don't think it's a coincidence that the European model stripped out all the content at the principle and guideline levels, and used POUR strictly as heading structures.
@mbgower Two dropdowns without labels. Without 3.3.2, where would you fail this?
You would satisfy 3.3.2 by providing labels for the two dropdowns, but those labels wouldn't have to identify the dropdowns in a meaningful way. For example, the labels could say: "Please select".
As I have stated so many times in this discussion, merely providing labels (with no concern for accuracy) does not benefit anyone.
It doesn't meet the stated goal that "Users know what information to enter", nor does it meet the stated intent "to have content authors present instructions or labels that identify the controls in a form so that users know what input data is expected."
And yes, I'm sure you would punt this over to 2.4.6 and fail it there instead.
But this demonstrates how 3.3.2 now exists purely as a prerequisite for another SC while serving no other purpose.
I don't think I can restate this any more clearly. And sadly, it's clear from the group's response that it's not going to gain any traction.
But this demonstrates how 3.3.2 now exists purely as a prerequisite for another SC while serving no other purpose.
This seems to come down to perspective. I think we both agree that the current structure is subpar. We seem to agree that the only place to fail a missing label is 3.3.2. But you feel that that serves no purpose, because the quality of the label is judged elsewhere.
I see the potential for similar divisions between existence and quality gaining more traction in WCAG 3, because while it is easy to get people to agree that, say, an image has an ALT text, it is much less easy to get people to agree on what "equivalent" means. Going way back to the start of this discussion, we also agree that accuracy serves as a useful middle point in this, which can more easily fit into a baseline measure: there is an ALT, and it is accurate (i.e., it doesn't say "dog" when it is a cat). Whether it is equivalent, and how that is determined, could be tackled at something above baseline.
It's clear that you and others believe that quality can get put into 3.3.2 as it exists, and that others like me think the SC language doesn't support that. Regardless of which SC it is filed against, we both seem to agree that the existing requirements can cover this; that there is no hole.
Based on discussions at the TF, I am undertaking to work some concepts of quality into Understanding document updates (with 2.4.6 stated as the place to assess and pass/fail). I'll flag you on those so you can review. I'm on vacation as of tomorrow, so it'll be a couple of weeks.
A recently published update to the Understanding document for 3.3.2 Labels or Instructions has made changes that effectively render the success criterion pointless.
In an attempt to disambiguate the slight overlap between 3.3.2 Labels or Instructions and 2.4.6 Headings and Labels, #1792 removed several references to the need for "clear and unambiguous" labels and instructions. The justification was that 2.4.6 concerns the quality of headings and labels, whereas 3.3.2 concerns the mere presence of labels or instructions (regardless of how good/comprehensive/accurate they are). This, I feel, was a misstep.
As others and I commented elsewhere, these changes go against the intent (if not the normative wording) of 3.3.2.
The normative text - "Labels or instructions are provided when content requires user input" - is unhelpfully terse. However, the stated goal that "Users know what information to enter" and the intent "to have content authors present instructions or labels that identify the controls in a form so that users know what input data is expected" is more indicative of the intended purpose of 3.3.2.
By removing any need for clarity or quality or accuracy in 3.3.2, then what is the benefit of conforming to this SC?
(Also, if any requirement for clarity or descriptiveness is punted squarely to 2.4.6, why list G131: Providing descriptive labels as a sufficient technique?)
Picking up from #3795 (comment), 3.3.2 now effectively says that anything goes, as long as it looks like a label or instruction. How does that help users?
Without any requirement for accuracy or correctness...
<label for="tel">XWPSWTLNMZ</label>
<label for="tel">Name</label>
<label for="consent">I agree</label>
Now, you could argue that these examples impact everyone and therefore can be considered "bad UX" and beyond the scope of WCAG. But if that's the case, why have this success criterion at all? Who does the mere provision of labels or instructions actually benefit?
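To make the concern concrete, here is a rough sketch of what a presence-only check looks like when run over labels like the ones above. It is an illustration under stated assumptions, not a real conformance tool: the `<input>` and `<select>` elements and the `region` id are invented for the example, and matching `for` to `id` is only a stand-in for "a label is presented to the user".

```python
from html.parser import HTMLParser


class LabelPresenceChecker(HTMLParser):
    """Collects form-control ids and the ids that <label for> points at."""

    FORM_CONTROLS = {"input", "select", "textarea"}

    def __init__(self):
        super().__init__()
        self.control_ids = []
        self.label_targets = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in self.FORM_CONTROLS and attributes.get("id"):
            self.control_ids.append(attributes["id"])
        elif tag == "label" and attributes.get("for"):
            self.label_targets.add(attributes["for"])


def unlabeled_controls(html):
    """Return ids of controls that no <label for="..."> refers to.

    Controls without an id are ignored in this simplified sketch.
    """
    checker = LabelPresenceChecker()
    checker.feed(html)
    checker.close()
    return [cid for cid in checker.control_ids
            if cid not in checker.label_targets]


# The labels come from the examples above; the controls are hypothetical.
form = """
<label for="tel">XWPSWTLNMZ</label>
<input id="tel" type="tel">
<label for="consent">I agree</label>
<input id="consent" type="checkbox">
<select id="region"></select>
"""
flagged = unlabeled_controls(form)
```

Only the unlabeled `region` control is flagged. The nonsense label `XWPSWTLNMZ` passes untouched, because nothing in a presence check looks at what the label says.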
It would be like saying that 1.1.1 Non-text Content should only be about the presence of a text alternative, not about whether the text alternative accurately describes the content. Admittedly, 1.1.1 has the additional clause of "a text alternative that serves the equivalent purpose" and 1.1.1 also hasn't been split over two closely related success criteria, but that's where we are. In my opinion, 3.3.2 would greatly benefit from a similar qualifying clause, if not in the normative text then in the Understanding document.
At the very least, accuracy should be an explicit expectation of 3.3.2 (in that, the label or instruction must identify/explain what the control does). Whether it does that clearly or sufficiently descriptively is then a matter for 2.4.6 (although I accept that an accurate/correct label will likely be sufficiently descriptive for 2.4.6 in most cases).
In terms of specific actions, I wouldn't necessarily reverse the changes in #1792, but instead make an explicit expectation of accuracy. So, for example:
And perhaps even an additional sentence to emphasize the need for accuracy: