w3c / silver

Accessibility Guidelines "Silver"
https://w3c.github.io/silver/

Visual contrast of text outcome measure #334

Open sdw32 opened 3 years ago

sdw32 commented 3 years ago

From https://www.w3.org/WAI/GL/WCAG3/2020/outcomes/luminance-contrast-between-background-and-text

Outcome: Luminance contrast between background and text

Provides adequate luminance contrast (lightness/darkness difference) between background and text colors in order to read the text easily.

Luminance contrast is not a useful outcome measure in its own right. Extremely large text may be acceptable at extremely low levels of luminance contrast, whereas small text needs much higher levels of luminance contrast.

I would suggest the outcome measure for text should be based on ensuring that text can be read at a maximum speed (readability), specified in a way that can be objectively measured.

Readability measures that I am aware of include the 'Maximum Reading Speed' from the MNREAD acuity chart, and the reading speed from the 'Wilkins rate of reading test' (https://www1.essex.ac.uk/psychology/overlays/rrt%20OC4.htm). Of these, the latter might be more useful, because it uses continuous repeats of 15 random common words, so is not influenced by learning effects and can be repeatedly administered for the same participants.

It might be feasible for a developer to generate paragraphs of text containing these 15 random words, in their chosen style of text (font-name, font-size, font-weight, line spacing, foreground colour, background colour). The guideline could specify a reference text style, and the developer could test the reading speed of a person with vision loss, for both the reference text and their text, and their text passes the guideline if the rate of reading speed for their text is not less than the reference text.
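As a rough illustration of the procedure described above, the sketch below builds a passage by repeating a 15-word vocabulary in shuffled order and compares a measured reading rate against a reference. It is only a sketch under stated assumptions: `PLACEHOLDER_WORDS` is a hypothetical vocabulary (not the word list of the Wilkins test, which is covered by existing IP arrangements), and the function names, the error-corrected words-per-minute scoring, and the "not less than the reference" pass rule are illustrative choices, not a published protocol.

```python
import random

# Hypothetical vocabulary of 15 common words. This is NOT the Wilkins
# word list; it is a stand-in for illustration only.
PLACEHOLDER_WORDS = [
    "time", "house", "water", "day", "man",
    "tree", "book", "hand", "road", "door",
    "food", "sun", "bird", "rain", "fish",
]


def generate_passage(words, n_lines=10, seed=None):
    """Return n_lines of text, each line holding every word once in random order."""
    rng = random.Random(seed)
    lines = []
    for _ in range(n_lines):
        row = list(words)
        rng.shuffle(row)
        lines.append(" ".join(row))
    return "\n".join(lines)


def words_per_minute(words_read, errors, seconds):
    """Correctly read words per minute (assumed scoring rule)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (words_read - errors) * 60.0 / seconds


def passes(candidate_wpm, reference_wpm):
    """Assumed pass rule: the candidate style is read no slower than the reference."""
    return candidate_wpm >= reference_wpm
```

The passage would then be rendered once in the reference style and once in the candidate style, with the same participant timed on both.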

It might even be possible to use a method of simulating impairment as a proxy for running this test with participants who have some degree of vision loss. For example, the guideline could specify that the test is run with a participant who has reasonably good vision, but the viewing distance is manipulated to represent a specified degree of vision loss.
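To make the viewing-distance idea concrete: under the small-angle approximation, the visual angle of a letter is inversely proportional to viewing distance, so multiplying the distance by 10^(difference in logMAR) puts an observer at the same point relative to their acuity threshold as someone with the poorer acuity. The sketch below assumes acuity is expressed in logMAR and models acuity loss only; it says nothing about contrast sensitivity, field loss, or other kinds of impairment. The function names are hypothetical.

```python
def distance_multiplier(target_logmar, observer_logmar=0.0):
    """Factor by which to increase viewing distance so that an observer with
    acuity observer_logmar sees text at the same multiple of their acuity
    threshold as a person with acuity target_logmar (acuity only)."""
    if target_logmar < observer_logmar:
        raise ValueError("target acuity must not be better than the observer's")
    return 10 ** (target_logmar - observer_logmar)


def simulated_viewing_distance_cm(normal_distance_cm, target_logmar,
                                  observer_logmar=0.0):
    """Viewing distance at which the observer approximates the target acuity."""
    return normal_distance_cm * distance_multiplier(target_logmar, observer_logmar)
```

For example, simulating logMAR 1.0 with an observer at logMAR 0.0 means a tenfold increase in viewing distance, which already hints at the practical limits of the approach for larger degrees of loss.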

The Wilkins rate of reading test is covered by existing IP arrangements, but I have been in conversation with Arnold Wilkins, and he is open to the possibility of contributing a version of this test to the WCAG guideline on a royalty free basis. I will look forward to following up on this in due course, in whatever manner would be appropriate for this.

Myndex commented 3 years ago

Hello @sdw32 thank you for commenting.

I am well aware of all of these issues. From what I gather, you may not have read the actual guidelines and specifications, as most of what you stated is described there, or in the whitepaper materials to which you have previously been given links. For the record, here: https://www.w3.org/WAI/GL/task-forces/silver/wiki/Visual_Contrast_of_Text_Subgroup

I believe you are referring to the "single line title" which is intended as the simplest description of the objective goal, but being simple leaves out all of the minutiae. That is part of how the total WCAG 3.0 document is structured, agnostic of content.

The "deepest" discussion was moved out for development into several white papers that are in progress, some of which is covered in the link I provided above. While I do have reservations regarding making some things too simple, in this case, the single line descriptor:

Provides adequate luminance contrast (lightness/darkness difference) between background and text colors in order to read the text easily.

Characterizes the objective in the simplest single line manner possible. You should study WCAG 2.x and all of WCAG 3.0 to see how these structures are applied, and the manner in which context is used for displaying various aspects of the standard. The overall structure is a completely separate discussion, and not related to content. For instance, your first several paragraphs are very clearly illustrated in the guidelines.

Also, the APCA tool directly addresses most of the rest, in an objective and testable manner.

To address some of your other assertions:

Emphasis added:

It might be feasible for a developer to generate paragraphs of text containing these 15 random words, in their chosen style of text (font-name, font-size, font-weight, line spacing, foreground colour, background colour). The guideline could specify a reference text style, and the developer could test the reading speed of a person with vision loss, for both the reference text and their text, and their text passes the guideline if the rate of reading speed for their text is not less than the reference text.

No. This is not objective, not testable in an automated manner, and not "feasible", in that it presupposes that either authors or testers create mockups of the testable content within the narrow definition of the specific reading test, for each and every variation of style (not at all practical in a typical webpage, which has 15 to 40 sizes, styles, colors) and THEN ALSO employ a "person with vision loss". Moreover, it fails because the reading speed of a single test subject with an impairment is not an instructive metric.

The statement "test the reading speed of a person with vision loss" is unsupported. There is no "one type" of "vision loss," and one impairment type's needs may (and often do) interfere with another impairment type's needs. Nothing useful is learned through this approach, nor is it consistent, nor is it feasible from a practical implementation viewpoint.

Reading speed tests

Wilkins is two and a half decades old, suffers from many of the same issues as most reading speed tests, and worse, is not a simulation of actual content. MNREAD is the current widely accepted method, and there are some other newer methods and hybrid methods that are pushing the envelope further.

How we are using reading speed

The extensive work of Dr. Lovie-Kitchin and, separately, Dr. Legge defines important metrics that affect reading speed, including the key concepts of contrast reserve, critical contrast, acuity reserve and critical acuity.

I suggest looking through that research, as it is the current reference standard, and used extensively in the development of WCAG 3.0 visual contrast and SAPC/APCA.
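For readers unfamiliar with the terms, the two "reserve" concepts can be sketched as simple ratios. This assumes the definitions commonly used in the low-vision reading literature (acuity reserve as print size relative to the reader's acuity threshold, contrast reserve as text contrast relative to the reader's contrast threshold); the function names and units are hypothetical, and no recommended reserve values are implied.

```python
def acuity_reserve(print_size, acuity_threshold_size):
    """Ratio of the print size to the smallest size the reader can resolve.
    Both arguments must use the same unit (assumed definition)."""
    if acuity_threshold_size <= 0:
        raise ValueError("acuity_threshold_size must be positive")
    return print_size / acuity_threshold_size


def contrast_reserve(text_contrast, contrast_threshold):
    """Ratio of the text contrast to the lowest contrast the reader can detect.
    Both arguments must use the same contrast measure (assumed definition)."""
    if contrast_threshold <= 0:
        raise ValueError("contrast_threshold must be positive")
    return text_contrast / contrast_threshold
```

A reserve of 1.0 means the text sits exactly at the reader's threshold; larger values mean more headroom for fluent reading.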

Thank you,

Andy

Andrew Somers W3 AGWG Invited Expert Myndex Color Science Researcher https://www.myndex.com/APCA/simple

alastc commented 3 years ago

I think there are two levels here, and I generally agree with Andy's assertions regarding the testability. If aspects of reading speed can be incorporated into the algorithm / tooling, that's going to be the most useful thing to do.

However, WCAG 3.0 allows for holistic tests as well as the more content/atomic testing that has been the focus so far. It might be that readability as tested by participants could be an additional form of testing on top of the atomic / automatic testing.

The approach from @sdw32 may or may not be the way to go, but we should be looking for holistic tests in this area.

bruce-usab commented 3 years ago

FWIW, having followed the maths (as best I can), I would not be in favor of trying to integrate aspects of reading speed directly into the SAPC/APCA scoring. The current SAPC/APCA algorithms are very objective and defensible. From my perspective, the SAPC/APCA value is similar in principle to the WCAG 2.x contrast ratio calculation. Scoring readability and reading speed is much more subjective.
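For reference, the WCAG 2.x contrast ratio mentioned above can be sketched as follows. This follows the relative luminance and contrast ratio definitions in WCAG 2.x for 8-bit sRGB colors; the function names are illustrative. It is not the SAPC/APCA algorithm, and it is shown only to illustrate what a deterministic calculation from two colors looks like.

```python
def _linearize(channel_8bit):
    """Convert an 8-bit sRGB channel to linear light, per the WCAG 2.x formula."""
    c = channel_8bit / 255.0
    if c <= 0.03928:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) tuple with 0-255 channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(rgb_a, rgb_b):
    """WCAG 2.x contrast ratio, from 1 (identical) to 21 (black on white)."""
    lum_a = relative_luminance(rgb_a)
    lum_b = relative_luminance(rgb_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)
```

Note that the inputs are two colors only: font size and weight do not enter the calculation, which is the limitation raised at the start of this thread.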

Myndex commented 3 years ago

Hi Alastair @alastc

I think there are two levels here, and I generally agree with Andy's assertions regarding the testability. If aspects of reading speed can be incorporated into the algorithm / tooling, that's going to be the most useful thing to do.

"Reading speed" is already built in to the APCA contrast metrics, as it references the extensive research of Dr. Lovie-Kitchin and, separately, Dr. Legge, not to mention Dr. Arditi and others. Similarly, this is the basis for the font-size recommendations.

Evaluating reading speed vs font and color metrics is an extensive clinical study under laboratory conditions that is both expensive and time consuming. Reading speed is in no way a practical tool for evaluating web content.

However, WCAG 3.0 allows for holistic tests as well as the more content/atomic testing that has been the focus so far. It might be that readability as tested by participants could be an additional form of testing on top of the atomic / automatic testing.

Yes, and I already included a "Standard Observer Model" in SAPC for making these kinds of evaluations. The problem with reading speed is that it is tied into cognitive and neurological processing. As a result it requires a large sample size. Then add that conducting such a study for evaluating reading speed requires trained observers under laboratory conditions.

Otherwise it is no better than guesswork, and therefore of no practical use.

But as far as other test types, we have already identified important user needs which indicate the kinds of testing or evaluations needed. From Fall 2019:

[Image: Screen Shot 2019-10-01 at 1 24 34 PM]

As you can see, yes, I do believe there can be more and different forms of testing.

"Reading speed" is an attempt to test "readability" listed here in the "comprehensive function tests" row, but fails in practicality and further does not achieve a solution to the user needs because (among other things) user needs for readability are frequently in direct conflict on a per user basis.

What IS useful is to identify predictors of reading speed, tracing back into well regarded readability research. This is what APCA does in part.

The SAPC standard observer model is part of the foundation for other holistic or evaluatory testing methods — but it should be clear that these are subjective in nature, and therefore need to be coupled with some objective testing form.

Thank you

Andy

Myndex commented 3 years ago

Hi Bruce @bruce-usab

FWIW, having followed the maths (as best I can), I would not be in favor of trying to integrate aspects of reading speed directly into the SAPC/APCA scoring. The current SAPC/APCA algorithms are very objective and defensible. From my perspective, the SAPC/APCA value is similar in principle to the WCAG 2.x contrast ratio calculation. Scoring readability and reading speed is much more subjective.

Exactly. And even with large sample sizes, what do we learn? Are we achieving the user needs? Can we even test a complete site? Such a testing regime could cost more than the site's design — and still not achieve any real benefit.

As I've stated repeatedly: With the contrast guidance now fixed, our next step is toward enabling user-centric customization, and doing so in a practical and consistent manner.

Thank you

Andy

ChrisLoiselle commented 3 years ago

@jspellman This is ready for survey. The comments within the thread talk to more specific tests which, while related to Andy's work, do not impact our current WCAG 3.0 work. Action items would be for Silver to follow up on the related tests mentioned in the thread as they relate to readability and reading speed. Future work could be reviewed on the atomic aspects of these subjective and objective tests.

ChrisLoiselle commented 3 years ago

DRAFT RESPONSE: The comments within the thread talk to more specific tests which, while related to Andy's work, do not impact our current WCAG 3.0 work.

Action items would be for Silver to follow up on the related tests mentioned in the thread as they relate to readability and reading speed. Future work could be reviewed on the atomic aspects of these subjective and objective tests.

I hope this answers your question. If it does not, please feel free to follow up.

Thank you, Chris

sdw32 commented 3 years ago

Thread #339 now essentially covers the same topic as this one, so perhaps this could be merged.