Fyrd / caniuse

Raw browser/feature support data from caniuse.com
https://caniuse.com
Creative Commons Attribution 4.0 International

Accessibility of in-browser PDF Viewer #6179

Open brennanyoung opened 2 years ago

brennanyoung commented 2 years ago

I welcome the appearance of data on in-browser PDF Viewers, resulting from #2212

Different browsers handle PDF in different ways, and the various PDF viewers have different capabilities, which can affect accessibility conformance.

AFAIK, only pdf.js (used by Mozilla) makes an effort to communicate PDF content to the accessibility tree (i.e. the representation derived from the DOM that is communicated to assistive tech). It translates PDF/UA tags into ARIA roles and attributes.
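To make the idea of a tag-to-role translation concrete, here is a rough sketch. The table and the function name `aria_for` are hypothetical and only illustrate the general shape of such a mapping; this is not pdf.js's actual code or its real mapping.

```python
# Illustrative only: a hypothetical lookup table, NOT the mapping pdf.js uses.
# Keys are PDF standard structure tags; values are (ARIA role, extra attributes).
PDF_TAG_TO_ARIA = {
    "H1": ("heading", {"aria-level": "1"}),
    "H2": ("heading", {"aria-level": "2"}),
    "L": ("list", {}),
    "LI": ("listitem", {}),
    "Table": ("table", {}),
    "TR": ("row", {}),
    "TD": ("cell", {}),
    "Link": ("link", {}),
    "Figure": ("figure", {}),
}


def aria_for(tag):
    """Return (role, attributes) for a PDF structure tag.

    Tags with no entry (e.g. NonStruct) get no role, which is roughly
    what "generic" semantics would mean in this sketch.
    """
    return PDF_TAG_TO_ARIA.get(tag, (None, {}))


print(aria_for("H2"))
print(aria_for("NonStruct"))
```

A viewer that does no such translation leaves assistive tech with text but no structure, which is the gap described above.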

If people are deciding to use or not use PDF-in-browser on the basis of caniuse data, I believe they should be informed of different levels of accessibility support.

LifeIsStrange commented 2 years ago

Note that a website can embed PDF.js, and it works in Chromium browsers since it is an HTML5 renderer. However, most users probably prefer their browser's default experience.

brennanyoung commented 2 years ago

@LifeIsStrange That's exactly the kind of information that would be useful to see on caniuse.

BTW, pdf.js relies on aria-owns to construct an accessibility tree. Quite a cunning solution, except that aria-owns is not supported at all in Safari.

Given that Acrobat and Preview also fail to generate such a tree, this means that at time of writing there are no PDF viewers that run on any of Apple's platforms (inside or out of browser) which communicate the tree to the system level accessibility API.

This has an impact on the de facto portability of this nominally portable file format.

Malvoz commented 2 years ago

ref: https://github.com/accessibilitysupported/a11ysupport.io/issues/222

brennanyoung commented 1 year ago

Just an FYI: if you open a semantically well-formed HTML5 document in Chrome and print to PDF (using the default mechanism for this), you will get a nearly semantically well-formed PDF/UA. Headings and lists are getting tagged correctly, at least.

However, there are still some issues; I reported several on the Chromium bug database yesterday. Lots of bogus <NonStruct> tags are getting generated, which are relatively harmless (similar to role="generic").

Unfortunately, several meaningful semantics such as article and section are also getting mapped to <NonStruct>, even though PDF/UA has <Art> and <Sect> available.

Creating PDF is not really the bread-and-butter of caniuse, but this is a REALLY good development. It means that Chrome is a viable authoring tool for accessible PDF, which plays well with (e.g.) Acrobat and NVDA.

However, the default PDF view in Chrome does not seem to generate a proper accessibility tree at time of writing.

LifeIsStrange commented 1 year ago

@jensimmons friendly ping

brennanyoung commented 1 year ago

Update - I'm having some success with semantic browsing in Preview and VoiceOver! Not sure what has changed or when. (The PDF document used matters a great deal, of course). I haven't seen any announcements from Apple about this feature. Very obvious that things behave differently to the web, but at least there is a minimal implementation. I hope it will be fleshed out.

brennanyoung commented 1 year ago

Sketching out a test profile for consumption (not authoring).

This will not be exhaustive, but it will get us moving. I'm using the nomenclature as it appears in Acrobat, or in the Tagged PDF Best Practice Guide. I've broken these into categories, but the breakdown is open to adjustment. I imagine one test PDF per category, or something like that. (Please advise on the wisdom of this, or offer any suggestions for further enrichment/value).

I imagine each of these as pass/fail. A "pass" is if the AT announces the element and (if non-generic) the role. For tree exclusions, a "pass" is if the AT does not announce the content.

I expect that we will need to document/express partial pass (with remarks) in some cases, but we'll cross that bridge later.
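As a rough sketch of the pass/fail rule described above, assuming a tester records a few yes/no observations per tagged element. The `Observation` fields and the `passes` function are hypothetical names for illustration, and partial passes are deliberately not modelled here.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """What a tester recorded for one tagged element (hypothetical fields)."""
    announced: bool         # did the AT announce the element's content?
    role_announced: bool    # did the AT announce the element's role?
    generic: bool = False   # generic semantics (e.g. Div, Span): no role expected
    excluded: bool = False  # content that should be excluded from the tree


def passes(obs: Observation) -> bool:
    """Apply the pass/fail rule: announce the element, plus the role if non-generic.

    For tree exclusions the rule flips: a pass means the AT stays silent.
    """
    if obs.excluded:
        return not obs.announced
    if not obs.announced:
        return False
    return obs.generic or obs.role_announced


# A heading announced with its role passes; one announced without a role does not.
print(passes(Observation(announced=True, role_announced=True)))
print(passes(Observation(announced=True, role_announced=False)))
```

Keeping each check to booleans makes results easy to tabulate per browser and AT combination; remarks for partial passes could be added as a free-text field later.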

Essential metadata (level A)

note: as in HTML, the Lang attribute may be applied to almost any other tag, including those with generic semantics such as Div and Span, so we should test for SC 3.1.2 "Language of Parts" too, especially with a mixed-language document. A "pass" here would be (e.g.) for a screen reader's speech synth to use the correct phonemes (if available). Not sure if there are similar criteria that could be used for (e.g.) Braille devices. Advice welcome.

Basic Block Level Semantics

Required for WCAG SC 4.1.2: Name, Role, Value

Inline semantics

Required for WCAG SC 1.3.1: Info and Relationships and SC 4.1.2: Name, Role, Value.

Links, References and Annotations

Required for WCAG SC 1.3.1: Info and Relationships and SC 4.1.2: Name, Role, Value.

Structural Semantics

Required for WCAG SC 1.3.1: Info and Relationships and SC 4.1.2: Name, Role, Value.

Text Alternatives

Required for WCAG SC 1.1.1: Non-text Content.

Exclusions from the Accessibility Tree

To be considered/explained/understood before testing