w3c / a11y-discov-vocab

Repository for the maintenance of the schema.org accessibility property values for discoverability.
https://www.w3.org/community/a11y-discov-vocab/

taggedPDF definition #40

Closed madeleinerothberg closed 2 years ago

madeleinerothberg commented 2 years ago

Currently defined as: <The structures in a PDF have been tagged to improve the navigation of the content.>

This is a bit misleading. An untagged PDF presents no content to a screen reader. It isn't just inadequate navigation/structure.

Proposed change: <The contents of a PDF have been tagged to permit screen reader access.>

It could be more general, such as "AT access" or "to ensure accessibility". Other AT that might need it would be read-aloud tools, for example.

mattgarrish commented 2 years ago

> An untagged PDF presents no content to a screen reader.

It should still give you access, no? There's just no telling how the content will be read out or the structures interpreted. It all depends on how the text is laid out and how the PDF viewer tries to read that into a structure on its own.

The focus on navigation is a bit misleading, though, as this gets down to the logical reading order potentially being a mess when the tagging has to be inferred.

madeleinerothberg commented 2 years ago

An untagged PDF has literally no readable text for AT.

Ok, you can run the tagging service inside Acrobat to do your own conversion, and then, yes, it's a question of reading order quality, navigation quality, image and heading tags possibly wrong, etc. But if you are using Preview or some other PDF reading tool, you won't have that option and you won't have any content.
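
For anyone who wants to check a file directly, here is a minimal sketch of a heuristic for spotting the markers a tagged PDF normally carries: a `/MarkInfo` dictionary with `/Marked true` and a `/StructTreeRoot` entry in the document catalog. The function name `looks_tagged` is hypothetical. Scanning raw bytes is an assumption that only holds when the catalog is stored uncompressed and `/MarkInfo` is written inline, so a proper PDF parser would be needed for a reliable answer.

```python
import re


def looks_tagged(pdf_bytes: bytes) -> bool:
    """Heuristic only: report whether the raw bytes contain tagging markers.

    Assumes the catalog is not inside a compressed object stream and that
    /MarkInfo is an inline dictionary rather than an indirect reference.
    """
    # /MarkInfo << ... /Marked true ... >> signals that the file claims to be tagged.
    marked = re.search(rb"/MarkInfo\s*<<[^>]*/Marked\s+true", pdf_bytes) is not None
    # /StructTreeRoot points at the structure tree that holds the tags.
    has_struct_tree = b"/StructTreeRoot" in pdf_bytes
    return marked and has_struct_tree
```

A `True` result only says the markers are present. It says nothing about the quality of the reading order, headings, or image descriptions.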

GeorgeKerscher commented 2 years ago

There is metadata available if the PDF conforms to PDF/UA, but one will be hard-pressed to find many PDFs with this metadata. However, it should be mentioned. Sorry if this is already there.
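
Assuming the metadata meant here is the PDF/UA identifier that conforming files declare in their XMP packet (the `pdfuaid:part` property), a rough sketch of looking for it could be the following. The function name `pdfua_part` is hypothetical, and the byte scan assumes the XMP metadata stream is stored uncompressed, which is common but not guaranteed.

```python
import re
from typing import Optional


def pdfua_part(pdf_bytes: bytes) -> Optional[int]:
    """Return the declared PDF/UA part number, or None if no identifier is found.

    Assumes the XMP packet is uncompressed. Handles both the element form
    (<pdfuaid:part>1</pdfuaid:part>) and the attribute form (pdfuaid:part="1").
    """
    match = re.search(rb"pdfuaid:part(?:>|=[\"'])\s*(\d+)", pdf_bytes)
    return int(match.group(1)) if match else None
```

The identifier is only a claim of conformance made by the file. It does not by itself verify that the tagging is correct.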