Open GeorgeKerscher opened 1 year ago
Hadrien Gardeur of De Marque gave a presentation on the lack of accessibility metadata at the EDItEUR Supply Chain Conference at the Frankfurt Book Fair. The slides are available to all on the EDItEUR website: https://editeur.org/3/Events/Event-Details/667
The Readium Go toolkit's inferred metadata feature (work in progress) explores this path. We'd be happy to discuss the subject collectively.
I think this is an important issue. I see different organizations moving in that direction, with the risk of diverging interpretations of how to infer metadata. I think joint work will be needed to define high-level guidelines on how to analyze code to extract metadata consistently across implementations.
In terms of UX guidelines, one aspect we will have to consider is whether to tell the end user if a piece of metadata comes from the content creator or from an inference algorithm and, if we do, at what level of granularity.
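To make the granularity question concrete, here is a minimal sketch of tracking provenance per metadata value, which is the finest granularity; a coarser option would be a single flag per publication. The names `Source`, `MetadataValue` and `display_label`, and the "(inferred)" wording, are hypothetical and only for illustration. They do not come from any existing toolkit or guideline.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    """Where a metadata value came from (hypothetical vocabulary)."""
    DECLARED = "declared"  # supplied by the content creator
    INFERRED = "inferred"  # produced by an inference algorithm


@dataclass(frozen=True)
class MetadataValue:
    prop: str       # e.g. "accessibilityFeature"
    value: str      # e.g. "tableOfContents"
    source: Source  # provenance, tracked per value


def display_label(item: MetadataValue) -> str:
    """Build a user-facing label, flagging inferred values only."""
    label = f"{item.prop}: {item.value}"
    if item.source is Source.INFERRED:
        label += " (inferred)"
    return label
```

With this shape, a UI could decide later whether to surface the flag at all, since the provenance is stored either way.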
Thank you for the link to the slides, @gautierchomel. Interesting to see what Readium has seen.
A look at our titles:
As mentioned by @chrisONIX, @gautierchomel and @gregoriopellegrino we've been working on various things over the last 18 months at De Marque:
The data covered in my presentation in Frankfurt comes primarily from trade publishing, which is probably quite a different dataset from what @rickj has on his side.
The logic for our inference rules is entirely open source, but I can summarize it here:
Here's the list of current rules:
textual (on its own and not combined with other values) for accessModeSufficient
tableOfContents for accessibilityFeature
printPageNumbers for accessibilityFeature
MathML for accessibilityFeature
synchronizedAudioText for accessibilityFeature
auditory for accessMode
visual for accessMode
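As a rough illustration of how rules like these might be applied, here is a minimal sketch. It is not the actual De Marque or Readium logic: the `PublicationFacts` fields and every trigger condition below are assumptions made up for the example, and only the output values and the properties they map to come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublicationFacts:
    """Hypothetical summary of what a scan of the publication found."""
    has_toc: bool = False
    has_page_list: bool = False
    has_mathml: bool = False
    has_media_overlays: bool = False
    has_audio: bool = False
    has_visual_content: bool = False  # images or video


def infer_accessibility(facts: PublicationFacts) -> dict:
    """Apply the rules; all trigger conditions are assumed, not sourced."""
    inferred = {
        "accessMode": [],
        "accessModeSufficient": [],
        "accessibilityFeature": [],
    }
    if facts.has_audio:
        inferred["accessMode"].append("auditory")
    if facts.has_visual_content:
        inferred["accessMode"].append("visual")
    # "textual" is only emitted on its own, never combined with other values.
    # Assumed trigger: no audio and no visual content was found.
    if not facts.has_audio and not facts.has_visual_content:
        inferred["accessModeSufficient"].append(["textual"])
    if facts.has_toc:
        inferred["accessibilityFeature"].append("tableOfContents")
    if facts.has_page_list:
        inferred["accessibilityFeature"].append("printPageNumbers")
    if facts.has_mathml:
        inferred["accessibilityFeature"].append("MathML")
    if facts.has_media_overlays:
        inferred["accessibilityFeature"].append("synchronizedAudioText")
    return inferred
```

For example, `infer_accessibility(PublicationFacts(has_toc=True))` would report `tableOfContents` as a feature and `textual` as the only sufficient access mode, under the assumed triggers.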
For the table of contents and page list, this could be refined by:
That said, we've seen EPUBs that had good reasons for only having a smallish table of contents or a partial page list, so it's pretty hard to define a rule that works across all publications.
On the October 26, 2023 call, the question came up of what to do when little or no accessibility metadata is present. Should this group provide guidance on inferring metadata when it is not present? In the past we have said that a reflowable EPUB with a detailed nav doc is normally very accessible. It is also possible to examine the EPUB for accessibility features. So the issue is: what guidance should we provide to a distributor, for example, about adding to their catalogue accessibility metadata that can be inferred by examining the title?