Open llemeurfr opened 5 years ago
I think that `accessibilitySummary` should either be a string or a localized string rather than an array of strings.

`accessModeSufficient` needs to be expressed as an array or array of strings (🙄).
`accessModeSufficient` → this one is even mega super confusing as an author.
Had to use it a few weeks ago, in my very last e-production gig and I was like “WTF‽”
Quite frankly, I hope that they redesign it at some point. Usage makes it even more difficult to understand what the definition is in the first place. 😫
Review of @JayPanoz's current draft: https://github.com/JayPanoz/architecture/blob/a11y-metadata-parsing/streamer/parser/a11y-metadata-parsing.md

* "`meta` elements whose `property` attribute has the value ..." => not just EPUB3 `meta` + `property`, but also EPUB2 `meta` + `name`
* `a11y:certifierCredential` is a `meta` + `name` in EPUB2, but in EPUB3 can be `meta` + `property`, or alternatively `link` + `property` (in which case the value is expected to be a URL)
* `a11y:certifierReport` is a `meta` + `name` in EPUB2, but in EPUB3 it cannot be `meta` + `property`, it must be a `link` + `property` (the value must be a URL)
* `dcterms:conformsTo`: not `link` + `rel`, but `link` + `property` (in EPUB3), or `meta` + `name` in EPUB2.
* `schema:accessibilityFeature` is actually open-ended, due to the possible `displayTransformability` suffixes which map to CSS rules (typically: `/font-size`, `/font-family`, `/line-height`, `/word-spacing`, `/letter-spacing`, `/color`, `/background-color`, etc.). Also, note the missing `highContrastAudio` suffixes (`/noBackground`, `/reducedBackground` and `/switchableBackground`)
* `dcterms:conformsTo` is strictly speaking an open-ended choice of arbitrary URLs, but it is likely one of: http://www.idpf.org/epub/a11y/accessibility-20170105.html#wcag-a, http://www.idpf.org/epub/a11y/accessibility-20170105.html#wcag-aa, http://www.idpf.org/epub/a11y/accessibility-20170105.html#wcag-aaa
* `schema:accessMode`, `schema:accessibilityFeature`, `schema:accessibilityHazard` and `schema:accessibilitySummary` are "required" properties (in terms of validation against the a11y conformance rules)
* It does not really make sense for `schema:accessibilitySummary` to repeat, yet the specification isn't clear about that, so there can potentially be several properties with that name/property in the EPUB package `*.opf` XML (a bit like `dc:title`). I think the R2 model should store them all, and it is the responsibility of the processor / consumer to figure out what to do with them (e.g. a reading system can display the first one only, or a concatenation). The clearly repeatable properties are: `schema:accessMode`, `schema:accessibilityFeature`, `schema:accessibilityHazard`, `schema:accessModeSufficient` (the only one which allows comma-separated values from the enumerated list of tokens), `schema:accessibilityAPI` (although currently likely just `ARIA`), and `schema:accessibilityControl`. I guess it makes sense for these to be repeatable as well: `dcterms:conformsTo`, `a11y:certifiedBy`, and `a11y:certifierCredential`, but it would seem that `a11y:certifierReport` should be unique ... but then again, the R2 models should be ready for the possibility of several occurrences, I think.
* **To be debated**: `schema:accessModeSufficient` can be repeated, and each occurrence is itself a comma-separated list of tokens from the enumeration. The current draft proposes to store these individual values as an array of tokens, rather than as the original linearized string. I am not so sure about this approach (I speak based on my own experience having implemented an editor for accessibility metadata), I think we should just naively preserve the original string value, with all its potential "weirdness" (e.g. insignificant whitespace, or lack thereof, between tokens and comma separators, token ordering, duplicates, etc.)

References:
Note that `r2-shared-js` implements the above (nothing fancy, just boring repetitive parsing code), with careful handling of EPUB 2 `name` + `content` versus EPUB 3 `property` metadata, and of course special handling of metadata `link` + `property` for `dcterms:conformsTo`, `a11y:certifierReport` and optionally `a11y:certifierCredential`.
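To make the EPUB 2 versus EPUB 3 distinction concrete, here is a minimal sketch (not the actual `r2-shared-js` code). It assumes the OPF has already been parsed into plain objects, where a hypothetical `{ name, content }` shape stands for an EPUB 2 `meta` and `{ property, text }` for an EPUB 3 `meta`. The function name `collectA11yMetadata` is made up for illustration. Every occurrence is stored, and the authored string is kept untouched.

```javascript
// Sketch only: element shapes and the function name are assumptions,
// not the r2-shared-js API.
function collectA11yMetadata(metaElements) {
  // Null-prototype object so that metadata keys never collide with
  // inherited properties such as "constructor".
  const collected = Object.create(null);
  for (const el of metaElements) {
    // EPUB 3: <meta property="...">text</meta>
    // EPUB 2: <meta name="..." content="..."/>
    const isEpub3 = typeof el.property === "string";
    const key = isEpub3 ? el.property : el.name;
    const value = isEpub3 ? el.text : el.content;
    if (typeof key !== "string" || typeof value !== "string") {
      continue; // not a usable key/value pair
    }
    if (!collected[key]) {
      collected[key] = [];
    }
    // Keep every occurrence, and keep the raw authored string as-is.
    collected[key].push(value);
  }
  return collected;
}

// Mixed shapes in one list purely for demonstration.
const metadata = collectA11yMetadata([
  { name: "schema:accessMode", content: "textual" },
  { property: "schema:accessMode", text: "visual" },
  { property: "schema:accessModeSufficient", text: " visual , textual " },
  { property: "schema:accessibilitySummary", text: "First summary." },
  { property: "schema:accessibilitySummary", text: "Second summary." },
]);
console.log(JSON.stringify(metadata, null, 2));
```

Leaving the decision about repeated summaries (first one only, or a concatenation) to the consumer keeps the model itself lossless.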
Code references:
Side note: I do not know what the W3C webpub `accessibility-report` is, in relation to the specs linked above.
> * **To be debated**: `schema:accessModeSufficient` can be repeated, and each occurrence is itself a comma-separated list of tokens from the enumeration. The current draft proposes to store these individual values as an array of tokens, rather than as the original linearized string. I am not so sure about this approach (I speak based on my own experience having implemented an editor for accessibility metadata), I think we should just naively preserve the original string value, with all its potential "weirdness" (e.g. insignificant whitespaces - or lack thereof - between tokens and comma separators, etc.)
Note that the W3C draft spec. breaks down individual tokens in the linearized comma-separated enumeration for the `accessModeSufficient` property:

https://www.w3.org/TR/pub-manifest/#accessibility

https://www.w3.org/TR/pub-manifest/#webidl-wpm
```json
{
    …
    "accessMode" : ["textual", "visual"],
    "accessibilityFeature" : ["alternativeText", "longDescription"],
    "accessModeSufficient" : [
        {
            "type" : "ItemList",
            "itemListElement" : ["textual", "visual"]
        },
        {
            "type" : "ItemList",
            "itemListElement" : ["textual"]
        }
    ],
    …
}
```
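For illustration, going from an array of token arrays to the `ItemList` objects in the JSON above could look like the following sketch. The function name `toItemLists` is hypothetical, and the input is assumed to be already split into tokens.

```javascript
// Sketch only: wraps each array of tokens in an ItemList object,
// mirroring the shape of the JSON example above.
function toItemLists(accessModeSufficient) {
  return accessModeSufficient.map((tokens) => ({
    type: "ItemList",
    itemListElement: [...tokens], // copy, so the input is not shared
  }));
}

const itemLists = toItemLists([["textual", "visual"], ["textual"]]);
console.log(JSON.stringify(itemLists, null, 2));
```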
> The current draft proposes to store these individual values (`schema:accessModeSufficient`) as an array of tokens, rather than as the original linearized string. I am not so sure about this approach ...
So, in `r2-shared-js` I added a convenient utility helper function to decompose and normalize the original/authored `AccessModeSufficient` string (i.e. raw linearized comma-separated value, when parsed from EPUB) into a canonical "array-of-(array-of-(string))" form, with removed insignificant whitespace, eliminated duplicates, and preserved order (the duplicates are removed on the trailing edge of the matching iteration).
Unfortunately due to a limitation in the declarative JSON (de)serialization library used for the R2 models, I was not able to directly implement array-of-array (array-of-object works fine, we use it a lot, but because of how prototypal class inheritance works in Javascript, array-of-array seems a no-go) ... thus the convenient, but separate helper.
Thorium / readium-desktop will invoke this utility function as needed, in order to present the accessibility metadata as per the standard UX guidelines:

https://w3c.github.io/publ-a11y/UX-Guide-Metadata/techniques/schema-org.html
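PS JavaScript code:

```javascript
// AccessModeSufficient: array of raw, authored comma-separated strings.
const AccessModeSufficientParsed = AccessModeSufficient.map((ams) =>
    ams.split(",")
        .map((token) => token.trim())
        .filter((token) => token.length)
        // drop duplicates, keeping the first occurrence
        .reduce((pv, cv) => pv.includes(cv) ? pv : pv.concat(cv), []))
    // drop occurrences that ended up empty
    .filter((arr) => arr.length);
```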
Example input/output:
["", " visual , textual ,, visual ", "auditory, auditory,,"]
=>
[["visual","textual"],["auditory"]]
Aside from purely parsing and representing these metadata, I think that the real question remains: what can we actually use them for?
IMO the community around EPUB has failed so far to build compelling use cases for how these various properties can be leveraged.
I'd rather have less metadata and know what to actually make of them.
> the real question remains: what can we actually use them for?

https://w3c.github.io/publ-a11y/UX-Guide-Metadata/techniques/schema-org.html
@danielweck thanks for the review.
I must admit that I wasn’t particularly confident/comfortable with this draft, as accessibility metadata in EPUB isn’t necessarily my forte. It also started as an external contribution in Blitz, whose default was modified later, since having everything by default instead of a reasonable subset might well have produced unreliable a11y metadata. So I’m indeed expecting quite a lot of massive changes to this draft.
> the real question remains: what can we actually use them for?
>
> https://w3c.github.io/publ-a11y/UX-Guide-Metadata/techniques/schema-org.html

Sure that's better than nothing, but beyond displaying these metadata, how can we truly leverage them?
> Sure that's better than nothing,

translate: this is already great :-)

> beyond displaying these metadata, how can we truly leverage them?

Using them (I mean the mapped information, e.g. "Screen reader friendly") as filters in reading app bookshelves is the next step.
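As a rough sketch of what such a bookshelf filter might look like: the publication shape (`{ title, accessModeSufficient }`) and both function names are hypothetical. The rule used here, that a book counts as "Screen reader friendly" when at least one sufficient set is `textual` alone, is an assumption made for illustration and should be checked against the UX guide.

```javascript
// Sketch only: shapes, names, and the mapping rule are assumptions.
// accessModeSufficient is expected as an array of token arrays.
function isScreenReaderFriendly(publication) {
  const sets = publication.accessModeSufficient || [];
  return sets.some((set) => set.length === 1 && set[0] === "textual");
}

function filterBookshelf(publications, predicate) {
  return publications.filter(predicate);
}

const shelf = [
  { title: "A", accessModeSufficient: [["textual", "visual"], ["textual"]] },
  { title: "B", accessModeSufficient: [["visual"]] },
  { title: "C" }, // no accessibility metadata at all
];

const friendlyTitles = filterBookshelf(shelf, isScreenReaderFriendly)
  .map((publication) => publication.title);
console.log(friendlyTitles);
```

Publications without metadata are simply left out of the filtered view here. A real app would probably want to distinguish "unknown" from "not friendly".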
> Unfortunately due to a limitation in the declarative JSON (de)serialization library used for the R2 models, I was not able to directly implement array-of-array (array-of-object works fine, we use it a lot, but because of how prototypal class inheritance works in Javascript, array-of-array seems a no-go) ... thus the convenient, but separate helper.

This is now fixed properly, so that the JSON syntax is optimal without the need for a helper function.
Another point of interest, cross-walk project (EPUB, Schema.org and ONIX): http://www.a11ymetadata.org/the-specification/metadata-crosswalk/ https://docs.google.com/spreadsheets/d/e/2PACX-1vTBWK6YwcDNYQTjE5dodNsMaIqRDUWu9SLsNwiaAZIrGn3BKa7iVlnTM6Nw5aU_qFKMUBcThEXlQAds/pubhtml
Summary of various useful references thus far:

* http://kb.daisy.org/publishing/docs/metadata/schema-org.html
* http://kb.daisy.org/publishing/docs/metadata/evaluation.html
* https://www.w3.org/wiki/WebSchemas/Accessibility
* https://www.w3.org/TR/pub-manifest/#accessibility
* https://w3c.github.io/publ-a11y/UX-Guide-Metadata/techniques/schema-org.html
PS: I am not sure about the `accessibility-report` link, which seems close to `a11y:certifierReport`?

https://www.w3.org/TR/pub-manifest/#accessibility-report
During the 24/04/2019 call, the discussion led to:
https://w3c.github.io/publ-a11y/UX-Guide-Metadata/principles/
Can we agree this is the way to go?