w3c / publ-a11y

Accessibility related discussions of the Publishing@W3C Groups

What to display when there is a conflict between ONIX and Schema.org accessibility metadata #189

Open GeorgeKerscher opened 1 year ago

GeorgeKerscher commented 1 year ago

On the October 26, 2023 call, the question came up of what an implementation should display when there is a conflict between the ONIX and Schema.org accessibility metadata present inside the EPUB. This assumes that the system is ingesting the EPUB and has access to the accessibility metadata inside it. This seems like a problem that is sure to come up, and we should provide guidance.

gregoriopellegrino commented 1 year ago

In my opinion this is not so much a problem related to showing metadata, but a problem related to ingestion of files and metadata and should be solved/controlled at that stage of the process. From the experience of Fondazione LIA the problems that can happen with accessibility metadata are two:

Just as an EPUB file will not be distributed if it does not pass EPUBCheck without errors, we think stringent control over accessibility metadata should be in place to address inconsistencies upstream.

We are working on this at Fondazione LIA.

clapierre commented 1 year ago

Only human QA can ultimately determine which metadata is accurate. Some automation could help determine whether a feature exists and which metadata to believe, the OPF's schema.org metadata or the ONIX metadata, but this is very experimental at this point.

Only the accessibilityHazards values include an explicit negative form and an explicit unknown form for each hazard (e.g. sound, noSoundHazard, unknownSoundHazard). We don't have this for the various accessibilityFeatures, so the issue will be whether the metadata exists or not, and what to believe in each case.
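To make the asymmetry concrete, here is a minimal Python sketch. It assumes the hazard and feature values have already been read out of the package into plain sets of strings; the names HazardState, sound_hazard_state and feature_state are hypothetical and only illustrate the point.

```python
from enum import Enum


class HazardState(Enum):
    PRESENT = "present"
    ABSENT = "absent"
    UNKNOWN = "unknown"
    NOT_STATED = "not stated"


def sound_hazard_state(hazards: set[str]) -> HazardState:
    """Hazards carry explicit positive, negative, and unknown values."""
    if "sound" in hazards:
        return HazardState.PRESENT
    if "noSoundHazard" in hazards:
        return HazardState.ABSENT
    if "unknownSoundHazard" in hazards:
        return HazardState.UNKNOWN
    return HazardState.NOT_STATED


def feature_state(features: set[str], feature: str) -> str:
    """Features have no negative value: absence only means 'not stated'."""
    return "declared" if feature in features else "not stated"
```

For a hazard, a missing value and an explicit "none" can be told apart. For a feature such as MathML, a missing value could mean "not present" or simply "not recorded", which is why a feature declared in one source and absent from the other is hard to adjudicate automatically.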

Complicating things, there is not yet a complete 1:1 mapping between schema.org and ONIX, although a number of new values have just been added to ONIX and our crosswalk is being updated accordingly.

I can think of a number of scenarios where the OPF metadata should be used as the authority, as publishers haven't yet upgraded their ONIX pipeline with the newly emerging additional metadata. Likewise, I could see (and have seen first hand) a publisher copying and pasting incorrect metadata into their EPUBs (e.g. MathML when there is no math within the EPUB).

Also, I believe ONIX updates happen more frequently, as it's easier to send a new ONIX update with improved metadata, which can potentially fix issues with the OPF's version, than to issue a new version of the EPUB. I will defer to publishers on this point.

I doubt we can pick one; the solution is somewhere in the middle. Do we first take everything the OPF says in its accessibility metadata, as this is currently the most complete, then flag the portions that conflict with what is in the ONIX feed and display both to the user, since we can't resolve which version is authoritative?
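A rough sketch of that "show both, flag the conflicts" idea, in Python. It assumes the ONIX values have already been translated into schema.org feature names through the crosswalk, and that a set of names with an ONIX equivalent (comparable) is available. DisplayedFeature, merge_for_display and hypotheticalNewFeature are made-up names for illustration, not part of any specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DisplayedFeature:
    name: str
    in_opf: bool
    in_onix: bool
    comparable: bool  # False when the crosswalk has no ONIX equivalent

    @property
    def conflict(self) -> bool:
        # Only flag a disagreement when both sources could have stated it.
        return self.comparable and self.in_opf != self.in_onix


def merge_for_display(
    opf: set[str], onix: set[str], comparable: set[str]
) -> list[DisplayedFeature]:
    """List every claim from either source, marking where each was found."""
    return [
        DisplayedFeature(name, name in opf, name in onix, name in comparable)
        for name in sorted(opf | onix)
    ]


if __name__ == "__main__":
    # Illustrative data only; "hypotheticalNewFeature" stands in for a value
    # that has no ONIX counterpart yet.
    opf = {"MathML", "alternativeText", "hypotheticalNewFeature"}
    onix = {"alternativeText"}
    comparable = {"MathML", "alternativeText"}
    for row in merge_for_display(opf, onix, comparable):
        flag = "CONFLICT" if row.conflict else "ok"
        print(f"{row.name}: OPF={row.in_opf} ONIX={row.in_onix} [{flag}]")
```

Nothing here decides which source is right. A value missing from ONIX only because the mapping is incomplete is not flagged, and a real disagreement is surfaced for the user or a human reviewer to judge.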

rickj commented 1 year ago

I don't believe we can come up with an automated method to make this determination. I think our answer has to be in the User Experience Guidelines. Currently we accept metadata from:

We currently display them all through tabs at the top of the detail window and let the end user judge the validity of the claims. (I realize this detail window will need to change with the new recommendations.)

We are considering how to implement a differential check, and report back to the publisher that they may have an issue they want to clean up.
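One possible shape for such a differential check, as a minimal Python sketch. It assumes each source's claims have been normalized to a common vocabulary beforehand; the source labels and the function name differential_report are placeholders, not a description of our actual pipeline.

```python
def differential_report(claims_by_source: dict[str, set[str]]) -> list[str]:
    """Return one line per claim that is not stated by every source."""
    all_claims = set().union(*claims_by_source.values())
    lines = []
    for claim in sorted(all_claims):
        present = sorted(s for s, c in claims_by_source.items() if claim in c)
        missing = sorted(s for s in claims_by_source if s not in present)
        if missing:
            lines.append(
                f"{claim}: stated by {', '.join(present)}; "
                f"not stated by {', '.join(missing)}"
            )
    return lines


if __name__ == "__main__":
    # Hypothetical source labels and values, for illustration only.
    report = differential_report(
        {"opf": {"MathML", "alternativeText"}, "onix": {"alternativeText"}}
    )
    print("\n".join(report) or "No differences found.")
```

An empty report means the sources agree. Anything else is a prompt for the publisher to look, not a verdict on which source is correct.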

[Screenshot 2023-11-01 at 7 56 29 PM]

rickj commented 1 year ago

Our current thoughts on how to incorporate the new guidelines (and maintain the three tabs)

[Screenshot 2023-11-01 at 8 03 13 PM]

gautierchomel commented 1 year ago

Related to issue #191.

Few actors cross-check both metadata sources, and the topic may often be tied to private contracts.