daisy / pipeline-modules

Modules for the DAISY Pipeline project

DAISY Ace reporting duplicate metadata after ePub3 Enhancement #81

Open GrayWolfMT opened 5 months ago

GrayWolfMT commented 5 months ago

While producing the CSUN Conference program for 2024 with the 1.14.17-p1 release of the Pipeline app via the dp2 command line, I found that Ace reported an issue with the schema:accessibilityHazard metadata item. Ace reports an "invalid value" of "none | none"; when I look at the produced content, that is because the metadata property is duplicated.

To produce this content, I started with clean XHTML that includes the accessibility metadata needed for the ePub. I then used the html-to-epub3 script to transform it into a text-only ePub, which correctly included the metadata item. After that, I used epub3-to-epub3 to add the TTS content to the original to get an audio ePub. When checking the output with Ace, the text-only ePub is fine, but the audio ePub has the duplicated meta item in the package.opf file.
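For anyone who wants to reproduce the check without opening the archive by hand, here is a minimal Python sketch that pulls the package document out of an EPUB and lists every meta element for a given key. It is not part of Pipeline or Ace; the helper names `read_package_document` and `find_metadata` are made up for this illustration, and the only thing assumed about the file is the standard EPUB container layout (META-INF/container.xml pointing at the OPF).

```python
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"
OPF_NS = "http://www.idpf.org/2007/opf"


def read_package_document(epub):
    """Return the package document (OPF) of an EPUB as bytes.

    `epub` is a path or a file-like object. The OPF location is taken
    from the rootfile entry in META-INF/container.xml.
    """
    with zipfile.ZipFile(epub) as zf:
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find(f".//{{{CONTAINER_NS}}}rootfile")
        return zf.read(rootfile.get("full-path"))


def find_metadata(opf_bytes, key):
    """Return every <meta> whose `property` or `name` attribute equals `key`."""
    root = ET.fromstring(opf_bytes)
    hits = []
    for meta in root.iter(f"{{{OPF_NS}}}meta"):
        if key in (meta.get("property"), meta.get("name")):
            # Keep the attributes plus the element text so both forms are visible.
            hits.append(dict(meta.attrib, text=(meta.text or "").strip()))
    return hits
```

Calling `find_metadata(read_package_document("audio.epub"), "schema:accessibilityHazard")` on the text-only and the audio output (the file name here is a placeholder) should show whether one or two entries are present in each.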

I'm attaching screenshots of the package file from the text-only and audio ePubs, along with the HTML source, text ePub, audio ePub, and Ace report from the audio ePub for reference.

TextEpub-HazardMeta EnhancedEpub-HazardMeta

General-SourceHTML.zip General-SourceEpub.zip

General-Audioepub.zip General-report.zip

bertfrees commented 5 months ago

@GrayWolfMT OK, now I understand better. The metadata item is not really duplicated, in the sense that there are not two identical meta elements. What happened here is that the script automatically added meta elements with name and content attributes, for compatibility with OPF 2. @rdeltour I'm surprised that Ace complains about it.
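To make the distinction concrete, here is a small Python sketch that separates the two forms. The package fragment is illustrative only, written from the description above and not copied from the attached files, and `classify_meta` is a hypothetical helper. How Ace actually gathers the values is not stated in this thread; the sketch only shows that a reader collecting by key, regardless of form, would end up with two "none" values.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"

# Illustrative package fragment (assumption), not taken from the attachments.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata>
    <meta property="schema:accessibilityHazard">none</meta>
    <meta name="schema:accessibilityHazard" content="none"/>
  </metadata>
</package>"""


def classify_meta(opf_xml):
    """Return (style, key, value) for each <meta> in a package document."""
    root = ET.fromstring(opf_xml)
    result = []
    for meta in root.iter(f"{{{OPF_NS}}}meta"):
        if meta.get("property") is not None:
            # EPUB 3 form: the value is the element's text content.
            result.append(("epub3", meta.get("property"), (meta.text or "").strip()))
        elif meta.get("name") is not None:
            # OPF 2 form: the value lives in the content attribute.
            result.append(("opf2", meta.get("name"), meta.get("content", "")))
    return result


for style, key, value in classify_meta(SAMPLE_OPF):
    print(style, key, value)
# prints:
# epub3 schema:accessibilityHazard none
# opf2 schema:accessibilityHazard none
```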

bertfrees commented 3 months ago

@rdeltour Are the compatibility meta elements not supported by Ace?