daisy / pipeline-modules

Modules for the DAISY Pipeline project

Major code cleanup, improved validation reporting, and NIMAS support #58

Closed bertfrees closed 1 year ago

bertfrees commented 1 year ago

While working on NIMAS support I ran into a lot of side issues, which led to a big code cleanup and various improvements to the validation-related steps. See the individual commit messages for more details.

I actually made very few changes for NIMAS itself. After all, NIMAS is almost identical to DTBook. The main difference is that the meta element must be empty in NIMAS. For this reason, when the input is NIMAS (which you can indicate with a new option on the dtbook-to-zedai, dtbook-to-html and dtbook-to-epub3 scripts), the px:dtbook-to-zedai step does not generate a MODS file, because that file would otherwise be empty and therefore invalid.
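To make the NIMAS rule above concrete, here is a rough Python sketch of the decision. It is only an illustration, not the actual XProc code in this PR: the function name `plan_dtbook_to_zedai_outputs`, the parameter `input_is_nimas`, and the output labels are hypothetical and do not correspond to real option or port names.

```python
def plan_dtbook_to_zedai_outputs(input_is_nimas: bool) -> list[str]:
    """Return the documents the conversion step would produce.

    Hypothetical helper for illustration only; the real logic lives in
    the px:dtbook-to-zedai XProc step.
    """
    # The ZedAI document is always produced.
    outputs = ["zedai"]

    # For NIMAS input the MODS file would be empty and therefore invalid,
    # so it is skipped rather than generated.
    if not input_is_nimas:
        outputs.append("mods")

    return outputs


if __name__ == "__main__":
    print(plan_dtbook_to_zedai_outputs(input_is_nimas=False))
    print(plan_dtbook_to_zedai_outputs(input_is_nimas=True))
```

The point of the sketch is that the MODS file is left out entirely for NIMAS input, instead of being emitted empty and then failing validation.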

I also fixed one thing in the NIMAS schema that Nicole said was wrong.

I found various issues in px:dtbook-to-zedai that can result in invalid ZedAI output (for valid DTBook input). These issues are not related to NIMAS and are not addressed yet. For now I have disabled validation of the intermediate ZedAI in px:dtbook-to-html and px:dtbook-to-epub3, because validation issues in the ZedAI do not necessarily result in bad HTML, so the user should not be bothered with them.

The reported issues in dtbook-to-rtf and dtbook-to-odt have not been addressed yet. It is quite likely that these are not related to NIMAS either.