nlbdev / nordic-epub3-dtbook-migrator

Tools for converting between a strict subset of DTBook and EPUB3.
http://nlbdev.github.io/nordic-epub3-dtbook-migrator/
GNU Lesser General Public License v2.1

Titles are validating even without <!DOCTYPE html> #540

Open oscarlcarlsson opened 1 year ago

oscarlcarlsson commented 1 year ago

I just noticed that files pass the validator even though they are missing <!DOCTYPE html>. I believe this has been an issue for a long time, since several of our enriched files are missing it. According to section 2.5.1.2, Document Type Declaration, in the 2020-1 guidelines, this is required. Is this left out on purpose?

josteinaj commented 1 year ago

Checking the doctype (or the XML declaration) is not possible using RelaxNG or Schematron. The validator that was built on XProc and Pipeline 2 used Java to extract the doctype and XML declaration, but it also had a fallback to a pure XProc/XSLT implementation (although that was less efficient).

@kalaspuffar do you know if the doctype validation was copied over from the old code (or reimplemented)?

kalaspuffar commented 1 year ago

Hi.

We have added a card for this in Trello and will address it in priority order.

Best regards, Daniel

oscarlcarlsson commented 1 year ago

I just had a look at some files where <!DOCTYPE html> was not included, and they still pass the validator. Any thoughts on whether this should be allowed or not?

josteinaj commented 1 year ago

The XML and DOCTYPE declarations are required in the guidelines, so they should also be required in the validator. Both for 2015-1 and 2020-1.
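
For illustration only, here is a minimal sketch of the kind of check discussed in this thread, reading the raw text of a content document instead of going through RelaxNG or Schematron. It is not the validator's actual code. The function name `check_prolog`, the message strings, and the exact patterns accepted are assumptions made for this example, and Python is used purely for brevity.

```python
import re

# XML declaration at the very start of the file (an optional BOM is tolerated).
XML_DECL = re.compile(r'^\ufeff?<\?xml\s+version\s*=\s*["\']1\.[0-9]+["\'][^?]*\?>')
# The doctype form named in this issue; XML is case-sensitive, so no IGNORECASE.
DOCTYPE_HTML = re.compile(r'<!DOCTYPE\s+html\s*>')
COMMENT = re.compile(r'<!--.*?-->', re.DOTALL)
# First start tag: "<" followed by an XML name start character.
ROOT_START = re.compile(r'<[A-Za-z_:]')


def check_prolog(text):
    """Return a list of problems found before the root element (hypothetical helper)."""
    problems = []
    if not XML_DECL.match(text):
        problems.append('missing XML declaration')
    # Drop comments so a commented-out doctype or tag is not counted.
    stripped = COMMENT.sub('', text)
    root = ROOT_START.search(stripped)
    # The prolog is everything before the root element's start tag.
    prolog = stripped[:root.start()] if root else stripped
    if not DOCTYPE_HTML.search(prolog):
        problems.append('missing <!DOCTYPE html>')
    return problems
```

Calling `check_prolog` on the decoded text of each XHTML file returns an empty list when both declarations are present. This simplified version does not handle every legal prolog (for example, a processing instruction whose content looks like a start tag), so a real implementation would likely rely on a parser that exposes the doctype.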