nlbdev / nordic-epub3-dtbook-migrator

Tools for converting between a strict subset of DTBook and EPUB3.
http://nlbdev.github.io/nordic-epub3-dtbook-migrator/
GNU Lesser General Public License v2.1

Invalid HTML for dtbook:p inside dtbook:epigraph #368

Closed (egli closed this issue 5 years ago)

egli commented 5 years ago

The problem with epigraphs is not so much that we can have a dtbook:epigraph inside a dtbook:p (as described in #362), but that we can have a dtbook:p inside a dtbook:epigraph. Since dtbook:epigraph is currently rendered as html:p, this translates to an html:p inside an html:p, which is not valid HTML5.

We propose to render dtbook:epigraph as html:blockquote as recommended by WAI-ARIA. This will solve this issue and at the same time #362.

See also sbsdev/pipeline#11

josteinaj commented 5 years ago

Sounds good. Although, in the nordic migrator I think we should use epub:type="epigraph" instead of role="doc-epigraph". Also, we should test to make sure that epigraphs are still converted back from HTML to DTBook properly.
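As a rough illustration of the mapping discussed so far, here is a minimal Python sketch (not the migrator's actual XSLT) that renders a dtbook:epigraph as an html:blockquote carrying epub:type="epigraph", so that a nested dtbook:p no longer ends up as a p inside a p. The `convert` function and `TAG_MAP` table are hypothetical names made up for this example; only the namespaces are the standard DTBook, XHTML and EPUB ones.

```python
import xml.etree.ElementTree as ET

DTBOOK_NS = "http://www.daisy.org/z3986/2005/dtbook/"
XHTML_NS = "http://www.w3.org/1999/xhtml"
EPUB_NS = "http://www.idpf.org/2007/ops"

# Hypothetical mapping table: only the epigraph is renamed, every other
# element keeps its local name and simply moves to the XHTML namespace.
TAG_MAP = {"epigraph": "blockquote"}


def convert(elem):
    """Recursively copy a DTBook element into the XHTML namespace."""
    local = elem.tag.split("}", 1)[-1]  # drop the "{namespace}" prefix
    out = ET.Element(f"{{{XHTML_NS}}}{TAG_MAP.get(local, local)}")
    if local == "epigraph":
        # Mark the semantics with epub:type rather than role="doc-epigraph".
        out.set(f"{{{EPUB_NS}}}type", "epigraph")
    out.text = elem.text
    out.tail = elem.tail
    for child in elem:
        out.append(convert(child))
    return out


source = ET.fromstring(
    f'<epigraph xmlns="{DTBOOK_NS}"><p>To be, or not to be.</p></epigraph>'
)
result = convert(source)
```

With this mapping the inner dtbook:p becomes an html:p inside an html:blockquote, which HTML5 allows.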

egli commented 5 years ago

Hm, forgot about backtranslation. What's funny though is that epub3-to-dtbook.xsl is expecting an html:aside. On the other hand dtbook-to-epub3.xsl is producing html:p. So a round-trip doesn't seem to work anyway.

Not sure if we are interested in epub3-to-dtbook. But presumably for us it would have to convert to dtbook:epigraph while you want it to convert to dtbook:p. Correct?

josteinaj commented 5 years ago

Yeah, epigraph is actually disallowed in the rules we inherited from MTM into the nordic DTBook schematron: https://github.com/nlbdev/nordic-epub3-dtbook-migrator/blob/master/src/main/resources/xml/schema/mtm2015-1.sch#L472

The fact that there's a rule for it in dtbook-to-epub3.xsl is probably because: "why not?"

I suggest leaving the backtranslation as is, i.e. create a <p> in DTBook, since epigraph is not allowed in nordic DTBook. Just so we don't break anyone's systems (although unlikely).

For DTBook to EPUB3, it's initially fine to convert as you suggest (as it's non-standard anyway). But how do we handle validation of the result?

You probably have to set assert-valid=false when running the conversion anyway, so that the epigraphs get through as input. That means you'll probably also get an HTML or EPUB as output even though there are validation errors. Will you validate the output with your own schemas? Or in some way ignore errors about the epigraph?

I suppose we could allow blockquote both as an inline and as a block element in the RelaxNG, and then use Schematron to assert that a blockquote in an inline context is only allowed if it has epub:type="epigraph"?
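The Schematron idea above can be sketched in Python to make the intended rule concrete. This is only an approximation under stated assumptions, not the rule that ended up in the migrator: "inline context" is taken to mean "direct child of p, span, em or strong", and `find_violations` and `INLINE_PARENTS` are hypothetical names.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
EPUB_NS = "http://www.idpf.org/2007/ops"

# Assumption: a blockquote counts as "inline" when it is a direct child
# of one of these elements. The real schema may define this differently.
INLINE_PARENTS = {f"{{{XHTML_NS}}}{name}" for name in ("p", "span", "em", "strong")}
BLOCKQUOTE = f"{{{XHTML_NS}}}blockquote"
EPUB_TYPE = f"{{{EPUB_NS}}}type"


def find_violations(root):
    """Return inline blockquotes that lack epub:type="epigraph"."""
    violations = []
    for parent in root.iter():
        if parent.tag not in INLINE_PARENTS:
            continue
        for child in parent:
            if child.tag != BLOCKQUOTE:
                continue
            # epub:type is a space-separated list of values.
            types = child.get(EPUB_TYPE, "").split()
            if "epigraph" not in types:
                violations.append(child)
    return violations


doc = ET.fromstring(
    f'<body xmlns="{XHTML_NS}" xmlns:epub="{EPUB_NS}">'
    '<p>Intro <blockquote epub:type="epigraph">Allowed</blockquote></p>'
    '<p>Intro <blockquote>Not allowed</blockquote></p>'
    '<blockquote>Block context, always fine</blockquote>'
    '</body>'
)
violations = find_violations(doc)
```

Here only the second blockquote is reported: it sits inside a p but carries no epub:type="epigraph".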

josteinaj commented 5 years ago

Fixed in v1.4.0.