nlbdev / nordic-epub3-dtbook-migrator

Tools for converting between a strict subset of DTBook and EPUB3.
http://nlbdev.github.io/nordic-epub3-dtbook-migrator/
GNU Lesser General Public License v2.1

Allow arbitrary order of header elements. #536

Open kalaspuffar opened 1 year ago

kalaspuffar commented 1 year ago

Hindenburg creates EPUBs whose <head> children appear in an order that the validation does not allow:

    <head>
        <meta charset="UTF-8" />
        <title>Bibliotekariestereotypen i populärkulturen</title>
        <meta name="dc:title" content="Bibliotekariestereotypen i populärkulturen" />
        <meta name="dc:creator" content="Richard Ohlsson" />
        <meta name="dc:identifier" content="V002631" />
        <meta name="dc:format" content="ePub 3.0" />
        <meta name="dc:publisher" content="MTM" />
        <meta name="ncc:generator" content="Narrator Studio 1.50.2450" />
        <meta name="viewport" content="width=device-width" />
        <link rel="stylesheet" type="text/css" href="style.css" />
    </head>

The validation expects the head to begin with these four elements, in this order:

    <meta charset="UTF-8" />
    <title>Bibliotekariestereotypen i populärkulturen</title>
    <meta name="dc:identifier" content="V002631" />
    <meta name="viewport" content="width=device-width" />

After those four, any number of further meta or stylesheet link elements (including none) is allowed.
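To make the rule concrete, here is a minimal Python sketch of the positional check described above. It is not the project's RNG schema. The function names and the way each child is reduced to a key are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Keys for the four elements that must come first, in this order.
REQUIRED_PREFIX = ["meta@charset", "title", "meta:dc:identifier", "meta:viewport"]

def local(tag):
    """Strip an XML namespace such as '{http://www.w3.org/1999/xhtml}'."""
    return tag.rsplit("}", 1)[-1]

def describe(el):
    """Reduce a <head> child to a comparable key (hypothetical encoding)."""
    name = local(el.tag)
    if name == "meta":
        if "charset" in el.attrib:
            return "meta@charset"
        return "meta:" + el.get("name", "")
    return name

def head_follows_fixed_order(head_xml):
    """True if the first four children match REQUIRED_PREFIX and every
    remaining child is a named meta or a link element."""
    keys = [describe(child) for child in ET.fromstring(head_xml)]
    if keys[:4] != REQUIRED_PREFIX:
        return False
    return all(k.startswith("meta:") or k == "link" for k in keys[4:])

sample = """<head>
  <meta charset="UTF-8" />
  <title>Example</title>
  <meta name="dc:title" content="Example" />
  <meta name="dc:identifier" content="V002631" />
  <meta name="viewport" content="width=device-width" />
</head>"""
print(head_follows_fixed_order(sample))  # prints False: dc:title precedes dc:identifier
```

A head in the style Hindenburg produces fails this check only because of ordering; all four required elements are present.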

The problem is that RNG (RELAX NG) does not let us interleave patterns that match the same element name. The three required meta elements all share the name meta, so they aren't unique and we can't use interleave to allow them in any order.

kalaspuffar commented 1 year ago

Hi @josteinaj

How do we handle this? Remove the RNG validation and rewrite it in Schematron?

Or do you have a better solution here?

Best regards Daniel

josteinaj commented 1 year ago

Could it be done with a <choice> in RNG?

josteinaj commented 1 year ago

Also, is this the 2015-1 guidelines?

kalaspuffar commented 1 year ago

Hi @josteinaj

I'm not sure we can do it with choice. A choice would say "pick any one of these", and we would need four of them. But that also accepts duplicates, in which case one of the required elements could be missing.

Best regards Daniel
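The concern about choice can be illustrated with a rough analogy in Python. This is not RNG: the four required head children are encoded as single letters (a hypothetical encoding for illustration), and a repeated choice is modelled as a regular expression.

```python
import re

# Hypothetical one-letter encoding of the required <head> children:
# C = meta charset, T = title, I = dc:identifier, V = viewport
ANY_FOUR = re.compile(r"[CTIV]{4}")  # "choose any one of these", four times

def accepted_by_choice(children):
    """True if the sequence is four items, each drawn from the choice."""
    return ANY_FOUR.fullmatch(children) is not None

def has_each_exactly_once(children):
    """The rule we actually want: every required item present exactly once."""
    return sorted(children) == sorted("CTIV")

# Reordered but complete: both checks agree.
print(accepted_by_choice("VICT"), has_each_exactly_once("VICT"))
# Duplicate identifier, viewport missing: the choice still accepts it.
print(accepted_by_choice("CTII"), has_each_exactly_once("CTII"))
```

The second sequence is the problematic case: the repeated choice accepts it even though viewport never occurs.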

kalaspuffar commented 1 year ago

> Also, is this the 2015-1 guidelines?

I don't know which guidelines Hindenburg's output follows; I have never worked with it.

josteinaj commented 1 year ago

Ok. I haven't actually used Hindenburg myself, but as far as I know, Hindenburg does as little as possible to the EPUB. The EPUB you put in is the EPUB you get out, except for the addition of media overlays and some other changes. I don't think it forces the output to conform to the Nordic markup guidelines.

Rewriting the requirement in Schematron seems sensible. I have now compared the 2015-1 and 2020-1 guidelines. The 2015-1 guidelines say that charset, title, dc:identifier and viewport must occur as the first four children, in that specific order. The 2020-1 guidelines only require dc:identifier and viewport, with no positional requirement.

So yes, for the 2020-1 validation you can rewrite the checks for dc:identifier and viewport as Schematron rules. The 2015-1 validation can be left as-is.
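As a rough illustration of what order-independent rules for 2020-1 would assert, here is a Python sketch of the same logic. It is neither Schematron nor the project's implementation; the function name and the returned list of missing names are assumptions.

```python
import xml.etree.ElementTree as ET

# Per the comparison above, 2020-1 only requires these two meta names,
# anywhere inside <head>.
REQUIRED_META = ("dc:identifier", "viewport")

def missing_required_meta(head_xml):
    """Return the required meta names that do not occur in <head>."""
    head = ET.fromstring(head_xml)
    present = {
        child.get("name")
        for child in head
        if child.tag.rsplit("}", 1)[-1] == "meta"  # ignore any namespace
    }
    return [name for name in REQUIRED_META if name not in present]

sample = """<head>
  <meta charset="UTF-8" />
  <title>Example</title>
  <meta name="dc:title" content="Example" />
  <meta name="dc:identifier" content="V002631" />
  <meta name="viewport" content="width=device-width" />
  <link rel="stylesheet" type="text/css" href="style.css" />
</head>"""
print(missing_required_meta(sample))  # prints []: both are present, order is irrelevant
```

Because the check only tests for presence, a head ordered the way Hindenburg writes it passes, while a head lacking either element is reported by name.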

oscarlcarlsson commented 1 year ago

The files we have used for these validation checks have all followed the 2020-1 guidelines.