libero / publisher

The starting point for raising issues for Libero Publisher
MIT License

Libero schemas are malformed #159

Closed GiancarloFusiello closed 5 years ago

GiancarloFusiello commented 5 years ago

The RelaxNG schema files that reside in the libero/schemas repository are missing the start element/tag under the grammar element/tag. According to the RelaxNG schema this tag is required.

I discovered this issue when trying to use the schemas for validation in another project. I'm surprised it was not picked up by the schema tests, which suggests that the testing method may need to be revised.

Fix:

giorgiosironi commented 5 years ago

Comparing https://relaxng.org/relaxng.rng and https://relaxng.org/spec-20011203.html, they both say:

<define name="grammar-content">
  <interleave>
    <ref name="other"/>
    <zeroOrMore>
      <choice>
        <ref name="start-element"/>
        <ref name="define-element"/>
        <element name="div">
          <ref name="common-atts"/>
          <ref name="grammar-content"/>
        </element>
        <element name="include">
          <attribute name="href"><data type="anyURI"/></attribute>
          <ref name="common-atts"/>
          <ref name="include-content"/>
        </element>
      </choice>
    </zeroOrMore>
  </interleave>
</define>

giving 3 valid alternatives to start (define, div and include); e.g. https://github.com/libero/schemas/blob/master/api/content/item.rng has both includes and defines. What is an example of a .rng that is missing start and should have it?

giorgiosironi commented 5 years ago

Worth noting that Section 5 in https://relaxng.org/spec-20011203.html states:

After applying all the rules in Section 4, the schema will match the following grammar:
<grammar> <start> top </start> define* </grammar>

which makes start mandatory after e.g. all includes have been resolved.
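To make that rule concrete, here is a rough sketch of a static check using only the Python standard library. It reports whether a grammar would end up with a start once its includes are followed. The helper name has_start, the file names library.rng and wrapper.rng, and the item element are all made up for illustration. The check assumes include hrefs are relative file paths, and it is far from a full implementation of the Section 4 simplification rules.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

RNG = "{http://relaxng.org/ns/structure/1.0}"

# Hypothetical "library" grammar: defines only, no <start>.
LIBRARY = """<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <define name="item"><element name="item"><text/></element></define>
</grammar>"""

# Hypothetical wrapper grammar: includes the library and adds <start>.
WRAPPER = """<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <include href="library.rng"/>
  <start><ref name="item"/></start>
</grammar>"""


def has_start(path):
    """Return True if the grammar at `path` declares <start>, either
    directly, inside a <div>, or through an <include> (assumed to be a
    relative file path)."""
    root = ET.parse(path).getroot()
    base = os.path.dirname(os.path.abspath(path))

    def scan(node):
        for child in node:
            if child.tag == RNG + "start":
                return True
            if child.tag == RNG + "div" and scan(child):
                return True
            if child.tag == RNG + "include":
                # A start may sit in the include body or in the included file.
                included = os.path.join(base, child.get("href"))
                if scan(child) or has_start(included):
                    return True
        return False

    return root.tag == RNG + "grammar" and scan(root)


def demo():
    """Write both grammars to a temporary directory and check each one."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = {}
        for name, body in (("library.rng", LIBRARY), ("wrapper.rng", WRAPPER)):
            paths[name] = os.path.join(tmp, name)
            with open(paths[name], "w", encoding="utf-8") as fh:
                fh.write(body)
        return has_start(paths["library.rng"]), has_start(paths["wrapper.rng"])
```

Here demo() checks the library grammar on its own and then the wrapper that includes it. A check along these lines could tell a test suite which .rng files are meant to be entry points and which are only building blocks.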

giorgiosironi commented 5 years ago

Unclear to me whether libxml2 (the C library that lxml binds to) implements this simplification process itself or not.

giorgiosironi commented 5 years ago

http://gnome-lib.996298.n3.nabble.com/Possibly-incomplete-step-4-7-of-the-RELAXNG-simplification-process-td27643.html seems to say so, but I could not find any official documentation.

GiancarloFusiello commented 5 years ago

Current error from lxml is: lxml.etree.RelaxNGParseError: Element <grammar> has no <start>, line 6

giorgiosironi commented 5 years ago

My best guess so far, looking at the <start> elements all being in tests/, is that the <grammar>s in api/, core/, extensions/ are not for direct use for validation, but only a library to build your own schema. The schema is then dependent on the journal. If this hypothesis is correct, <grammar>s without a <start> are only allowed if they are to be included in other <grammar>s.

For example, the 3 documents in https://github.com/libero/schemas/tree/master/tests/api/content/item are examples of how an item can be configured. So there seems to be a missing step here: building the elife-style schema (it may be elsewhere, but that uses Schematron).
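If that hypothesis holds, the behaviour can be sketched with lxml roughly as follows. The grammars below are hypothetical stand-ins, not the real libero schemas, and the names library.rng, wrapper.rng and item are invented for the example. The idea is that compiling the library grammar on its own fails with the RelaxNGParseError reported above, while a small wrapper grammar that includes it and adds a start compiles and can validate a document.

```python
import os
import tempfile

from lxml import etree

# Hypothetical "library" grammar: defines only, no <start>.
LIBRARY = """<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <define name="item"><element name="item"><text/></element></define>
</grammar>"""

# Hypothetical journal-specific wrapper: includes the library, adds <start>.
WRAPPER = """<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <include href="library.rng"/>
  <start><ref name="item"/></start>
</grammar>"""


def try_compile(path):
    """Return a compiled RelaxNG validator, or None if lxml rejects it."""
    try:
        return etree.RelaxNG(etree.parse(path))
    except etree.RelaxNGParseError:
        return None


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        library_path = os.path.join(tmp, "library.rng")
        wrapper_path = os.path.join(tmp, "wrapper.rng")
        for path, body in ((library_path, LIBRARY), (wrapper_path, WRAPPER)):
            with open(path, "w", encoding="utf-8") as fh:
                fh.write(body)

        library = try_compile(library_path)  # no <start>, expected to fail
        wrapper = try_compile(wrapper_path)  # has <start>, expected to compile
        document = etree.fromstring("<item>hello</item>")
        valid = wrapper is not None and wrapper.validate(document)
        return library is None, wrapper is not None, valid
```

Calling demo() returns three flags: whether the library grammar was rejected, whether the wrapper compiled, and whether the sample document validated against the wrapper.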

GiancarloFusiello commented 5 years ago

@thewilkybarkid Can you confirm if @giorgiosironi is correct above?

thewilkybarkid commented 5 years ago

> My best guess so far, looking at the <start> elements all being in tests/, is that the <grammar>s in api/, core/, extensions/ are not for direct use for validation, but only a library to build your own schemas.

Is the right answer. The idea is that the documents are your own, they just have a top-level requirement (eg /libero:item/libero:meta).

This is complex though, so part of #153 is rethinking this entirely...

GiancarloFusiello commented 5 years ago

> Is the right answer. The idea is that the documents are your own, they just have a top-level requirement (eg /libero:item/libero:meta).
>
> This is complex though, so part of #153 is rethinking this entirely...

@thewilkybarkid When you have completed your investigation, can you please update https://github.com/libero/libero/issues/161 with the process for validation?

giorgiosironi commented 5 years ago

So, I agree this should be closed once https://github.com/libero/libero/issues/161 is up-to-date.

thewilkybarkid commented 5 years ago

Going to close now since it's not a bug.