TEIC / Stylesheets

TEI XSL Stylesheets

minOccurs=0 fails to work on `<attDef>` #557

Closed: sydb closed this issue 1 year ago

sydb commented 2 years ago

The @prefix attributes of <schemaSpec>, <elementSpec>, and <moduleRef> are defined the same way:

<schemaSpec>:

      <datatype minOccurs="0">
        <dataRef key="teidata.xmlName"/>
      </datatype>

<elementSpec>:

      <datatype minOccurs="0"><dataRef key="teidata.xmlName"/></datatype>

<moduleRef>:

      <datatype minOccurs="0" maxOccurs="1"><dataRef key="teidata.xmlName"/></datatype>

In all three cases, the attribute generated is attribute prefix { xsd:NCName }? (although in the <moduleRef> case it is surrounded by parentheses, which I do not understand, as the RELAX NG XML syntax looks the same). That seems to me to be a problem: the definition clearly says that the value can be empty (i.e., contain zero teidata.xmlNames), but the schema says it cannot.

What makes this really weird is that in all other cases where an attribute is defined using a @minOccurs of '0'[1] the RELAX NG is correct, e.g.: attribute otherLangs { list { teidata.language* } }?. OH! But I have just discovered a major difference, evident in the footnote: the ones that work have a @maxOccurs of "unbounded".

Notes [1] The 10 attributes defined with @minOccurs of '0' are:

      2 in except, max=unbounded
      2 in include, max=unbounded
      2 in prefix, max=
      1 in atts, max=unbounded
      1 in otherLangs, max=unbounded
      1 in prefix, max=1
      1 in type, max=unbounded

sydb commented 1 year ago

This error raised its ugly head and bit me this afternoon, and is blocking progress on TEI#2347. So I have a) generated a tiny test ODD to demonstrate the problem; b) attached the ODD, its RNC output, and an XML instance to this ticket; c) shown the relevant excerpts below; d) changed the name of this ticket; and e) raised its priority.

Input ODD contains an attribute list that includes

      <attDef ident="req_01" usage="req">
        <datatype minOccurs="0" maxOccurs="1">
          <dataRef key="teidata.word"/>
        </datatype>
      </attDef>

      <attDef ident="opt_01" usage="opt">
        <datatype minOccurs="0" maxOccurs="1">
          <dataRef key="teidata.word"/>
        </datatype>
      </attDef>

      <attDef ident="req_11" usage="req">
        <datatype minOccurs="1" maxOccurs="1">
          <dataRef key="teidata.word"/>
        </datatype>
      </attDef>

      <attDef ident="opt_11" usage="opt">
        <datatype minOccurs="1" maxOccurs="1">
          <dataRef key="teidata.word"/>
        </datatype>
      </attDef>

The expected output is

    attribute req_01 { xsd:token { pattern = "[^\p{C}\p{Z}]+" }? },
    attribute opt_01 { xsd:token { pattern = "[^\p{C}\p{Z}]+" }? }?,
    attribute req_11 { xsd:token { pattern = "[^\p{C}\p{Z}]+" }  },
    attribute opt_11 { xsd:token { pattern = "[^\p{C}\p{Z}]+" }  }?,

Note the locations of the question marks. The actual output is

    attribute req_01 { xsd:token { pattern = "[^\p{C}\p{Z}]+" }  },
    attribute opt_01 { xsd:token { pattern = "[^\p{C}\p{Z}]+" }  }?,
    attribute req_11 { xsd:token { pattern = "[^\p{C}\p{Z}]+" }  },
    attribute opt_11 { xsd:token { pattern = "[^\p{C}\p{Z}]+" }  }?,

Note that the inner question mark (the one directly after the xsd:token pattern) is missing from the first two lines, req_01 and opt_01.
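
To make the placement of the two question marks explicit, here is a minimal Python sketch of the expected mapping for the four test attributes. It is only an illustration, not the Stylesheets code: the function name `rnc_attribute` and the constant `WORD` are hypothetical, and the sketch assumes that the inner `?` comes from minOccurs="0" while the outer `?` comes from usage="opt", which is what the expected output above implies.

```python
# Hypothetical helper, not part of the TEI Stylesheets.
# WORD is the RNC expansion of teidata.word as it appears in the output above.
WORD = r'xsd:token { pattern = "[^\p{C}\p{Z}]+" }'


def rnc_attribute(ident: str, usage: str, min_occurs: int, max_occurs: int = 1) -> str:
    """Return the expected RNC declaration for a single-valued attribute."""
    if max_occurs != 1:
        raise ValueError("this sketch only covers maxOccurs=1")
    # minOccurs="0" makes the *value* optional: '?' inside the braces.
    inner = "?" if min_occurs == 0 else ""
    # usage="opt" makes the *attribute* optional: '?' after the braces.
    outer = "?" if usage == "opt" else ""
    return f"attribute {ident} {{ {WORD}{inner} }}{outer}"


if __name__ == "__main__":
    cases = [("req_01", "req", 0), ("opt_01", "opt", 0),
             ("req_11", "req", 1), ("opt_11", "opt", 1)]
    for ident, usage, min_occurs in cases:
        print(rnc_attribute(ident, usage, min_occurs) + ",")
```

Apart from column-alignment spaces, the four lines printed should match the expected output; the reported bug corresponds to `inner` always being empty when maxOccurs is 1.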

Stylesheets_issue_557_demo.zip

raffazizzi commented 1 year ago

@sydb before I start working on this, is this problem only with the compact syntax, or does it happen in the XML syntax as well? The RNC is generated from the XML via trang, so if the XML is correct, this is a bug in trang.

sydb commented 1 year ago

Both compact and XML. (No such luck, @raffazizzi! :-)

Showing only 1 of them (req_01), the generated XML is:

      <attribute name="req_01">
        <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
        <data type="token">
          <param name="pattern">[^\p{C}\p{Z}]+</param>
        </data>
      </attribute>

which should be

      <attribute name="req_01">
        <optional>
          <data type="token">
            <param name="pattern">[^\p{C}\p{Z}]+</param>
          </data>
        </optional>
      </attribute>
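
A quick way to check generated RELAX NG for this problem is to test whether an attribute's content pattern is wrapped in `<optional>`. The following is a rough Python sketch using only the standard library; the function name `value_is_optional` is hypothetical, and the `xmlns` declaration on `<attribute>` is an assumption added so that each fragment parses on its own (in a full schema it would be inherited from the root element).

```python
import xml.etree.ElementTree as ET

RNG = "http://relaxng.org/ns/structure/1.0"


def value_is_optional(attribute_xml: str) -> bool:
    """True if the attribute's single content pattern is an <optional>."""
    attribute = ET.fromstring(attribute_xml)
    # Keep only RELAX NG children; this skips annotations such as <a:documentation>.
    patterns = [child for child in attribute if child.tag.startswith("{" + RNG + "}")]
    return len(patterns) == 1 and patterns[0].tag == "{" + RNG + "}optional"


# The two fragments from this comment, with the namespace declared explicitly.
ACTUAL = """
<attribute xmlns="http://relaxng.org/ns/structure/1.0" name="req_01">
  <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
  <data type="token"><param name="pattern">[^\\p{C}\\p{Z}]+</param></data>
</attribute>
"""

EXPECTED = """
<attribute xmlns="http://relaxng.org/ns/structure/1.0" name="req_01">
  <optional>
    <data type="token"><param name="pattern">[^\\p{C}\\p{Z}]+</param></data>
  </optional>
</attribute>
"""
```

With these inputs, `value_is_optional(ACTUAL)` is false and `value_is_optional(EXPECTED)` is true.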

raffazizzi commented 1 year ago

Sent a pull request. Datatypes with maxOccurs="1" skipped cardinality handling altogether, so it was necessary to add a special case for minOccurs="0" with maxOccurs="1" that outputs rng:optional.
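
The logic of the special case can be sketched in Python as follows. This is not the XSLT from the pull request, only an illustration of the cardinality mapping discussed in this thread; the function name `wrap_datatype` is hypothetical, the RELAX NG namespace is omitted for brevity, and bounds other than 0 or 1 for minOccurs and 1 or "unbounded" for maxOccurs are deliberately left out.

```python
import xml.etree.ElementTree as ET


def wrap_datatype(content: ET.Element, min_occurs: int, max_occurs) -> ET.Element:
    """Wrap a datatype pattern according to minOccurs/maxOccurs (sketch only)."""
    if min_occurs not in (0, 1):
        raise NotImplementedError("only minOccurs of 0 or 1 is covered here")
    if max_occurs == 1:
        if min_occurs == 0:
            # The special case: zero or one value, so wrap in <optional>.
            optional = ET.Element("optional")
            optional.append(content)
            return optional
        # Exactly one value: no wrapper needed.
        return content
    if max_occurs == "unbounded":
        # Whitespace-separated values, as in the otherLangs example:
        # list { teidata.language* }
        wrapper = ET.Element("list")
        repeat = ET.SubElement(wrapper, "zeroOrMore" if min_occurs == 0 else "oneOrMore")
        repeat.append(content)
        return wrapper
    raise NotImplementedError("other maxOccurs values are outside this sketch")
```

For example, `wrap_datatype(ET.Element("data", {"type": "token"}), 0, 1)` returns an `<optional>` element containing the `<data>` element, while the same call with minOccurs 1 returns the `<data>` element unchanged.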

sydb commented 1 year ago

Wow. That was friggin’ fast.