TEIC / Stylesheets

TEI XSL Stylesheets

ODD processing and custom attributes (of existing names) #237

Open pdaengeli opened 7 years ago

pdaengeli commented 7 years ago

In some cases the stylesheets (ODD to RNG) produce invalid output. This happens when I try to create attributes in a custom namespace whose local-name already exists in the tei namespace (e.g. @n, @type, @target, @unit, @source; it works for others such as @subtype or @select).

I get the following two types of errors:

Is this an intentional limitation? Shouldn't non-TEI namespaces have full autonomy?

Or am I approaching this the wrong way? I tried with:

<classSpec ident="att.global" type="atts" mode="change" module="tei">
    <attList>
        <attDef ident="n" ns="http://www.cceh.uni-koeln.de/hal/ns/1.0">
            <desc>The 'hal:n' attribute stores legacy information to keep track of the origin of the text value of its parent element.</desc>
            <datatype>
                <rng:text/>
            </datatype>
        </attDef>
    </attList>
</classSpec>

which led to the errors mentioned above, but also with <attDef ident="hal:n" ns="http://www.cceh.uni-koeln.de/hal/ns/1.0">, which did in fact let me define any attribute; however, the namespace prefix was undefined in the output.

Some hints on how to set up such attributes (of the same name but in a custom namespace) in an ODD specification would be appreciated. I couldn't find much information on this (other than the inclusion of externally defined namespaces such as svg).

martindholmes commented 7 years ago

One minor point: regular TEI attributes are not in the TEI namespace, they're in the empty namespace. By definition, all unprefixed attributes are in the empty namespace; they don't inherit the namespace of their parent element. That's an XML thing, not a TEI thing.
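
To illustrate the point, here is a minimal instance sketch. The hal prefix and namespace URI are taken from the report above; the attribute values and the text content are made up for illustration.

```xml
<!-- The default namespace declaration applies to the element p only.
     The unprefixed n is in no namespace; hal:n is in the hal namespace.
     Their expanded names differ, so both can appear on one element. -->
<p xmlns="http://www.tei-c.org/ns/1.0"
   xmlns:hal="http://www.cceh.uni-koeln.de/hal/ns/1.0"
   n="1"
   hal:n="legacy-value">Some text</p>
```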

pdaengeli commented 7 years ago

Ok, that makes sense. But an attribute in a namespace should generally be able to co-exist with an unqualified one of the same local-name, shouldn't it?

jamescummings commented 7 years ago

@pdaengeli Definitely an attribute in a namespace with the same local name should be able to co-exist with an unqualified one with the same local-name. I'll try to do some tests today to see if I can duplicate the bug.

lb42 commented 7 years ago

I can confirm that I can replicate this bug/feature of the stylesheets. If you add a non-TEI element with the same name as an existing TEI element, you can use the attribute @prefix on <elementSpec> to ensure that the pattern generated from the name in the RelaxNG remains unique. But there is no analogous facility for attributes, partly because TEI attributes don't belong to any namespace (as Martin points out), partly because nobody thought of providing it. You could of course just call your new attribute "notN" or something to show that it's different.
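
A minimal sketch of the @prefix facility for elements, assuming a hypothetical non-TEI element p in the hal namespace from the original report. The prefix value hal_ and the resulting pattern name hal_p are assumptions for illustration, not verified stylesheet output.

```xml
<!-- Hypothetical: a non-TEI p element. The value of @prefix is
     prepended to the generated RELAX NG pattern name, so this should
     yield a pattern such as hal_p instead of clashing with the
     pattern generated for the TEI p element. -->
<elementSpec xmlns="http://www.tei-c.org/ns/1.0"
             ident="p"
             ns="http://www.cceh.uni-koeln.de/hal/ns/1.0"
             prefix="hal_"
             mode="add">
   <desc>A paragraph-like element in the hal namespace (illustrative only).</desc>
   <content>
      <textNode/>
   </content>
</elementSpec>
```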

jamescummings commented 7 years ago

Just to say that I can confirm this as well. I added some new attributes to the TEI element, using the tei_math.odd exemplar as a starting point (since I wanted something that already worked with another namespace ;-) ). Adding a new attribute in another namespace of course worked fine. Adding a new attribute in another namespace that has the same local-name as an existing TEI attribute did not work.

The ODD I used is: https://gist.github.com/jamescummings/155429676474d9416b8251c717776657

And the define element in the RNG it produced was:

   <define name="tei_TEI">
      <element name="TEI">
         <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0">(TEI document) contains a single TEI-conformant document, combining a single TEI header with one or more members of the model.resourceLike class. Multiple TEI elements may be combined to form a teiCorpus element. [4. Default Text Structure 15.1. Varieties of Composite Text]</a:documentation>
         <group>
            <ref name="tei_teiHeader"/>
            <oneOrMore>
               <ref name="tei_model.resourceLike"/>
            </oneOrMore>
         </group>
         <ns xmlns="http://purl.oclc.org/dsdl/schematron"
             prefix="tei"
             uri="http://www.tei-c.org/ns/1.0"/>
         <ns xmlns="http://purl.oclc.org/dsdl/schematron"
             prefix="xs"
             uri="http://www.w3.org/2001/XMLSchema"/>
         <ns xmlns="http://purl.oclc.org/dsdl/schematron"
             prefix="rng"
             uri="http://relaxng.org/ns/structure/1.0"/>
         <ref name="tei_att.global.attribute.xmlid"/>
         <ref name="tei_att.global.attribute.xmllang"/>
         <ref name="tei_att.global.attribute.xmlbase"/>
         <ref name="tei_att.global.attribute.xmlspace"/>
         <ref name="tei_att.global.rendition.attribute.rend"/>
         <ref name="tei_att.global.rendition.attribute.style"/>
         <ref name="tei_att.global.rendition.attribute.rendition"/>
         <ref name="tei_att.global.responsibility.attribute.cert"/>
         <ref name="tei_att.global.responsibility.attribute.resp"/>
         <ref name="tei_att.global.source.attribute.source"/>
         <ref name="tei_att.typed.attributes"/>
         <optional>
            <attribute name="n" ns="http://www.w3.org/1998/Math/MathML">
               <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
               <choice>
                  <value>NValue</value>
                  <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0">NValue</a:documentation>
               </choice>
            </attribute>
         </optional>
         <optional>
            <attribute name="blort" ns="http://www.w3.org/1998/Math/MathML">
               <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
               <choice>
                  <value>foo</value>
                  <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0">foo</a:documentation>
               </choice>
            </attribute>
         </optional>
         <optional>
            <attribute name="n" ns="http://www.w3.org/1998/Math/MathML">
               <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"/>
               <choice>
                  <value>NValue</value>
                  <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0">NValue</a:documentation>
               </choice>
            </attribute>
         </optional>
         <optional>
            <attribute name="version">
               <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0">specifies the major version number of the TEI Guidelines against which this document is valid.</a:documentation>
               <data type="token">
                  <param name="pattern">[\d]+(\.[\d]+){0,2}</param>
               </data>
            </attribute>
         </optional>
         <empty/>
      </element>
   </define>

Here you can see that there are two optional/attribute elements for the n attribute in the MathML namespace. For some reason our processing is doing its job twice here.

However, I note that if I delete this additional optional element for the mathml:n attribute, it doesn't suddenly let me use the TEI global n attribute here. So two things are wrong I think.
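
For comparison, a minimal RELAX NG sketch of what one might expect instead: the unqualified n and the MathML-namespaced n each declared exactly once. This is an illustration, not generated output. The datatype of the unqualified n is simplified to text, and everything else in the real define is left out.

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
   <start>
      <element name="TEI" ns="http://www.tei-c.org/ns/1.0">
         <!-- Unqualified n: no ns attribute, so the name is in no namespace -->
         <optional>
            <attribute name="n">
               <text/>
            </attribute>
         </optional>
         <!-- Namespaced n: same local name, different expanded name,
              so the two attribute patterns do not overlap -->
         <optional>
            <attribute name="n" ns="http://www.w3.org/1998/Math/MathML">
               <value>NValue</value>
            </attribute>
         </optional>
         <empty/>
      </element>
   </start>
</grammar>
```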

pdaengeli commented 7 years ago

Thank you for testing, @jamescummings and @lb42.

Perhaps the problem pointed out by @lb42 is related to mode="add". I left this out in my ODD (which is very close to this gist) following a hint on TEI-L by Sebastian Rahtz from long ago: https://listserv.brown.edu/archives/cgi-bin/wa?A2=TEI-L;95cce31d.0605

Did you also try this with @subtype or @select? Unlike @n, they don't seem to clash (both using Lou's and my example).

For my actual use case I'll use @hal:desc, which is unproblematic.

sydb commented 3 years ago

Current plan is for the TEI Stylesheets group to discuss this at our next meeting, scheduled for Thu 28 Jan 21. I will try to do some experimentation and send a heads-up to the group ~1 week from today.

sydb commented 3 years ago

Another starting point ODD for testing things. Note that the file has “.txt” appended to the name to convince GitHub it is safe. (I have to figure out how to use gists some day, I guess.) The output RELAX NG generated (at least on my system) is not valid, because the prefix used in some of the @idents is not defined. Am I forgetting something about how to tell ODD about the prefix?

martindholmes commented 3 years ago

@sydb @npcole and I tested a parallel problem with elements, and discovered that if you add an element in a different namespace but with the same local-name as a TEI element, not only is there no way to refer to it using an elementRef, but when you generate the schema, the pattern for that element is not duplicated, it's completely absent. So we're thinking that an approach to both these problems might be to allow @ns on <attRef>, <elementRef> etc., and have a rule whereby if it's NOT supplied the stylesheets assume you mean the empty namespace (for an attribute) or the TEI namespace (for elements).

Here's a sample ODD:

<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:hcmc="http://hcmc.uvic.ca/ns">
  <teiHeader>
      <fileDesc>
         <titleStmt>
            <title>Title</title>
         </titleStmt>
         <publicationStmt>
            <p>Publication Information</p>
         </publicationStmt>
         <sourceDesc>
            <p>Information about the source</p>
         </sourceDesc>
      </fileDesc>
  </teiHeader>
  <text>
      <body>
         <schemaSpec ident="myTEI">
            <moduleRef key="tei"/>
            <moduleRef key="core"/>
            <moduleRef key="textstructure"/>
            <moduleRef key="header"/>

            <elementSpec ident="p" ns="http://hcmc.uvic.ca/ns" mode="add">
               <gloss>Funny para</gloss>
               <desc>Very funny para</desc>
               <content>
                  <textNode/>
               </content>
            </elementSpec>

            <elementSpec ident="div" mode="change">
               <content>
                  <elementRef key="hcmc_p"/>
               </content>
            </elementSpec>

         </schemaSpec>
      </body>
  </text>
</TEI>
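
A sketch of how the div customization above might look under the proposed rule. The @ns attribute on elementRef is hypothetical: it is the proposal under discussion, not something the stylesheets are shown to support in this thread. The alternate wrapper and its occurrence bounds are also assumptions made for illustration.

```xml
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" ident="div" mode="change">
   <content>
      <alternate minOccurs="1" maxOccurs="unbounded">
         <!-- HYPOTHETICAL @ns: would select the p defined in the hcmc namespace -->
         <elementRef key="p" ns="http://hcmc.uvic.ca/ns"/>
         <!-- No @ns: under the proposed rule this would mean the TEI p -->
         <elementRef key="p"/>
      </alternate>
   </content>
</elementSpec>
```

An analogous @ns on attRef would default to the empty namespace instead.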

dmj commented 2 years ago

I think the root of this problem is that @ident serves a double purpose:

  1. It is used to identify the element or attribute specification.
  2. It is used to give the local name of the specified element or attribute.

"So we're thinking that an approach to both these problems might be to allow @ns on <attRef>, <elementRef> etc., and have a rule whereby if it's NOT supplied the stylesheets assume you mean the empty namespace (for an attribute) or the TEI namespace (for elements)."

This would mix both functions: The namespace URI belongs to 2, not 1.

I think it would be better to use @source on elementRef et al. It is already used to identify an element etc. specification by means of referring to an external file.

Another option discussed in #2285 is to use altIdent to give the local name of the specified element.

<schemaSpec ident="myTEI">
   <moduleRef key="tei"/>
   <moduleRef key="core"/>
   <moduleRef key="textstructure"/>
   <moduleRef key="header"/>

   <elementSpec ident="hcmc_p" ns="http://hcmc.uvic.ca/ns" mode="add">
      <altIdent>p</altIdent>
      <gloss>Funny para</gloss>
      <desc>Very funny para</desc>
      <content>
         <textNode/>
      </content>
   </elementSpec>

   <elementSpec ident="div" mode="change">
      <content>
         <elementRef key="hcmc_p"/>
      </content>
   </elementSpec>

</schemaSpec>
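
Applied to the attribute case from the original report, the same altIdent idea might look like the sketch below. This is an untested assumption: the ident value hal_n is hypothetical, and nothing in this thread confirms that the stylesheets honor altIdent this way for attributes.

```xml
<classSpec xmlns="http://www.tei-c.org/ns/1.0"
           xmlns:rng="http://relaxng.org/ns/structure/1.0"
           ident="att.global" type="atts" mode="change" module="tei">
   <attList>
      <!-- Hypothetical: @ident only identifies the specification,
           while altIdent supplies the local name n -->
      <attDef ident="hal_n" ns="http://www.cceh.uni-koeln.de/hal/ns/1.0" mode="add">
         <altIdent>n</altIdent>
         <desc>Stores legacy information about the origin of the text value of the parent element.</desc>
         <datatype>
            <rng:text/>
         </datatype>
      </attDef>
   </attList>
</classSpec>
```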