TEIC / TEI

The Text Encoding Initiative Guidelines
https://www.tei-c.org

Schematron to avoid contradictory use of `@require` and `@except` #2357

Open martindholmes opened 1 year ago

martindholmes commented 1 year ago

At the ATOP meeting today (https://github.com/TEIC/atop/wiki/Meeting-notes#meeting-2022-10-05) @dmj and I looked at the attributes @require and @except on the <anyElement> element (https://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-anyElement.html), and realized that it's possible to simultaneously require a namespace and exclude all elements inside it, by including the same namespace in both attributes. That's contradictory, and would be easy to prevent with a simple Schematron rule like the one for <moduleRef> which says "It is an error to supply both the @include and @except attributes". It would be helpful to have this Schematron in place so that the ODD processor doesn't have to do this check every time.
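
A minimal sketch of such a rule, assuming an XSLT 2 query binding and ODD elements in the usual TEI namespace. The variable names and the message wording are illustrative only and are not taken from the Guidelines:

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
  <sch:pattern>
    <!-- Only anyElement elements that carry both attributes can contradict themselves -->
    <sch:rule context="tei:anyElement[@require and @except]">
      <!-- Both attributes are whitespace-separated lists, so split them into tokens -->
      <sch:let name="required" value="tokenize(normalize-space(@require), ' ')"/>
      <sch:let name="excepted" value="tokenize(normalize-space(@except), ' ')"/>
      <!-- A general comparison is true as soon as any one token occurs in both lists -->
      <sch:report test="$required = $excepted">
        The same namespace is both required and excepted:
        <sch:value-of select="string-join($required[. = $excepted], ', ')"/>
      </sch:report>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

This compares the tokens as plain strings, so two different spellings of the same URI would not be caught.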

dmj commented 1 year ago

Addendum: The rule should also raise an error if @require contains a namespace URI listed in the @defaultExceptions of the containing schemaSpec.
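
A rough sketch of that additional check, again assuming an XSLT 2 query binding. It only looks at a @defaultExceptions value that is actually written on an ancestor schemaSpec, so it would miss a schema-supplied default value and any anyElement that reaches its schemaSpec by reference rather than by nesting. The message text is made up for illustration:

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
  <sch:pattern>
    <!-- Assumption: the schemaSpec is an ancestor and states @defaultExceptions explicitly -->
    <sch:rule context="tei:anyElement[@require][ancestor::tei:schemaSpec/@defaultExceptions]">
      <sch:let name="required" value="tokenize(normalize-space(@require), ' ')"/>
      <sch:let name="defaults"
               value="tokenize(normalize-space(ancestor::tei:schemaSpec[1]/@defaultExceptions), ' ')"/>
      <!-- Fires when any required namespace is also one of the default exceptions -->
      <sch:report test="$required = $defaults">
        A namespace in @require is also listed in @defaultExceptions of the schemaSpec:
        <sch:value-of select="string-join($required[. = $defaults], ', ')"/>
      </sch:report>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```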

martindholmes commented 1 year ago

@sydb suggests that the two attributes are mutually exclusive. However, I can't see where that is enforced, and this seems perfectly valid:

<anyElement xmlns:ex="https://www.example.com" require="https://www.example.com" except="ex:thing"/>

But if they are exclusive, should they be? There are potentially useful combinations of them (require all elements to be in namespace x, but exclude three elements from that namespace).

dmj commented 1 year ago

I can't find that they are exclusive, either. On the contrary, the content model of anyElement reads:

element anyElement
{
   ...,
   attribute require { list { teidata.namespace+ } }?,
   attribute except { list { teidata.namespaceOrName+ } }?,
   empty
}

dmj commented 1 year ago

Addendum 2: The rule should also raise a warning if a combination of @require and @except doesn't make sense, e.g. requiring namespace A while excluding elements from namespace B.
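
One way such a warning could be sketched, assuming an XSLT 2 query binding. It rests on a loud assumption: any @except token of the form prefix:local without a slash is treated as a prefixed element name, which would misclassify URIs such as urn:foo. The role value and the message are illustrative:

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
  <sch:pattern>
    <sch:rule context="tei:anyElement[@require and @except]">
      <sch:let name="here" value="."/>
      <sch:let name="required" value="tokenize(normalize-space(@require), ' ')"/>
      <!-- Assumption: prefix:local with no slash is a prefixed name, not a namespace URI -->
      <sch:let name="names"
               value="tokenize(normalize-space(@except), ' ')[matches(., '^[^:/]+:[^:/]+$')]"/>
      <!-- Resolve each prefix against the in-scope namespaces of this anyElement;
           an unbound prefix resolves to the empty string and so also triggers the warning -->
      <sch:report role="warning"
                  test="some $n in $names satisfies
                        not(string(namespace-uri-for-prefix(substring-before($n, ':'), $here)) = $required)">
        An element name in @except belongs to a namespace that is not listed in @require,
        so excluding it has no effect.
      </sch:report>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

With the earlier example (require="https://www.example.com" except="ex:thing", where ex is bound to that same URI) this sketch stays silent, since the excepted element is in a required namespace.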

sydb commented 1 year ago

They are most definitely defined as exclusive attributes:

  <attList org="choice">
    <attDef ident="require">
      <desc versionDate="2016-11-28" xml:lang="en">supplies a list of namespaces to one of which the
        permitted elements must belong.</desc>
      <datatype minOccurs="1" maxOccurs="unbounded">
        <dataRef key="teidata.namespace"/>
      </datatype>
    </attDef>
    <attDef ident="except">
      <desc versionDate="2016-11-28" xml:lang="en">supplies a list of namespaces or prefixed element 
        names which are not permitted.</desc>
      <datatype minOccurs="1" maxOccurs="unbounded">
        <dataRef key="teidata.namespaceOrName"/>
      </datatype>
      <remarks versionDate="2017-05-11" xml:lang="en">
        <!-- ... -->
      </remarks>
    </attDef>
  </attList>

As you point out, they don’t wind up exclusive in the output RELAX NG. The bozo who complained about org="choice" not working in Stylesheets issue 144 said it works on elements, but not in classes. This is solid evidence it does not work in some cases in an element, too. In which case, I think the priority of that ticket should be shifted much higher.
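
For comparison, here is a hand-written sketch (not output of the Stylesheets) of how a RELAX NG schema in XML syntax could make the two attributes mutually exclusive. The element's other attributes are left out, and the XSD datatypes stand in for teidata.namespace and teidata.namespaceOrName as an assumption:

```xml
<element name="anyElement" ns="http://www.tei-c.org/ns/1.0"
         xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <!-- Either attribute may appear, or neither, but never both -->
  <optional>
    <choice>
      <attribute name="require">
        <list>
          <oneOrMore>
            <!-- Stand-in for teidata.namespace (assumption) -->
            <data type="anyURI"/>
          </oneOrMore>
        </list>
      </attribute>
      <attribute name="except">
        <list>
          <oneOrMore>
            <!-- Stand-in for teidata.namespaceOrName (assumption) -->
            <choice>
              <data type="anyURI"/>
              <data type="Name"/>
            </choice>
          </oneOrMore>
        </list>
      </attribute>
    </choice>
  </optional>
  <empty/>
</element>
```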

ebeshero commented 1 year ago

@HelenaSabel I'm assigning this one to both of us depending on which one is busier. (I do have more open tickets, but would be happy to help out with this on a call if atop group wants this sooner rather than later!)

sydb commented 10 months ago

I submit there are only four possible solutions (when the same namespace appears on @except and @require of a given <anyElement>):

  1. @require wins, the namespace is one of those required
  2. @except wins, elements from that namespace are not allowed
  3. throw an error
  4. <anyElement> represents no element (roughly equivalent to <empty>)

Council subgroup (HC, SB, EBB, MS, & NC) thinks throwing an error (3) is probably the kindest thing to do.

Made GREEN for @ebeshero to write the prose that says “don’t do this, it is an error” and @HelenaSabel and the ATOPpers to implement in ATOP and raise a Stylesheets ticket.