TEIC / TEI

The Text Encoding Initiative Guidelines
https://www.tei-c.org

phrase-level TEI elements should be allowed inside <sch:assert> and <sch:report> #1582

Open sydb opened 7 years ago

sydb commented 7 years ago

We are completely inconsistent about how we refer to elements, attribute names, and attribute values inside the prose of Schematron error messages and warnings. Seems to me we already have elements for those very items (<gi>, <att>, and <val>), so we should be using them. Then the ODD processor can decide whether an <att> should be preceded by an @ or followed by an = (or neither or both).[1]

HOWEVER, our content model of <sch:assert> and <sch:report> (and of <sch:diagnostic>, but we’re not currently using that) does not permit TEI elements within. This is surprising: AFAIK Schematron allows elements from foreign namespaces. Thus I consider this a bug we simply should fix.

[1] If it really wanted to, the ODD processor could leave those elements as-is, and expect that the Schematron processor would handle them appropriately. That’s probably the most appropriate way to handle this, and also the least likely to happen. :-)
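A minimal sketch of what the proposal might look like in ODD source, assuming the content model were relaxed to permit TEI phrase-level elements. The ident value is made up and the rule is only illustrative; as described above, this would not currently be valid against tei_odds.

```xml
<!-- Hypothetical constraintSpec; ident and rule are illustrative only -->
<constraintSpec ident="example-target-cRef" scheme="schematron"
                xmlns="http://www.tei-c.org/ns/1.0"
                xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <constraint>
    <sch:rule context="tei:ptr">
      <!-- <att> and <gi> are TEI elements inside the Schematron message -->
      <sch:report test="@target and @cRef">Only one of the attributes
        <att>target</att> and <att>cRef</att> may be supplied on
        <gi>ptr</gi>.</sch:report>
    </sch:rule>
  </constraint>
</constraintSpec>
```

With markup like this, the decision about whether <att>target</att> is rendered as "@target", "target=", or plain "target" would rest with the ODD processor rather than with whoever wrote the message.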

lb42 commented 7 years ago

Schematron won't permit elements from foreign namespaces unless those namespaces are declared in the Schematron schema, obvs. So this requires a bit more tweaking than simply allowing you to put <tei:gi> into your sch:asserts [1]. And why do you say "our content model of"? If "we" is the TEI, I'm not aware that we define such a beast anywhere.

[1] Or so I infer from https://www.xml.com/pub/a/2003/11/12/schematron.html#Namespaces_and_Schematron
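For illustration, here is a sketch of a standalone Schematron schema carrying both kinds of declaration that could be meant here. Which one the comment has in mind is an assumption on my part: the xmlns:tei binding is what makes <tei:gi> well-formed XML, while <sch:ns> binds the prefix for use in XPath expressions such as the rule context. The rule is illustrative only.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            xmlns:tei="http://www.tei-c.org/ns/1.0"
            queryBinding="xslt2">
  <!-- Binds the tei: prefix for XPath in @context and @test -->
  <sch:ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
  <sch:pattern>
    <sch:rule context="tei:ptr">
      <!-- Foreign (TEI) elements mixed into the message text -->
      <sch:report test="@target and @cRef">Only one of
        <tei:att>target</tei:att> and <tei:att>cRef</tei:att> may be
        supplied on <tei:gi>ptr</tei:gi>.</sch:report>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```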

hcayless commented 7 years ago

Action on @sydb to provide an example.

emylonas commented 7 years ago

Conference call discussion indicates that we should decide on a Schematron comment style guide: do we want to encourage the use of markup in comments, or do we prefer simple text? Once we decide this, we can either return to this ticket or suggest that the BPTL Schematron tickets should change how they are written. Please discuss.

sydb commented 7 years ago

Sorry, @lb42, I should have been more specific. The content models in question are those of <sch:assert> and <sch:report> in tei_odds. I have to admit, I do not entirely understand what’s going on. In tei_odds we re-define macro.schemaPattern (to be ( text | ( rng:pattern | rng:define )+ ), anySchematron), but I don’t think the definition of it or of anySchematron is referenced anywhere in tei_odds.rnc. Instead, we have <constraint> defined using the pattern anyElement-constraint, which explicitly disallows elements from the TEI namespace. (Twice: once in RELAX NG, and again in a Schematron constraint.)

So I think this boils down to our incorrect solution to the “conflicting ID-types for attribute "id"” problem biting us in the backside.

lb42 commented 6 years ago

The content model of <constraint> is

<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <anyElement require="http://purl.oclc.org/dsdl/schematron"/>
 </alternate>
</content>

which does indeed preclude using tei:tag etc. within it. Are you suggesting changing this to something like

<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.giLikeElementsSydWantsToUseInConstraints"/>
  <anyElement require="http://purl.oclc.org/dsdl/schematron"/>
 </alternate>
</content>

(I made up the name of the class for humorous effect, obvs)

ebeshero commented 6 years ago

Hugh's point at Council meeting 2017-10-26: The generated Schematron needs to be accounting for the TEI namespace--it's a Stylesheets issue.

Note: The model of the <constraint> element is only looking at the child, not the deep descendants within Schematron rules.

peterstadler commented 6 years ago

I was just looking for an isoschematron schema to see what's actually allowed within <sch:assert> and found http://schematron.com/wp-content/uploads/2016/12/iso-schematron.rnc_.txt. There it says:

assert =
  element assert {
    attribute test { exprValue },
    attribute flag { flagValue }?,
    attribute id { xsd:ID }?,
    attribute diagnostics { xsd:IDREFS }?,
    attribute properties { xsd:IDREFS }?,
    rich,
    linkable,
    (foreign & (text | name | value-of | emph | dir | span)*)
  }

and foreign is defined as

foreign = foreign-attributes, foreign-element*
foreign-empty = foreign-attributes

foreign-attributes = attribute * - (local:* | xml:*) { text }*
foreign-element =
  element * - sch:* {
    (attribute * { text }
     | foreign-element
     | schema
     | text)*
  }

so I think @sydb is right that nothing should stop us from adding TEI tagdocs (and other?) elements here!

sydb commented 4 years ago

Council VF2F agrees ticket is GO for testing what happens if model.phrase.xml is allowed: i.e., do Stylesheets strip element, strip tags, or allow whole thing through? If output is good, ticket is GO to add model.phrase.xml to content of <sch:assert> and <sch:report>; if not, we probably need a Stylesheets ticket.
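To make the three outcomes concrete, here is a sketch of what each might look like in the generated Schematron for a single message. These are hypothetical outputs written by hand, not results from running the Stylesheets, and the three reports share one rule only so they can be compared side by side.

```xml
<sch:rule context="tei:ptr"
          xmlns:sch="http://purl.oclc.org/dsdl/schematron"
          xmlns:tei="http://www.tei-c.org/ns/1.0">
  <!-- Assumed ODD input: Use <att>target</att> on <gi>ptr</gi>. -->

  <!-- (a) strip element: the TEI elements and their content are dropped -->
  <sch:report test="not(@target)">Use  on .</sch:report>

  <!-- (b) strip tags: the text content survives, the markup does not -->
  <sch:report test="not(@target)">Use target on ptr.</sch:report>

  <!-- (c) allow whole thing through: TEI elements are copied as-is -->
  <sch:report test="not(@target)">Use <tei:att>target</tei:att> on
    <tei:gi>ptr</tei:gi>.</sch:report>
</sch:rule>
```

On this reading, (a) would lose information from the message, (b) would be readable but lose the distinction between element and attribute names, and (c) would leave rendering to the Schematron processor.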