TEIC / TEI

The Text Encoding Initiative Guidelines
https://www.tei-c.org

anyURI restrictions #2347

Closed rettinghaus closed 8 months ago

rettinghaus commented 1 year ago

In teidata.namespace the restriction for anyURI is "\S*" while in teidata.pointer it's "\S+". Is there a reason for this?

The W3C Recommendation states:

The empty string, though it is a legal URI reference, cannot be used as a namespace name.

So I guess the one for teidata.namespace is faulty?
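
To make the difference between the two restrictions concrete, here is a minimal sketch in Python. It assumes that the patterns are applied to the whole value (XSD pattern facets are implicitly anchored, which `re.fullmatch` imitates) and that Python's `\S` is close enough to the XSD `\S` for these sample values. The names `NAMESPACE_PATTERN`, `POINTER_PATTERN`, and `accepts` are made up for illustration and are not part of the TEI sources.

```python
import re

# Patterns as quoted in this issue; the constant names are hypothetical.
NAMESPACE_PATTERN = re.compile(r"\S*")  # restriction reported for teidata.namespace
POINTER_PATTERN = re.compile(r"\S+")    # restriction reported for teidata.pointer


def accepts(pattern, value):
    """Return True if the whole value matches the pattern (anchored match)."""
    return pattern.fullmatch(value) is not None


if __name__ == "__main__":
    # Compare how each restriction treats an empty value, a URI, and a value with a space.
    for value in ["", "http://www.tei-c.org/ns/1.0", "has space"]:
        print(repr(value),
              "namespace:", accepts(NAMESPACE_PATTERN, value),
              "pointer:", accepts(POINTER_PATTERN, value))
```

The only value on which the two restrictions disagree is the empty string: "\S*" lets it through, "\S+" does not.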

sydb commented 1 year ago

Ummm … no. Or at least, it is not that simple. teidata.namespace is used as the datatype of (among other things) @ns of <attDef> and @name of <namespace>. In both those cases an empty string needs to be allowed. (Because either the attribute being defined is in no namespace, or the elements whose usage is being described are in no namespace.)

OTOH, it might make more sense to make teidata.namespace use the restriction "\S+" and add | "" to those two particular cases.

sydb commented 1 year ago

I have convinced myself that the expanded version of my musing above is the way to go. That is: teidata.namespace is, as @rettinghaus suggests, in some sense “faulty” and should be restricted to "\S+"; att.namespaceable and @name of <namespace> should use <datatype minOccurs='0' maxOccurs='1'> around the data reference; and various elements like <attDef> should be members of att.namespaceable. So much so that I have implemented it locally.
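
The intended effect of that change can be sketched roughly as follows, in Python rather than ODD. This is only a model under stated assumptions: it assumes that wrapping the data reference in <datatype minOccurs='0' maxOccurs='1'> means "a whitespace-separated list of zero or one values", and the function names `is_namespace_name` and `is_optional_namespace` are hypothetical, not anything in the Guidelines or Stylesheets.

```python
import re

# Proposed stricter restriction for teidata.namespace: at least one non-space character.
NON_EMPTY_URI = re.compile(r"\S+")


def is_namespace_name(value):
    """Model of the stricter datatype: the empty string is rejected."""
    return NON_EMPTY_URI.fullmatch(value) is not None


def is_optional_namespace(value):
    """Model of an attribute whose datatype may occur zero or one times.

    Assumption: the value is treated as a whitespace-separated list, so an
    empty value is a list of zero namespace names and is therefore allowed.
    """
    tokens = value.split()
    return len(tokens) <= 1 and all(is_namespace_name(token) for token in tokens)


if __name__ == "__main__":
    # Show both checks side by side for an empty value, a URI, and two tokens.
    for value in ["", "http://www.tei-c.org/ns/1.0", "two tokens"]:
        print(repr(value), is_namespace_name(value), is_optional_namespace(value))
```

Under this model the datatype itself never accepts an empty string, while the specific attributes that need to express "no namespace" still can.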

HOWEVER, the results are broken because of S-557, so although I have put the fix in a branch, I have not opened a PR, as it should not be merged yet.