TEIC / TEI

The Text Encoding Initiative Guidelines
https://www.tei-c.org

native IDREF vs. URI matching regex "\S+" #2296

Closed mathias-goebel closed 2 years ago

mathias-goebel commented 2 years ago

I recently upgraded from TEI 4.3.0 to 4.4.0. I suspect a change related to the @xml:id data type, as I observe the following. Oxygen conveniently provides a list of already present @xml:id values when adding an IDREF-typed attribute. This helpful list disappeared with version 4.4.0. I can reproduce the issue by taking a sample document that contains something like tei:handNote/@xml:id within the tei:teiHeader and adding a @scribeRef somewhere within tei:text. With version 4.3.0 (<?xml-model href="https://tei-c.org/Vault/P5/4.3.0/xml/tei/custom/schema/relaxng/tei_all.rng"?>) the list is provided; when pointing to 4.4.0 it is missing.

mathias-goebel commented 2 years ago

Because I have had no response here so far, I ran another test, just to make sure the issue is reproducible. The newly appearing error message, which likely prevents Oxygen from doing its regular IDREF parsing, is: value of attribute "hand" is invalid; must be a URI matching the regular expression "\S+".

That message appears for this document, where version 4.4.0 is used and no auto-completion list is shown in Oxygen:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="https://tei-c.org/Vault/P5/4.4.0/xml/tei/custom/schema/relaxng/tei_all.rng" type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>test file</title>
        <author xml:id="mg">mathias goebel</author>
      </titleStmt>
      <publicationStmt>
        <ab>test file for schema issue <ref target="https://github.com/TEIC/TEI/issues/2296">#2296</ref></ab>
      </publicationStmt>
      <sourceDesc>
        <ab>born digital</ab>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p hand="#mg">
        Test.
      </p>
      <p hand="">
        test 2
      </p>
    </body>
  </text>
</TEI>

Switching back to 4.3.0, the error disappears and the list becomes available again. If I had to choose between the two, I would vote for the auto-completion feature.
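
For context, the wording of the error message is what a RELAX NG validator typically reports when a pattern facet on an xsd:anyURI datatype is not satisfied. The fragment below is only a sketch of what the relevant definition in the 4.4.0 schema presumably looks like; the name teidata.pointer and the exact shape are assumptions and were not copied from tei_all.rng.

```xml
<!-- Assumed shape of the 4.4.0 definition; not copied from tei_all.rng -->
<define name="teidata.pointer"
        xmlns="http://relaxng.org/ns/structure/1.0"
        datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <data type="anyURI">
    <!-- The pattern facet rejects empty values and values containing whitespace -->
    <param name="pattern">\S+</param>
  </data>
</define>
```

Under that assumption, the 4.3.0 definition would be the same data element without the param child, i.e. a plain anyURI.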

Suggestion: let Schematron take care of the additional patterns.
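
A minimal sketch of what such a Schematron rule might look like, assuming an XSLT 2.0 query binding. The two attributes in the context are only examples of attributes that take a single pointer; a complete rule would have to list every such attribute, and that list is not attempted here.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
  <sch:pattern>
    <!-- Illustrative context only: two of the single-pointer attributes -->
    <sch:rule context="tei:*/@hand | tei:*/@sameAs">
      <!-- Anchors are needed here because XPath matches() is not implicitly anchored -->
      <sch:assert test="matches(., '^\S+$')">
        The value of @<sch:name/> must be a single, non-empty URI without whitespace.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

With a rule like this, the RELAX NG datatype could stay a plain anyURI while the whitespace check moves to the Schematron layer.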

sydb commented 2 years ago

I have duplicated the problem. Interesting point.

Never came up (for me) in testing, because I typically leave the “Check ID/IDREF” box unchecked because of the problem (which I consider an error in oXygen, but I do not think SyncRO Soft agrees) described in Appendix 1. Technical Problem: Conflicting ID-Types.

This ticket should be reviewed and assigned on Fri evening CEST.

P.S. Love the Minions fellow as your pic.

peterstadler commented 2 years ago

I hadn't noticed this ticket until @sydb responded, but I came across that issue, too. I posted to https://www.oxygenxml.com/forum/tei/topic24291.html and again the Oxygen people were very helpful and provided some tips.

mathias-goebel commented 2 years ago

That does not make it easier for our use case, where we are not using the framework (we just point to our customized RNG based on the latest version of TEI; I suspect the framework is not involved here) and where updating Oxygen is not possible for the project. We either have to stick to TEI 4.3.0 or ask for a fix by TEI, maybe a revert of the pattern constraint. But I agree that being backwards-compatible with Oxygen might not get the highest priority. I am looking forward to the outcome of your discussion.

sydb commented 2 years ago

Some thoughts.

  1. It is difficult to do this in Schematron, but more importantly, doing so makes a constraint that is very hard to maintain. That is because there are 62 different places xsd:anyURI is used singularly and thus would need to be included in the @context. And whenever that list of places changes, the @context has to be changed to match.
  2. I have some faith that the oXygen folks will fix this on their end, or that we can leverage the generic content completion feature to make it work.
  3. To be honest, @mathias-goebel, you are not going to get much sympathy from me for the desire to stick to tei_all 4.3.0 or 4.4.0 (or any other particular version). I think every project should create a customized version of TEI for themselves. And reverting to the “spaces allowed in a URI, but at least oXygen will create a pop-up” behavior is quite easy. Just include the following snippet in your customization ODD.
    <dataSpec module="tei" ident="teidata.pointer" mode="change">
      <content>
        <dataRef name="anyURI"/>
      </content>
      <remarks>
        <p>Unlike vanilla TEI, we allow spaces in a URI (and thus
        give up the capability to differentiate a single URI with
        a space from 2 URIs) in order to get ID pop-ups to work in
        oXygen.</p>
      </remarks>
    </dataSpec>
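
For readers who have not used dataSpec before, here is a rough sketch of where the snippet above would sit in a customization ODD. The schemaSpec ident and the module selection are hypothetical placeholders; only the dataSpec itself comes from the comment above.

```xml
<schemaSpec ident="tei_myproject" start="TEI">
  <!-- Hypothetical module selection; replace with the modules your project uses -->
  <moduleRef key="tei"/>
  <moduleRef key="core"/>
  <moduleRef key="header"/>
  <moduleRef key="textstructure"/>
  <!-- Override of the pointer datatype: plain anyURI, no pattern restriction -->
  <dataSpec module="tei" ident="teidata.pointer" mode="change">
    <content>
      <dataRef name="anyURI"/>
    </content>
  </dataSpec>
</schemaSpec>
```

The schemaSpec goes inside the ODD document itself, and the RELAX NG schema then has to be regenerated from the ODD for the change to take effect.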

mathias-goebel commented 2 years ago

Wow, thank you! I was not aware of dataSpec at all. Never had the desire to use it! I will include it in our ODD. Usage of tei_all here was for demo purposes only.

raducoravu commented 2 years ago

Just for completeness, someone also reported this issue on the Oxygen forum: https://www.oxygenxml.com/forum/viewtopic.php?p=65900#p65900. George Bina gave a possible workaround there, and we'll also try to improve things on our side so that ID/IDREF content completion starts working again in Oxygen 25.0 (Autumn this year) with the TEI 4.4.0 schemas, but I'm afraid older Oxygen versions will continue to have this problem.

sydb commented 2 years ago

Excellent, thank you @raducoravu, and good luck @mathias-goebel! Closing. If there are still problems, feel free to re-open or start a new one. :smile:

raducoravu commented 1 year ago

We released Oxygen 25 which should present content completion again for these constructs without the need to make any changes to the schemas.