Closed nivaca closed 1 week ago
I can confirm this: using your ODD I get the same problem. Also, the generated schema does not specify a root element, which seems odd (ha ha)
Something is definitely broken. Produce a RELAX NG schema from just about any ODD (I tried tei_bare) and it contains duplicate declarations for @calendar and no root element. Is there no regression testing for stylesheet releases?
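For anyone who wants to check a generated schema for these two symptoms without opening it by hand, here is a minimal sketch. It assumes the duplicates surface either as several `define` patterns sharing a name without a `combine` attribute (which RELAX NG does not allow) or as the same `ref` repeated directly inside one `define`. The function name `check_rng` and the shape of the report are my own; this is a heuristic, not a validator.

```python
import xml.etree.ElementTree as ET
from collections import Counter

RNG = "{http://relaxng.org/ns/structure/1.0}"


def check_rng(rng_xml):
    """Report a missing <start>, clashing <define> names, and repeated <ref>s."""
    root = ET.fromstring(rng_xml)

    # A grammar needs a <start> pattern to say which element(s) may be the root.
    has_start = root.find(f".//{RNG}start") is not None

    # At most one <define> per name may lack a combine attribute.
    uncombined = Counter(
        d.get("name") for d in root.iter(f"{RNG}define") if d.get("combine") is None
    )
    duplicate_defines = sorted(name for name, n in uncombined.items() if n > 1)

    # Heuristic: the same <ref> listed twice as a direct child of one <define>.
    repeated_refs = []
    for d in root.iter(f"{RNG}define"):
        refs = Counter(r.get("name") for r in d.findall(f"{RNG}ref"))
        repeated_refs += [(d.get("name"), name) for name, n in refs.items() if n > 1]

    return {
        "has_start": has_start,
        "duplicate_defines": duplicate_defines,
        "repeated_refs": repeated_refs,
    }
```

To run it on a file, pass the raw bytes, e.g. `check_rng(open("schema.rng", "rb").read())` (the file name is a placeholder). A repeated `ref` is not necessarily an error in every grammar, so treat that part of the report as a pointer for where to look.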
We also ran into this unexpected proliferation of RNG contents and had a closer look (on the surface, not on the level of the generating stylesheets).
Here is an MRE based on smtrad.xml:
ODD
The three referenced modules in conjunction with the two element specifications result in the generation of duplicate attribute references and attributes in the RNG:
rng
Furthermore, the intermediary .odd.processedodd is invalid due to misplaced and spurious elements. This suggests that the cause of the faulty RNG lies in the first part of the ANT task (odd2odd.xsl).
.odd.processedodd
[Listing not recoverable: the XML markup was lost in extraction. What survives is the multilingual documentation text of an element spec (English, French, Spanish, Japanese, German), which appears twice in the listing, with a ❌ marker inside the second copy.]
Merging the two attDef elements in a single elementSpec/attList results in a valid intermediary ODD and RNG:
<elementSpec ident="foreign" module="core" mode="change">
  <attList>
    <attDef ident="xml:lang" usage="req" mode="change">
      <valList type="closed" mode="add">
        <valItem ident="ang">
          <gloss>Anglo-Saxon</gloss>
          <desc>Anglo-Saxon</desc>
        </valItem>
      </valList>
    </attDef>
    <attDef ident="ana" usage="opt" mode="change">
      <valList type="closed" mode="add">
        <valItem ident="lexeme">
          <gloss>lexeme</gloss>
          <desc>Treats the quote as a lexeme</desc>
        </valItem>
      </valList>
    </attDef>
  </attList>
</elementSpec>
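Until the stylesheets are fixed, the same merge could be applied mechanically before running the conversion. The following is only a sketch of that workaround, assuming the ODD is in the TEI namespace and that the repeated elementSpecs share the same mode and differ only in their attList (as in the MRE). The function name is hypothetical, and anything other than attDef children in the later elementSpec is dropped, so it is not a general-purpose merger.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"


def merge_duplicate_element_specs(schema_spec):
    """Fold later elementSpecs with a repeated @ident into the first one, in place."""
    first_seen = {}
    for spec in list(schema_spec.findall(f"{TEI}elementSpec")):
        keeper = first_seen.setdefault(spec.get("ident"), spec)
        if keeper is spec:
            continue  # first elementSpec with this ident: keep it as it is

        # Reuse the keeper's attList, creating one if it has none yet.
        target = keeper.find(f"{TEI}attList")
        if target is None:
            target = ET.SubElement(keeper, f"{TEI}attList")

        # Move every attDef across, then drop the now-redundant elementSpec.
        # ASSUMPTION: the duplicate carries nothing else worth keeping.
        for att_def in spec.findall(f"{TEI}attList/{TEI}attDef"):
            target.append(att_def)
        schema_spec.remove(spec)
```

Applied to the schemaSpec of the MRE, this should leave a single elementSpec for foreign whose one attList holds both the xml:lang and the ana attDef, i.e. the hand-merged form shown above.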
I'd suggest having a look at (recent changes of) the rules/scopes in odd2odd.xsl.
Thanks @nivaca for this detailed report! Really helpful for tracking down this bug.
I've compared the results from 7.55, 7.56, and 7.56a and can confirm that this is an issue introduced between 7.55 and 7.56 with respect to elementSpecs with the same @ident (see also #645). Per @raffazizzi's comment there, "this is a bug and not intentional" (https://github.com/TEIC/Stylesheets/issues/645#issuecomment-1840947279). @sydb also notes the fix in #645 "does not fix the duplicate attrs defined in Relax NG problem, which appears to have been introduced quite recently, and should be a separate ticket."
I think this is high priority and should be fixed ASAP. From my testing, it appears that the issue emerged from the changes to the tei:uniqueName function: reverting that function to its previous state in 7.55 fixes the issue.
Running the current dev Stylesheets yields the following schemaSpec as the result of pass0 (i.e., $ODD); here the two duplicate elementSpecs are retained:
<schemaSpec xmlns:teix="http://www.tei-c.org/ns/Examples" ident="SMTrad" source="https://www.tei-c.org/Vault/P5/current/xml/tei/odd/p5subset.xml" defaultExceptions="http://www.tei-c.org/ns/1.0 teix:egXML">
  <moduleRef key="core"/>
  <moduleRef key="tei"/>
  <moduleRef key="header"/>
  <elementSpec ident="foreign" module="core" mode="change">
    <attList>
      <attDef ident="xml:lang" usage="req" mode="change">
        <valList type="closed" mode="add">
          <valItem ident="ang">
            <gloss>Anglo-Saxon</gloss>
            <desc>Anglo-Saxon</desc>
          </valItem>
        </valList>
      </attDef>
    </attList>
  </elementSpec>
  <elementSpec ident="foreign" module="core" mode="change">
    <attList>
      <attDef ident="ana" usage="opt" mode="change">
        <valList type="closed" mode="add">
          <valItem ident="lexeme">
            <gloss>lexeme</gloss>
            <desc>Treats the quote as a lexeme</desc>
          </valItem>
        </valList>
      </attDef>
    </attList>
  </elementSpec>
</schemaSpec>
Running the current stylesheets with the 7.55 version of tei:uniqueName merges the two elementSpecs into a single one with two attLists instead:
<schemaSpec xmlns:teix="http://www.tei-c.org/ns/Examples" ident="SMTrad" source="https://www.tei-c.org/Vault/P5/current/xml/tei/odd/p5subset.xml" defaultExceptions="http://www.tei-c.org/ns/1.0 teix:egXML">
  <moduleRef key="core"/>
  <moduleRef key="tei"/>
  <moduleRef key="header"/>
  <elementSpec ident="foreign" module="core" mode="change">
    <attList>
      <attDef ident="xml:lang" usage="req" mode="change">
        <valList type="closed" mode="add">
          <valItem ident="ang">
            <gloss>Anglo-Saxon</gloss>
            <desc>Anglo-Saxon</desc>
          </valItem>
        </valList>
      </attDef>
    </attList>
    <attList>
      <attDef ident="ana" usage="opt" mode="change">
        <valList type="closed" mode="add">
          <valItem ident="lexeme">
            <gloss>lexeme</gloss>
            <desc>Treats the quote as a lexeme</desc>
          </valItem>
        </valList>
      </attDef>
    </attList>
  </elementSpec>
</schemaSpec>
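To compare pass0 outputs like the two above without reading them line by line, a small summary per @ident is enough. This is a rough sketch, assuming the schemaSpec is in the TEI namespace; `summarize_element_specs` is a name I made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI = "{http://www.tei-c.org/ns/1.0}"


def summarize_element_specs(schema_spec):
    """Map each @ident to a list with one entry per elementSpec: its attList count."""
    summary = defaultdict(list)
    for spec in schema_spec.iter(f"{TEI}elementSpec"):
        summary[spec.get("ident")].append(len(spec.findall(f"{TEI}attList")))
    return dict(summary)
```

For the current dev output this gives `{"foreign": [1, 1]}` (two elementSpecs, one attList each); for the output with the 7.55 tei:uniqueName it gives `{"foreign": [2]}` (one elementSpec, two attLists).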
My best guess so far is that the problem is in the initial processing of *Spec/@mode='change':
https://github.com/TEIC/Stylesheets/blob/5a48e1595f0bf7d35c1dcd50993724cd346d4cd9/odds/odd2odd.xsl#L510-L532
In particular, I think it's these lines:
Whereas this could be true with the old uniqueName function (since TEI elements were identified by their local name, in this case "foreign"), idents and uniqueNames are now going to be different. I'm working on this in branch iss678_duplicateIdents.
I've made this change in commit https://github.com/TEIC/Stylesheets/commit/8aefd67c482fea69e6bb8fcd5966f4329cd5a1b2, which does produce the correct processed.odd and RNG from the sample ODD. However, tests are failing, so that requires further investigation.
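The mismatch between idents and unique names described above can be shown with a toy comparison. This is purely illustrative and is not the stylesheet code: the only thing taken from the thread is that the old function keyed TEI elements by local name. The namespace-qualified form used for the newer behaviour is an assumption, and all three function names are hypothetical.

```python
TEI_NS = "http://www.tei-c.org/ns/1.0"


def unique_name_local(ident, ns=TEI_NS):
    """Older behaviour as described in the thread: TEI elements keyed by local name."""
    return ident


def unique_name_qualified(ident, ns=TEI_NS):
    """HYPOTHETICAL newer behaviour: the key also carries the namespace."""
    return f"{{{ns}}}{ident}"


def ident_matches_key(ident, unique_name):
    """Stand-in for a test that compares a raw @ident with a computed unique name."""
    return ident == unique_name(ident)
```

With `unique_name_local`, `ident_matches_key("foreign", ...)` holds, so a second elementSpec for foreign can be recognised as the same element. With any key that is no longer the bare ident, the same test fails, which would be consistent with both elementSpecs being retained.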
.rng
.processed.odd
[Listing not recoverable: the XML markup was lost in extraction. Only the multilingual documentation text of an element spec (English, French, Spanish, Japanese, German) survives, here appearing once.]
I can confirm that the fix works in my ODD files. (I couldn't run the tests, however.) Thanks.
This is presumably not unrelated to #680
When running teitorng to convert an ODD schema to an RNG one using the latest version (7.56.0), I get loads of "duplicate attribute" errors. I am running it standalone (from the bin/ directory in the latest cloned repo) and from the latest Oxygen XML Editor conversion utility (which I was told also uses v7.56.0), with the same issues.
I do not have this problem with v7.55.0 (run in either way).
My source ODD can be found here: https://github.com/nivaca/smtrad-schema/blob/main/smtrad.xml in case it is useful.