iljackb / Mixtepec_Mixtec

Mostly XML (TEI) markup of Mixtepec-Mixtec Language resources

Which form to annotate in standoff #87

Open iljackb opened 4 years ago

iljackb commented 4 years ago

Laurent, I seem to recall you telling me that, when annotating grammar and other content using @ana, I should not put # in front of the tag unless the features are declared in the same document. Is that correct?

This is what I had been doing:

            <spanGrp type="gram">
               <span type="pos" target="#d1e95 #d1e98" ana="#V"/>
               <span type="transitivity" target="#d1e95 #d1e98" ana="#INTRANS"/>
               <span type="person" target="#d1e95 #d1e98" ana="#2PERS"/>
               <span type="number" target="#d1e95 #d1e98" ana="#SG"/>
               <span type="register" target="#d1e95 #d1e98" ana="#INF"/>
            </spanGrp>

On a related note, I was annotating something last night and I had the urge to use the fully written values of the linguistic features (with the exception of number), e.g.:

            <spanGrp type="gram">
               <span type="pos" target="#T-orth2.83" ana="verb"/>
               <span type="transitivity" target="#T-orth2.83" ana="transitive"/>
               <span type="macrorole" target="#T-orth3.26" ana="actor"/>
               <span type="macrorole" target="#T-orth3.58" ana="undergoer"/>
               <span type="pos" target="#T-orth3.26" ana="enclitic"/>
               <span type="person" target="#T-orth3.26" ana="1st"/>
               <span type="number" target="#T-orth3.26" ana="sg"/>
               <span type="pos" target="#T-orth3.58" ana="noun"/>
               <span type="pos" target="#T-orth4.00" ana="adposition"/>
               <span type="pos" target="#T-orth4.16" ana="noun"/>
            </spanGrp>

Do you think one is better than the other? Since I can put the values as suggested values in the ODD schema, it won't necessarily take more time. The alternative would thus be:

            <spanGrp type="gram">
               <span type="pos" target="#T-orth2.83" ana="V"/>
               <span type="transitivity" target="#T-orth2.83" ana="TRANS"/>
               <span type="macrorole" target="#T-orth3.26" ana="A"/>
               <span type="macrorole" target="#T-orth3.58" ana="U"/>
               <span type="pos" target="#T-orth3.26" ana="ENCLT"/>
               <span type="person" target="#T-orth3.26" ana="1PERS"/>
               <span type="number" target="#T-orth3.26" ana="SG"/>
               <span type="pos" target="#T-orth3.58" ana="N"/>
               <span type="pos" target="#T-orth4.00" ana="ADPOS"/>
               <span type="pos" target="#T-orth4.16" ana="N"/>
            </spanGrp>
laurentromary commented 4 years ago

@ana is a teidata.pointer attribute (https://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-teidata.pointer.html) and should thus always comply with the URI syntax. I am not asking you to read the corresponding standard ;-) (https://www.ietf.org/rfc/rfc3986.txt), but search for '#' in there and see, for instance, the general syntax description in section 3 (precise explanation in section 3.5). You'll see that #xxxx marks a fragment identifier, where the corresponding context can be left implicit. Long story short: you need to put '#' everywhere.
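
For illustration only, here is a sketch of the abbreviated example from above with '#' added to every @ana value. The targets and tags are copied from that example; that each tag is declared somewhere with a matching xml:id is an assumption.

```xml
<spanGrp type="gram">
   <!-- Every @ana value is now a URI reference (fragment identifier) -->
   <span type="pos" target="#T-orth2.83" ana="#V"/>
   <span type="transitivity" target="#T-orth2.83" ana="#TRANS"/>
   <span type="macrorole" target="#T-orth3.26" ana="#A"/>
   <span type="macrorole" target="#T-orth3.58" ana="#U"/>
   <span type="pos" target="#T-orth3.26" ana="#ENCLT"/>
   <span type="person" target="#T-orth3.26" ana="#1PERS"/>
   <span type="number" target="#T-orth3.26" ana="#SG"/>
   <span type="pos" target="#T-orth3.58" ana="#N"/>
   <span type="pos" target="#T-orth4.00" ana="#ADPOS"/>
   <span type="pos" target="#T-orth4.16" ana="#N"/>
</spanGrp>
```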

iljackb commented 4 years ago

OK, got it. So should I stick with the abbreviated tags too, then?

laurentromary commented 4 years ago

Yes, but make sure that they resolve nicely (by default in the same document, unless you have an @xml:base declaration in an ancestor node).
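
A minimal sketch of what "resolving in the same document" could look like. The fLib/f/symbol declarations, their placement, and the feature values are assumptions made for illustration, not the project's actual feature inventory.

```xml
<!-- Hypothetical declarations, assumed to sit in the same document as the spanGrp -->
<fLib>
   <f name="pos" xml:id="V"><symbol value="verb"/></f>
   <f name="transitivity" xml:id="TRANS"><symbol value="transitive"/></f>
</fLib>

<spanGrp type="gram">
   <!-- "#V" points at the element whose xml:id is "V" in this document -->
   <span type="pos" target="#T-orth2.83" ana="#V"/>
   <span type="transitivity" target="#T-orth2.83" ana="#TRANS"/>
</spanGrp>
```

If an ancestor carries @xml:base, the pointers are resolved against that base rather than the current document, so the declarations would need to be reachable there instead.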