dracor-org / dracor-schema

ODD and schemas for dracor.org files
https://dracor.org/doc/odd

link tei:castItem and tei:person #34

Open ingoboerner opened 3 years ago

ingoboerner commented 3 years ago

Currently, in the DraCor in-house corpora, each tei:castItem in the tei:castList contains the name of a character and his/her roleDescription. In addition, to link speech acts to speakers/characters, the files contain a tei:listPerson with tei:person elements that are identified by an @xml:id. But a castItem and its corresponding person are not explicitly linked (via an attribute). If one wants to get the "label" of a character as it appears in the text, this can only be done via complicated string comparisons and fuzzy matching (?), because the castItem elements and the corresponding tei:person elements are not in the same order, so the link between the two is not even implicit. Maybe it should be made explicit (@corresp) to provide an easier way to retrieve the information from the tei:castItem via the @xml:id of the tei:person in the tei:particDesc.

Example of current encoding: goethe-egmont

<div type="Dramatis_Personae">
  <castList>
    <head>Personen.</head>
    <!-- ... -->
    <castItem>Herzog von Alba</castItem>
    <!-- ... -->
  </castList>
</div>

and

<particDesc>
  <listPerson>
    <!-- ... -->
    <person xml:id="alba" sex="MALE">
      <persName>Alba</persName>
    </person>
    <!-- ... -->
  </listPerson>
</particDesc>
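
For comparison, here is a sketch of what the explicit link could look like. It assumes that @corresp is allowed on tei:castItem and that its value is a pointer ("#" plus the @xml:id of the person); that is the proposal discussed here, not the current encoding of goethe-egmont.

```xml
<!-- Hypothetical encoding: @corresp on castItem is the proposal, not current practice -->
<div type="Dramatis_Personae">
  <castList>
    <head>Personen.</head>
    <!-- ... -->
    <!-- "#alba" points to the person with xml:id="alba" in the particDesc -->
    <castItem corresp="#alba">Herzog von Alba</castItem>
    <!-- ... -->
  </castList>
</div>

<!-- ... -->

<particDesc>
  <listPerson>
    <!-- ... -->
    <person xml:id="alba" sex="MALE">
      <persName>Alba</persName>
    </person>
    <!-- ... -->
  </listPerson>
</particDesc>
```

With such a pointer in place, the order of the castItem and person elements no longer matters, and no string matching between "Herzog von Alba" and "Alba" is needed.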

ingoboerner commented 3 weeks ago

I just saw that there is already a hint in <remarks>, at least for <castGroup>, that suggests using the attribute @corresp. I am not sure whether this encoding is already supported or whether we agreed to do it this way. Maybe @cmil you can help me out with this:

<attDef ident="corresp" mode="change">
  <datatype>
    <dataRef key="teidata.pointer"/>
  </datatype>
  <remarks>
    <p>Used to link a character in the dramatis personae to the
      corresponding <!-- person/group? --> element in the
      <gi>particDesc</gi></p>
  </remarks>
</attDef>

Do you remember adding this, @lehkost? It could have been me, but I forgot ...

We could include it in #67.

lehkost commented 3 weeks ago

I remember that we talked about this, and @corresp is definitely the way to go. I would not use this additional encoding for the corpora that I'm working on right now, as there is no direct use for it (the info in particDesc is usually enough to do network analysis, etc.). But we should allow (and even recommend) it to establish a semantic bridge between particDesc and castList.
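
To illustrate that bridge, here is a minimal XSLT 1.0 sketch that looks up the castItem label for each person. It assumes the files are in the TEI namespace, that each castItem carries a single pointer in @corresp of the form "#" plus the person's @xml:id, and that no such stylesheet exists in the DraCor tooling yet; it is only an illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0">
  <xsl:output method="text"/>

  <!-- For each person in the particDesc, print its id and the label
       of the castItem that points to it (assumed: one pointer per @corresp) -->
  <xsl:template match="/">
    <xsl:for-each select="//tei:particDesc/tei:listPerson/tei:person">
      <!-- Build the pointer value expected in castItem/@corresp -->
      <xsl:variable name="ref" select="concat('#', @xml:id)"/>
      <xsl:value-of select="@xml:id"/>
      <xsl:text>&#9;</xsl:text>
      <!-- Empty if no castItem is linked to this person -->
      <xsl:value-of select="normalize-space(//tei:castItem[@corresp = $ref])"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The lookup is a plain attribute comparison, so it replaces the string comparison and fuzzy matching described at the top of this issue. Persons without a linked castItem simply get an empty label.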