TEI4HTR / page2tei

A repository illustrating the transformation of a PAGE XML file into XML-TEI, resulting from experiments carried out for the LECTAUREP project.
Creative Commons Attribution 4.0 International

Tagging image metadata inside a facsimile element #9

Open HugoSchtr opened 2 years ago

HugoSchtr commented 2 years ago

Image metadata is currently tagged within the <sourceDoc> element with <graphic>.

<sourceDoc>
      <graphic url="FRAN_0025_3056_L-0.jpg" width="2894px" height="4393px"/>
      <surfaceGrp>
         <surface xml:id="eSc_textblock_afbab800"
                  type="structure_{type:col_1;}"
                  points="421,615 421,2236 465,2211 465,2266 421,2269 425,2449 410,4148 362,4213 205,4228 234,615">
            <zone xml:id="eSc_line_86b00a8e"
                  type="mask"
                  points="285,838 293,812 322,798 380,801 377,863 289,874">
               <path type="baseline" points="289,841 389,845"/>
               <line>198</line>
            </zone>
            ...

Instead, for the sake of clarity, image metadata could be tagged inside the <facsimile> element:

<facsimile>
      <graphic url="FRAN_0025_3056_L-0.jpg" width="2894px" height="4393px" xml:id="FRAN_0025_3056_L-0"/>
</facsimile>
<sourceDoc>
      <surfaceGrp facs="#FRAN_0025_3056_L-0">
         <surface xml:id="eSc_textblock_afbab800"
                  type="structure_{type:col_1;}"
                  points="421,615 421,2236 465,2211 465,2266 421,2269 425,2449 410,4148 362,4213 205,4228 234,615">
            <zone xml:id="eSc_line_86b00a8e"
                  type="mask"
                  points="285,838 293,812 322,798 380,801 377,863 289,874">
               <path type="baseline" points="289,841 389,845"/>
               <line>198</line>
            </zone>
            ...

Image metadata and transcription data would then be kept apart, each in its own element. With matching xml:id and facs attributes, multiple images could be encoded in a single TEI file.
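
To make the multi-image case concrete, here is a minimal sketch in Python (standard library only) of how the facs pointers could be resolved back to their <graphic> elements. The second image, FRAN_0025_3056_L-1, and its dimensions are hypothetical and only there to show two pages in one file. The function name is also an assumption, not part of page2tei.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# ElementTree expands the predefined xml: prefix to this namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# The second <graphic> (…L-1) is hypothetical, added for illustration only.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <facsimile>
    <graphic url="FRAN_0025_3056_L-0.jpg" width="2894px" height="4393px"
             xml:id="FRAN_0025_3056_L-0"/>
    <graphic url="FRAN_0025_3056_L-1.jpg" width="2894px" height="4393px"
             xml:id="FRAN_0025_3056_L-1"/>
  </facsimile>
  <sourceDoc>
    <surfaceGrp facs="#FRAN_0025_3056_L-0"/>
    <surfaceGrp facs="#FRAN_0025_3056_L-1"/>
  </sourceDoc>
</TEI>"""


def images_by_surface_group(tei_xml):
    """Return the image URL each <surfaceGrp> points to via @facs, in document order.

    A <surfaceGrp> whose @facs is missing or does not match any
    <graphic> xml:id yields None.
    """
    root = ET.fromstring(tei_xml)
    # Index every <graphic> by its xml:id.
    graphics = {g.get(XML_ID): g for g in root.iter(f"{{{TEI_NS}}}graphic")}
    resolved = []
    for grp in root.iter(f"{{{TEI_NS}}}surfaceGrp"):
        # @facs holds a local pointer such as "#FRAN_0025_3056_L-0".
        target = grp.get("facs", "").lstrip("#")
        graphic = graphics.get(target)
        resolved.append(graphic.get("url") if graphic is not None else None)
    return resolved


print(images_by_surface_group(SAMPLE))
```

This sketch assumes each facs value is a single local pointer. A facs attribute holding several space-separated pointers would need to be split first.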

gabays commented 2 years ago

I would keep <graphic> in <sourceDoc>, because in some cases you need to use @source. The original image, whether it is referenced with <graphic> or with @source, would then always be in the same element, <sourceDoc>, rather than sometimes in <sourceDoc> and sometimes in <facsimile>. No? Cf.

<surfaceGrp xml:id="Epithalame1687_0004" type="page">
   <surface xml:id="Epithalame1687_0004_BT5_1"
      source="https://gallica.bnf.fr/iiif/ark:/12148/bpt6k57078011/f0004/330,1369,198,187/full/0/native"
      corresp="#BT5" ana="#decoNote"/>
      <zone xml:id="Epithalame1687_0004_BT2_2_LT1_1" corresp="#LT1"
         points="404 953 404 897 427 878 …"
         source="https://gallica.bnf.fr/iiif/ark:/12148/bpt6k57078011/f0004/402,878,1321,150/full/0/native">
         <path xml:id="Epithalame1687_0004_BT2_2_LT1_1_1" points="406 973 1725 978"/>
         <line>du texte</line>
      </zone>
</surfaceGrp>