srophe / syriaca-data

Repository for Syriaca.org TEI data, used by srophe-eXist-app.

Revise encoding method for paratextual material in BHSE records #969

Open wlpotter opened 2 years ago

wlpotter commented 2 years ago

Currently, the prologue, incipit, and explicit ('paratextual materials') of BHSE records are each encoded in a tei:note. The Syriac transcription and its translations (primarily into French, with a few into English) are treated as separate tei:note elements, each with a nested tei:quote. They are related to one another only by sharing the same @type value:

<note xml:lang="syr" type="incipit" source="#bib234-1">
  <quote>ܫܢܬܐ ܗܘ̣ܬ ܕܬܪܬܝܢ ܕܪܕܘܦܝܐ: ܘܩܪܒܐ ܕܠܩܘܒܠܢ ܝܬܝܪ ܡܢ ܩܕܡܝܐ ܥܫܢ ܗܘ̣ܐ.</quote>
</note>
<note xml:lang="fr" type="incipit" source="#bib234-1">
  <quote>C’était la deuxième année de la persécution, et le combat contre nous était plus intense que le premier.</quote>
</note>

This is not ideal for cases where we might want to record paratextual materials from multiple sources (e.g., variants from differing editions). We seem to need a way to record multiple, separately sourced instances of a given type of paratext, and to let each instance have multiple associated linguistic expressions.

Perhaps here we use a single tei:note with an @xml:id and multiple nested tei:quote elements to capture the various languages:

<note type="incipit" source="#bib234-1" xml:id="incipit234-1">
  <quote xml:lang="syr">ܫܢܬܐ ܗܘ̣ܬ ܕܬܪܬܝܢ ܕܪܕܘܦܝܐ: ܘܩܪܒܐ ܕܠܩܘܒܠܢ ܝܬܝܪ ܡܢ ܩܕܡܝܐ ܥܫܢ ܗܘ̣ܐ.</quote>
  <quote xml:lang="fr">C’était la deuxième année de la persécution, et le combat contre nous était plus intense que le premier.</quote>
</note>

If we want to relate the two tei:quote elements more explicitly, we could also use tei:anchor elements, much as we did for Caesarea (cf. https://caesarea-maritima.org/testimonia/133.tei): <anchor xml:id="testimonia-133.grc.1" corresp="testimonia-133.en.1"/>.

Might we also want to adopt this practice more broadly in NHSL records?

wlpotter commented 2 years ago

Group the tei:quote elements for the different language expressions under the same tei:note. We might as well give the quotes xml:ids. But make sure to move the @source to the tei:quote element instead (for cases where the quotes come from different sources, or where a @resp is used instead).
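
The encoding described above could look roughly like the following sketch. It reuses the incipit from the current record and assumes both quotes keep the #bib234-1 source. The xml:id values on the tei:quote elements are hypothetical placeholders, not an agreed naming scheme.

```xml
<note type="incipit" xml:id="incipit234-1">
  <!-- @source now lives on each quote, so quotes in the same note may cite different sources -->
  <!-- the quote xml:id values below are hypothetical; no naming scheme has been agreed -->
  <quote xml:lang="syr" xml:id="incipit234-1-syr" source="#bib234-1">ܫܢܬܐ ܗܘ̣ܬ ܕܬܪܬܝܢ ܕܪܕܘܦܝܐ: ܘܩܪܒܐ ܕܠܩܘܒܠܢ ܝܬܝܪ ܡܢ ܩܕܡܝܐ ܥܫܢ ܗܘ̣ܐ.</quote>
  <quote xml:lang="fr" xml:id="incipit234-1-fr" source="#bib234-1">C’était la deuxième année de la persécution, et le combat contre nous était plus intense que le premier.</quote>
</note>
```

A second, separately sourced incipit would then be a sibling tei:note with the same @type and its own xml:id, and a quote supplied by an editor rather than taken from a publication would carry @resp in place of @source.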

For LOD we may need to think further about how these will be expressed as related to one another. Do we assume they are sources of each other? Or is the one that matches the ../textLang the source for the others? Maybe a @type for 'translation' vs. 'transcription'?
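
One possible shape for the @type idea is sketched below, as an assumption for discussion rather than a settled convention. The @type values 'transcription' and 'translation', the xml:id values, and the use of @corresp to point the translation at the transcription are all illustrative.

```xml
<note type="incipit" xml:id="incipit234-1">
  <!-- hypothetical: @type marks the quote whose language matches ../textLang as the transcription -->
  <quote type="transcription" xml:lang="syr" xml:id="incipit234-1-syr" source="#bib234-1">ܫܢܬܐ ܗܘ̣ܬ ܕܬܪܬܝܢ ܕܪܕܘܦܝܐ: ܘܩܪܒܐ ܕܠܩܘܒܠܢ ܝܬܝܪ ܡܢ ܩܕܡܝܐ ܥܫܢ ܗܘ̣ܐ.</quote>
  <!-- hypothetical: @corresp points the translation back at the transcription it renders -->
  <quote type="translation" xml:lang="fr" xml:id="incipit234-1-fr" source="#bib234-1" corresp="#incipit234-1-syr">C’était la deuxième année de la persécution, et le combat contre nous était plus intense que le premier.</quote>
</note>
```

With something like this, an LOD export could read the direction of the relationship from @type and @corresp instead of inferring it from the language codes alone.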