clarin-eric / ParlaMint

ParlaMint: Comparable Parliamentary Corpora
https://clarin-eric.github.io/ParlaMint/

Alignment of transcriber comments #693

Closed by TomazErjavec 10 months ago

TomazErjavec commented 1 year ago

For the MTed corpora, alignment of sentences (and some superordinate elements) is taken care of. However, transcriber comments (notes, heads, and the desc of the various incidents) are not aligned. The main reason is that these elements have no ID in the original, so they cannot be aligned. For 3.1, it would be good if add-common-content also assigned IDs to these elements where they are missing, and if the MTed TEI corpora then aligned these elements as well.
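To make the request concrete, here is a minimal Python sketch of the ID-assignment step. It is not the actual add-common-content code; the function name `add_missing_ids`, the `prefix` argument, and the `<prefix>.<name><n>` ID pattern are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
TARGETS = ("note", "head", "desc")  # elements carrying transcriber comments


def add_missing_ids(root, prefix):
    """Give every note/head/desc that lacks an xml:id a new, unused one.

    The ID pattern '<prefix>.<name><n>' is a hypothetical convention.
    """
    used = {el.get(XML_ID) for el in root.iter() if el.get(XML_ID)}
    counters = {name: 0 for name in TARGETS}
    for el in root.iter():
        if not isinstance(el.tag, str):
            continue  # skip comments and processing instructions
        local = el.tag.rsplit("}", 1)[-1]  # drop the namespace part
        if local not in TARGETS or el.get(XML_ID):
            continue
        # Advance the counter until the candidate ID is not already taken.
        while True:
            counters[local] += 1
            candidate = f"{prefix}.{local}{counters[local]}"
            if candidate not in used:
                break
        el.set(XML_ID, candidate)
        used.add(candidate)
    return root


SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0" xml:id="doc1">
  <note type="chairpersons">Chair</note>
  <note xml:id="doc1.note1">Existing</note>
  <kinesic><desc>Applause</desc></kinesic>
</TEI>"""

root = add_missing_ids(ET.fromstring(SAMPLE), "doc1")
ids = [el.get(XML_ID) for el in root.iter() if el.get(XML_ID)]
print(ids)
```

Existing IDs are left untouched, and new ones skip any value already in use, so the step can be rerun on a partly annotated file without creating duplicates.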

TomazErjavec commented 10 months ago

Transcriber comments are now also aligned, e.g.

<note type="chairpersons"
      xml:id="ParlaMint-AT_2005-04-27-022-XXII-NRSITZ-00108.ana.note2"
      corresp="mt-src:ParlaMint-AT_2005-04-27-022-XXII-NRSITZ-00108.ana.note2"
      xml:lang="en">Second President Mag. Barbara Prammer</note>
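In the example, `corresp` repeats the element's own ID behind an `mt-src:` prefix. A rough Python sketch of how such pointers could be checked against a source file follows. It assumes that the part after `mt-src:` is the `xml:id` of the matching element in the source-language file; the helper name `dangling_pointers` and the tiny sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
PREFIX = "mt-src:"  # prefix seen in the example; its resolution is assumed


def dangling_pointers(source_root, mt_root):
    """Return corresp targets in the MT file that have no match in the source."""
    source_ids = {el.get(XML_ID) for el in source_root.iter() if el.get(XML_ID)}
    missing = []
    for el in mt_root.iter():
        corresp = el.get("corresp")
        if corresp is None or not corresp.startswith(PREFIX):
            continue  # only pointers using the mt-src: prefix are checked
        target = corresp[len(PREFIX):]
        if target not in source_ids:
            missing.append(target)
    return missing


# Invented sample data, not taken from the corpus.
SOURCE = '<text><note xml:id="d.note1">source-language comment</note></text>'
MT = (
    '<text>'
    '<note xml:id="d.note1" corresp="mt-src:d.note1" xml:lang="en">ok</note>'
    '<note xml:id="d.note2" corresp="mt-src:d.note2" xml:lang="en">orphan</note>'
    '</text>'
)

missing = dangling_pointers(ET.fromstring(SOURCE), ET.fromstring(MT))
print(missing)
```

An empty result would mean every aligned element in the MT file points at an ID that exists in the source.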

So, closing.