Open sophlew opened 1 year ago
To tag the language of the text contained in a particular element, users should add the xml:lang attribute to that element, like so:
<line xml:lang="fr">VOCABULAIRE</line>
Here, the xml:lang attribute is attached to a line element, but it can go on any element, and it applies to all the text contained in that element.
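As a rough sketch of that inheritance, assuming the lines sit inside a zone on a surface (the wording of the second line is invented for illustration), putting xml:lang on the zone marks both lines as French without repeating the attribute:

```xml
<surface>
  <!-- xml:lang on the zone applies to every line inside it -->
  <zone xml:lang="fr">
    <line>VOCABULAIRE</line>
    <!-- hypothetical second line, inherits xml:lang="fr" from the zone -->
    <line>Mots et phrases</line>
  </zone>
</surface>
```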
The language element shouldn't appear in the transcription at all; it's only for use inside the teiHeader, within a langUsage element which lists the languages appearing in the transcription. There, the ident attribute identifies the language itself, and the xml:lang attribute identifies the language in which the language is named. For example, here's the French language with the English name "French":
<language ident="fr" xml:lang="en">French</language>
or, the Spanish name for the French language:
<language ident="fr" xml:lang="es">francés</language>
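For context, in TEI the langUsage element sits inside the profileDesc of the teiHeader. A minimal sketch, assuming a transcription that contains French and Italian (the choice of languages is only an example, and the fileDesc that a real header requires is left out):

```xml
<teiHeader>
  <!-- fileDesc (required in a real TEI header) omitted for brevity -->
  <profileDesc>
    <langUsage>
      <!-- ident: the language being declared; xml:lang: the language the name is written in -->
      <language ident="fr" xml:lang="en">French</language>
      <language ident="it" xml:lang="en">Italian</language>
    </langUsage>
  </profileDesc>
</teiHeader>
```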
There is also a lang element which can be used inside the transcription to tag the names of languages (in the same way that, e.g., persName can tag the name of a person mentioned in the text), e.g.
<lang>French</lang>
Generally, the main language of a TEI text would be tagged (with an xml:lang attribute) on the text element of the document. This would indicate that the entire text is in that language except where overridden by xml:lang attributes attached to individual elements within the text. In the Nyingarn workspace no one gets to see the text element because they're transcribing within surface elements which each represent just one page, and the text element is created as a wrapper only when the TEI file is exported.
But I think the workspace does allow someone to mark the main language of a document in a metadata-entry form, and strictly, this should end up encoded as an xml:lang attribute when the entire TEI file is reconstituted. I am pretty sure it doesn't do this currently, though.
- Check that a document's main language is converted to a text/@xml:lang when a TEI document is exported
I've been doing this for the Italian in New Norcia 38.
Ultimately it would be good to have the language word encoded too.
On Mon, 1 May 2023 at 13:12, Conal Tuohy @.***> wrote:
Generally, the main language of a TEI text would be tagged (with an xml:lang attribute) on the text element of the document. In the Nyingarn workspace no-one gets to see the body element because they're transcribing within surface elements which each represent just one page. But the workspace does allow someone to mark the main language of a document in a metadata-entry form, I think, and this should end up as xml:lang attributes when the entire TEI file is reconstituted. I am pretty sure it doesn't do this currently, though.
- Check that a document's main language is converted to a text/@xml:lang when a TEI document is exported
I think if the text as a whole is tagged with the indigenous language, then tagging all the Italian words as exceptions will mean the whole text is tagged correctly.
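A sketch of what that could look like in an exported file, under some assumptions: x-main is a placeholder private-use tag standing in for the real code of the document's main language, the sample words are invented, and a plain body/p structure is used for simplicity rather than whatever the Nyingarn export actually produces.

```xml
<!-- "x-main" is a placeholder; substitute the real code for the main language -->
<text xml:lang="x-main">
  <body>
    <p>
      words in the main language
      <!-- the Italian word overrides the inherited language for its own content only -->
      <foreign xml:lang="it">acqua</foreign>
      more words in the main language
    </p>
  </body>
</text>
```

Everything not wrapped in an element with its own xml:lang inherits the value set on text, so only the exceptions need tagging.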
Example from Amy here, DumontDurville_1834-381. How should the French language be marked up? The screenshot shows the current error message.