BetaMasaheft / Documentation

Die Schriftkultur des christlichen Äthiopiens: Eine multimediale Forschungsumgebung

Encoding language-script when no direct match #2465

Closed: eu-genia closed this issue 8 months ago

eu-genia commented 8 months ago

I have not found anything in TEI that fits or explains this fully, but I think the recommended approach is to add a separate script subtag as an extension to the language tag. There is an example with az-Arab for Azeri in Arabic script, as opposed to az-Latn for Azeri in Latin script or az-Cyrl for Azeri in Cyrillic. For us this would mean using ar-Ethi for Arabic in fidel, gez-Sarb for Ethiopic in Sabaic script, or har-Ethi vs har-Arab for Harari written in fidel or in Arabic script.

The syntax is always: the main language tag in lowercase, then a hyphen, then the script subtag with its first letter capitalized (this can be followed by a hyphen and the region subtag, in all capitals). Some subtags are listed here, and one can also search for them here. The latter link can also check your tag-subtag combination for you.
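To make the casing rule concrete, here is a minimal Python sketch. The function name `format_language_tag` is hypothetical, and it only handles the simple language-script-region shape described above; it does not check whether the subtags are actually registered, so the search tool is still needed for that.

```python
import re

# Shape check only: 2-3 letter language, optional 4-letter script,
# optional 2-letter region. Real language tags allow more subtags than this.
_TAG_SHAPE = re.compile(r"^[A-Za-z]{2,3}(-[A-Za-z]{4})?(-[A-Za-z]{2})?$")


def format_language_tag(tag: str) -> str:
    """Return the tag with the conventional casing, e.g. 'GEZ-sarb' -> 'gez-Sarb'."""
    if not _TAG_SHAPE.match(tag):
        raise ValueError(f"unsupported tag shape: {tag!r}")
    parts = []
    for subtag in tag.split("-"):
        if len(subtag) == 4:
            parts.append(subtag.capitalize())  # script subtag: Ethi, Sarb, Arab
        elif parts:
            parts.append(subtag.upper())       # region subtag: all capitals
        else:
            parts.append(subtag.lower())       # main language tag: ar, gez, har
    return "-".join(parts)
```

For example, `format_language_tag("ar-ethi")` gives `ar-Ethi`, while a string with an underscore instead of a hyphen raises `ValueError`.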

See also

(There is also the @style attribute, which one can use after xml:lang in TEI as described, but it seems to be intended for modes of writing, orientation, etc. So xml:lang="ar" style="script: Ethiopic" would be technically possible, but it seems to be a creative interpretation of TEI, so I would prefer the first option.)

_Originally posted by @eu-genia in

FYI @thea-m @DenisNosnitsin1970 @CarstenHoffmannMarburg @abausi I can try this out and eventually add it to the Guidelines.

eu-genia commented 8 months ago

NB: the edited inscriptions should be corrected, where relevant, to xml:lang="gez-Sarb" type="normalized" (since the XML files give them in romanization, not in direct transcription).
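A rough sketch of how such a correction could be scripted with Python's standard library. Several things here are assumptions for illustration only: that the text sits in an `ab` element, that it was previously tagged xml:lang="gez", and the sample document itself. The actual files may be structured differently.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# ElementTree addresses xml:lang through the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample; element choice and old tag are assumptions.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div type="edition">
      <ab xml:lang="gez">romanized text here</ab>
    </div>
  </body></text>
</TEI>"""


def mark_normalized(root, old_lang="gez", new_lang="gez-Sarb"):
    """Retag matching <ab> elements and return how many were changed."""
    changed = 0
    for ab in root.iter(f"{{{TEI_NS}}}ab"):
        if ab.get(XML_LANG) == old_lang:
            ab.set(XML_LANG, new_lang)
            ab.set("type", "normalized")
            changed += 1
    return changed
```

Calling `mark_normalized(ET.fromstring(SAMPLE))` changes the single `ab` in the sample; elements that already carry another language tag are left alone.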

eu-genia commented 8 months ago

Done, hope all is clear :)