Open eroux opened 5 years ago
The situation for Chinese is a bit different, as zh/zho is a macrolanguage, so it's not supposed to be specific (modern vs. classical).
Do the dialects manifest in the writing as spelling differences or script differences?
Phonology, spelling, syntax, vocabulary, etc., but not as script differences, I would say.
Well, while the addition of a few dialects in the ontology is welcome, the main issue was "should we use `bo` or `xct` as a string tag?", and perhaps also "should we make a distinction between modern, classical and old Tibetan in our ontology?". These questions are yet to be addressed, I believe.
I meant to indicate that the commit was some progress on the issue, not the final word. We need to discuss the other tags, xct and otb.
Maybe the people in India w/ NT can deal with clarifying the actual language of various texts. The dialects are useful if there are people who can detect them from the written content.
There are several ISO lang tags corresponding to Tibetan:
- `bo`, `tib` and `bod`: Tibetan (Standard Spoken Tibetan, "Living" language type here)
- `xct`: Classical Tibetan ("Historical" language type here)
- `otb`: Old Tibetan (same here)

plus some dialects:
- `tsk`: Tseku
- `khg`: Khams Tibetan
- `kbg`: Khamba
- `adx`: Amdo Tibetan

I think most of our data is in fact `xct`. I'm not advocating changing all our data (although it would be a relatively easy change); I'm just thinking maybe this should be discussed, and the decision (which could be an engineering decision) should be recorded in the lang tag document.
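
To make the options concrete, here is a minimal Python sketch of how the tags listed in this issue could be recorded and normalized. The names `TIBETAN_TAGS`, `ALIASES` and `normalize_tag` are hypothetical and not part of any existing codebase. Treating `tib` and `bod` as aliases of `bo` is an assumption based on the three being listed together above, and the language type is only filled in where this issue states it.

```python
from typing import Optional, TypedDict


class TagInfo(TypedDict):
    name: str
    type: Optional[str]  # language type; None where this issue does not state it


# Codes and names exactly as listed in this issue.
TIBETAN_TAGS: dict[str, TagInfo] = {
    "bo": {"name": "Tibetan", "type": "Living"},
    "xct": {"name": "Classical Tibetan", "type": "Historical"},
    "otb": {"name": "Old Tibetan", "type": "Historical"},
    "tsk": {"name": "Tseku", "type": None},
    "khg": {"name": "Khams Tibetan", "type": None},
    "kbg": {"name": "Khamba", "type": None},
    "adx": {"name": "Amdo Tibetan", "type": None},
}

# Assumption: the three-letter codes for Tibetan collapse to the two-letter one.
ALIASES: dict[str, str] = {"tib": "bo", "bod": "bo"}


def normalize_tag(tag: str) -> str:
    """Return the canonical code for a Tibetan-related tag, or raise ValueError."""
    code = tag.strip().lower()
    code = ALIASES.get(code, code)
    if code not in TIBETAN_TAGS:
        raise ValueError(f"not a known Tibetan-related tag: {tag!r}")
    return code
```

With a table like this, switching existing data from `bo` to `xct` (or deciding not to) stays a one-line policy change, and unknown or mistyped codes are rejected instead of silently stored.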