buda-base / owl-schema

BDRC Ontology Schema

Tibetan language codes? #98

Open eroux opened 5 years ago

eroux commented 5 years ago

There are several ISO lang tags corresponding to Tibetan:

plus some dialects:

I think most of our data is in fact xct. I'm not advocating changing all our data (although it would be a relatively easy change); I just think this should be discussed, and the decision (which could be an engineering decision) should be recorded in the lang tag document.
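
To illustrate why the retagging would be a relatively easy change, here is a minimal sketch. It assumes the data can be treated as (text, lang tag) pairs and that some tags carry extra subtags after the primary one (the `bo-x-ewts` form is used here as an assumed example). The `retag` helper is hypothetical and not part of the BDRC tooling.

```python
def retag(tag: str, old: str = "bo", new: str = "xct") -> str:
    """Swap the primary language subtag, keeping any trailing subtags.

    Hypothetical helper: only the part before the first "-" is compared,
    so "bo-x-ewts" becomes "xct-x-ewts" while "bod" or "en" are untouched.
    """
    primary, sep, rest = tag.partition("-")
    if primary.lower() == old:
        return new + sep + rest
    return tag


# Assumed sample data: (text, lang tag) pairs standing in for string literals.
literals = [
    ("བོད་ཡིག", "bo"),
    ("bod yig", "bo-x-ewts"),
    ("Tibetan", "en"),
]

retagged = [(text, retag(tag)) for text, tag in literals]
```

Whether such a bulk change is desirable is exactly the open question; the sketch only shows that the mechanical part is small.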

eroux commented 5 years ago

The situation for Chinese is a bit different, as zh/zho is a macrolanguage, so it's not supposed to be specific (modern vs. classical).

xristy commented 5 years ago

Do the dialects manifest in the writing as spelling differences or script differences?

eroux commented 5 years ago

Phonology, spelling, syntax, vocabulary, etc., but not as script differences, I would say.

eroux commented 5 years ago

Well, while the addition of a few dialects in the ontology is welcome, the main issue was "should we use bo or xct as a string tag?", and perhaps also "should we make a distinction between modern, classical and old Tibetan in our Ontology?". These questions are yet to be addressed, I believe.

xristy commented 5 years ago

I meant to indicate that the commit was some progress on the issue, not the final word. We need to discuss the other tags, xct and otb.

Maybe the people in India w/ NT can clarify the actual language of various texts. The dialects are useful if there are people who can detect them from the written content.