Open manufrancis opened 3 years ago
Hmm, this is a somewhat complicated issue. The XML Recommendation suggests using `@xml:lang` to indicate language and optional locale, rather than script. So for example, `es` is Spanish, `es-ES` is Spain Spanish, and `es-MX` is Mexican Spanish. We probably won't run into issues with this since the locale is always in all-caps, while the ISO script abbreviations we're using aren't. It's worth thinking about though.
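For reference, here are a few `xml:lang` values side by side. This is only a minimal sketch: the `<p>` elements and the elided content are placeholders, not taken from our files. By convention, two-letter region subtags are written in upper case and four-letter ISO 15924 script subtags in title case, but language tags are compared case-insensitively, so the casing is a readability convention and the length of the subtag is what really tells the two apart.

```xml
<body>
  <!-- language only: Spanish -->
  <p xml:lang="es">...</p>
  <!-- language + region: Spanish as used in Mexico -->
  <p xml:lang="es-MX">...</p>
  <!-- language + script: Tamil written in Tamil script -->
  <p xml:lang="ta-Taml">...</p>
  <!-- language + script: Tamil written in Latin script -->
  <p xml:lang="ta-Latn">...</p>
</body>
```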
Another issue that we might have with this is that it's not quite correct to tag something as `ta-Gran` if we're really transliterating the text; it should rather be `ta-Latn`. This matters because we have researchers who are actually inputting their text in Tamil script, i.e., `ta-Taml`, rather than giving a Latin transliteration. By tagging that correctly, we can easily switch between Tamil script and Latin transliteration. But if we start using `ta-Gran` to mean "Tamil Grantha transliterated into Latin script", then we can't use that system for records that are not transliterated (Jean-Luc would probably be unhappy).
It might actually be better to use something like `@rend="grantha"`, as specified in the DHARMA encoding guide, and reserve `ta-Gran` for tagged text that is actually in untransliterated Grantha script.
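The distinction proposed above could look roughly like this. It is a sketch only: the `<ab>`, `<seg>` and `<hi>` wrappers and the sample words are assumptions chosen for illustration, and the DHARMA encoding guide should be checked for the element that actually carries `@rend="grantha"`.

```xml
<ab>
  <!-- Tamil text entered directly in Tamil script -->
  <seg xml:lang="ta-Taml">தமிழ்</seg>

  <!-- The same word given in Latin transliteration -->
  <seg xml:lang="ta-Latn">tamiḻ</seg>

  <!-- Transliterated text whose original characters are Grantha:
       the script subtag stays Latn, and @rend records the Grantha rendering -->
  <seg xml:lang="ta-Latn">... <hi rend="grantha">śrī</hi> ...</seg>

  <!-- ta-Gran would then be reserved for text typed in Grantha script itself -->
  <seg xml:lang="ta-Gran">...</seg>
</ab>
```

With this split, `xml:lang` always describes the script the characters are actually typed in, so switching between Tamil script and Latin transliteration keeps working for records that are not transliterated.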
Fine! Let us go with @rend="grantha". Could you implement this in the editor and have it displayed in bold?
Bold in both the Latin transliteration and the Tamil Grantha display, right?
Following Axelle's advice, let us use `<seg xml:lang/>` instead of `<g xml:lang/>`.
I guess we will thus use
`<seg xml:lang="ta-Gran"/>` for 1.
`<seg xml:lang="sa-Gran"/>` for 2.
Does it make sense?
PS: in both cases, the display in transliteration should be bold.