Closed by Natkeeran 1 year ago
Please provide a mapping table from ISO 15919 romanized text to Tamil Unicode text.
Check here for the existing mapping tables: https://github.com/Ezhil-Language-Foundation/open-tamil/blob/main/tamil/txt2unicode/encode2utf8.py
If we get the mapping, we can add it to open-tamil easily.
@tshrinivasan
There is a mapping from Tamil to ISO 15919 (https://en.wikipedia.org/wiki/ISO_15919) in Open Tamil (https://github.com/Ezhil-Language-Foundation/open-tamil/issues/233). What is needed here is the reverse: ISO 15919 -> Tamil Unicode.
This converter comes close: http://aksharamukha.appspot.com/converter.
Simple solution:
The following code transliterates Tamil text into romanized (ISO 15919) text:
from transliterate import azhagi, jaffna, combinational, UOM, ISO, itrans, algorithm
ISO_table = ISO.ReverseTransliteration.table
expected = 'cāmi. citamparaṉār nūṟ kaḷañciyam'
tamil_str = "சாமி. சிதம்பரனார் நூற் களஞ்சியம்"
eng_str = algorithm.Direct.transliterate(ISO_table,tamil_str)
print(eng_str)
Once that works, the following code can be used to reverse the transliteration:
from transliterate import algorithm as tx_algo
rev_table = tx_algo.reverse_transliteration_table(ISO_table)
new_tamil_str0 = algorithm.Direct.transliterate(rev_table,eng_str)
print(new_tamil_str0)
However, the Direct algorithm alone is not sufficient here, so we use the Iterative algorithm instead:
new_tamil_str1 = algorithm.Iterative.transliterate(rev_table,eng_str)
print(new_tamil_str1)
This was executed on Colab with Open-Tamil v1.1.
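For anyone who wants to see the idea without installing Open-Tamil, here is a minimal standalone sketch of the reverse direction. It is not Open-Tamil's implementation: the names `CONSONANTS`, `VOWELS`, `PULLI` and `iso_to_tamil` are hypothetical, and the tables cover only the basic Tamil letters (no Grantha letters, no aytham, no ISO 15919 disambiguation marks). It assumes lowercase input and treats "ai" and "au" greedily as single vowels.

```python
import unicodedata

PULLI = "\u0bcd"  # Tamil virama, written when a consonant has no vowel

# Hypothetical, partial table: ISO 15919 consonant -> Tamil base letter
CONSONANTS = {
    "k": "க", "ṅ": "ங", "c": "ச", "ñ": "ஞ", "ṭ": "ட", "ṇ": "ண",
    "t": "த", "n": "ந", "p": "ப", "m": "ம", "y": "ய", "r": "ர",
    "l": "ல", "v": "வ", "ḻ": "ழ", "ḷ": "ள", "ṟ": "ற", "ṉ": "ன",
}

# ISO 15919 vowel -> (independent letter, dependent vowel sign)
VOWELS = {
    "a": ("அ", ""),        # inherent vowel: no sign after a consonant
    "ā": ("ஆ", "\u0bbe"), "i": ("இ", "\u0bbf"), "ī": ("ஈ", "\u0bc0"),
    "u": ("உ", "\u0bc1"), "ū": ("ஊ", "\u0bc2"), "e": ("எ", "\u0bc6"),
    "ē": ("ஏ", "\u0bc7"), "ai": ("ஐ", "\u0bc8"), "o": ("ஒ", "\u0bca"),
    "ō": ("ஓ", "\u0bcb"), "au": ("ஔ", "\u0bcc"),
}

# Normalize table keys so precomposed and decomposed input both match
CONSONANTS = {unicodedata.normalize("NFC", k): v for k, v in CONSONANTS.items()}
VOWELS = {unicodedata.normalize("NFC", k): v for k, v in VOWELS.items()}


def _match_vowel(text, i):
    """Return the longest vowel key starting at position i, or None."""
    for length in (2, 1):
        candidate = text[i:i + length]
        if candidate in VOWELS:
            return candidate
    return None


def iso_to_tamil(text):
    """Convert a subset of ISO 15919 romanized Tamil to Tamil Unicode."""
    text = unicodedata.normalize("NFC", text)
    out, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch in CONSONANTS:
            i += 1
            vowel = _match_vowel(text, i)
            if vowel is None:
                # Bare consonant: add the pulli
                out.append(CONSONANTS[ch] + PULLI)
            else:
                # Consonant + vowel: base letter plus dependent sign
                out.append(CONSONANTS[ch] + VOWELS[vowel][1])
                i += len(vowel)
        else:
            vowel = _match_vowel(text, i)
            if vowel is None:
                out.append(ch)  # pass through spaces, punctuation, etc.
                i += 1
            else:
                out.append(VOWELS[vowel][0])  # word-initial / standalone vowel
                i += len(vowel)
    return "".join(out)


if __name__ == "__main__":
    print(iso_to_tamil("cāmi. citamparaṉār nūṟ kaḷañciyam"))
```

The key point is that the reverse mapping cannot be a plain character-for-character substitution: a Tamil consonant is written differently depending on whether a vowel follows it, so the sketch looks one or two characters ahead before deciding what to emit. For real catalogue data the Open-Tamil tables above should be more complete than this subset.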
The ISO 15919 standard is often used to convert Tamil text to romanized text. This is especially the case in many cataloguing systems. Example: https://catalog.hathitrust.org/Record/6133883. The Tamil public is generally not familiar with this standard. Is there any tool that can take ISO 15919 romanized text and convert it into Tamil Unicode text? (That is, https://www.ushuaia.pl/transliterate/?ln=en in reverse.)