virtualvinodh / aksharamukha-python

Aksharamukha Python Library
GNU Affero General Public License v3.0

Retain ऱ and ऩ in tamiL with transcribe tamil option #21

Open vvasuki opened 1 month ago

vvasuki commented 1 month ago

What I almost always want from aksharamukha is the "best" Devanagari transliteration of Tamil input. By "best" I mean:

Conversion to a richer script should encode more, not less information.

So, I want उऱुदल् मुऱ्ऱिऱ्ऱु, rather than उऱुदल् मुट्रिट्रु. Could you provide this facility (perhaps under the Transcribe Tamil (Dialectal) option)? Please note that no \u200c (zero-width non-joiner) should be inserted here, unlike in the current output.

In the same vein, I'd like अगऩ्ऱ rather than अगण्ड्र.

vvasuki commented 1 month ago

Unfortunately, I can't even recover the superior transliteration by post-hoc replacement - ஏமாற்று ēmāṟṟu is transliterated to एमाट्रु, and not एमाट्र््ट्रु :-( It's like switching to a superior script and losing information.

EDIT - looks like I was mistaken - I can at least recover ṟṟ by replacing ट्र
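
For anyone who needs this in the meantime, here is a rough sketch of the post-hoc replacement workaround. It is not part of aksharamukha: the helper name `restore_tamil_letters` is hypothetical, and the second mapping (ण्ड्र to ऩ्ऱ) is an assumption inferred from the अगऩ्ऱ example above. It also assumes the library's output contains the plain sequences with no joiner characters in between. Note that it will wrongly convert any genuine ட்ர in the Tamil input, so it is only safe on text where ट्र always comes from ற்ற.

```python
import unicodedata

# Devanagari code points, spelled out to avoid look-alike mix-ups.
VIRAMA = "\u094d"
TTA, RA, RRA = "\u091f", "\u0930", "\u0931"    # ट, र, ऱ
NNA, DDA, NNNA = "\u0923", "\u0921", "\u0929"  # ण, ड, ऩ

# (sequence in current output, desired sequence).
# The first pair is an assumption based on the अगण्ड्र / अगऩ्ऱ example.
REPLACEMENTS = [
    (NNA + VIRAMA + DDA + VIRAMA + RA, NNNA + VIRAMA + RRA),  # ण्ड्र -> ऩ्ऱ
    (TTA + VIRAMA + RA, RRA + VIRAMA + RRA),                  # ट्र -> ऱ्ऱ
]


def restore_tamil_letters(text: str) -> str:
    """Replace transcribed clusters with ऱ / ऩ in Devanagari text."""
    # NFC so that any letter + nukta input composes to the single code point.
    text = unicodedata.normalize("NFC", text)
    for old, new in REPLACEMENTS:
        text = text.replace(old, new)
    return text


# Example: the current output for ஏமாற்று, built from code points.
print(restore_tamil_letters("\u090f\u092e\u093e" + TTA + VIRAMA + RA + "\u0941"))
```

This only papers over the problem after the fact; a proper option in the library would not need to guess which ट्र came from ற்ற.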