Closed deepestblue closed 10 months ago
BTW, if it's NFC/NFD related, the issue is likely much bigger than Tamil.
Thanks. Yes, this is related to Unicode normal forms, so this issue applies generally throughout vidyut-lipi.
Much of vidyut-lipi's character mapping data comes from the indic-transliteration
project, so this error likely affects that family of transliterators as well.
saulabhyaJS also does not maintain separate NFC/NFD data, but IIRC (it's been a while) normalises input to NFC before looking up in the data.
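The normalise-before-lookup approach can be sketched roughly as follows. This is only an illustration in Python using the standard `unicodedata` module; `TOY_MAP` and `transliterate` are hypothetical names and are not taken from saulabhyaJS, vidyut-lipi, or indic-transliteration.

```python
import unicodedata

# Hypothetical two-entry mapping table keyed on NFC forms only.
TOY_MAP = {
    "\u014d": "\u0b93",  # ō (precomposed) -> ஓ
    "o": "\u0b92",       # o -> ஒ
}

def transliterate(text: str, normalize: bool = True) -> str:
    """Map each code point through TOY_MAP, optionally normalising to NFC first."""
    if normalize:
        text = unicodedata.normalize("NFC", text)
    # Code points with no entry (e.g. a stray combining macron) pass through unchanged.
    return "".join(TOY_MAP.get(ch, ch) for ch in text)

decomposed = "o\u0304"  # o + COMBINING MACRON
without_nfc = transliterate(decomposed, normalize=False)
with_nfc = transliterate(decomposed)
```

Without the normalisation step, the decomposed input matches the plain `o` entry and the macron is left dangling, which is the same shape as the output reported in this issue. With it, both spellings of ō hit the same table entry.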
Thanks, this is now fixed locally by making the code NFC/NFD aware. It needs more testing, but I think it's off to a good start.
Pushed and deployed to our online demo. vidyut-lipi supports basic NFC/NFD mapping with limited support for input that is not in NFC/NFD (e.g. if multiple combining signs are ordered badly).
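For anyone unsure what "ordered badly" means here, a small Python sketch using the standard `unicodedata` module may help. The choice of ṝ as the example is mine and is not from the vidyut-lipi test suite. A dot below (combining class 220) must precede a macron (class 230) in canonical order, so typing them the other way round gives a string that is in neither NFC nor NFD.

```python
import unicodedata

# r + COMBINING MACRON + COMBINING DOT BELOW: marks in non-canonical order.
badly_ordered = "r\u0304\u0323"

# Canonical combining classes of the two marks (macron, dot below).
classes = [unicodedata.combining(ch) for ch in badly_ordered[1:]]

# NFD sorts the marks by combining class; NFC then composes them.
nfd = unicodedata.normalize("NFD", badly_ordered)
nfc = unicodedata.normalize("NFC", badly_ordered)

is_already_nfd = badly_ordered == nfd
is_already_nfc = badly_ordered == nfc
```

Running the input through either normal form first would make such strings look like ordinary NFC/NFD input to a lookup table.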
./lipi -t tamil -f iso15919 "ō"
Expected:
ஓ
Actual:
ஒ̄
I think the issue may have to do with Unicode normal forms (NFD, NFC, etc.). I'm not 100% sure my testing was accurate, but the precomposed U+014D works correctly, whereas the decomposed sequence U+006F U+0304 does not.
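A quick way to check which form a given input is in, sketched in Python with the standard `unicodedata` module (the variable names are just for illustration):

```python
import unicodedata

precomposed = "\u014d"   # ō as one code point: LATIN SMALL LETTER O WITH MACRON
decomposed = "o\u0304"   # o followed by COMBINING MACRON

# The two strings render identically but differ code point by code point.
same_code_points = precomposed == decomposed

# NFC composes the pair into U+014D; NFD splits U+014D back into the pair.
nfc_of_decomposed = unicodedata.normalize("NFC", decomposed)
nfd_of_precomposed = unicodedata.normalize("NFD", precomposed)

# Hex dump of each string, handy when pasting test input into a shell.
hex_precomposed = [f"U+{ord(ch):04X}" for ch in precomposed]
hex_decomposed = [f"U+{ord(ch):04X}" for ch in decomposed]
```

Dumping the code points like this removes any doubt about what the terminal or editor actually passed to the command.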