Open bazingarj opened 5 years ago
The transliteration between scripts, like Devanagari to Latin in this case, is performed by the ICU library which uses the data of the Unicode CLDR.
The Devanagari-Latin transform internally transforms to InterIndic first and afterwards from InterIndic to Latin.
Taking “अब” for example, you can see that “अ” gets transformed to \uE005 in Devanagari-InterIndic.xml:20 and “ब” to \uE02C in Devanagari-InterIndic.xml:59.
The code points \uE005 and \uE02C get assigned to $wa in InterIndic-Latin.xml:21 and $ba in InterIndic-Latin.xml:60.
Finally, $wa is mapped to “a” in InterIndic-Latin.xml:446 and $ba to “ba” in InterIndic-Latin.xml:298.
In short:
अ -> \uE005 -> $wa -> a
ब -> \uE02C -> $ba -> ba
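To see where a word goes wrong, it may help to run the two stages separately and look at the intermediate code points. The following is only a sketch: it assumes that the internal transform IDs `Devanagari-InterIndic` and `InterIndic-Latin` can be created directly through `Transliterator::create()`, which may depend on the ICU version behind the intl extension. The helper name `transliterateInStages` is made up for illustration.

```php
<?php
// Sketch: run the Devanagari -> InterIndic -> Latin chain one stage at a time.
// Assumption: the internal IDs below are accepted by Transliterator::create().
function transliterateInStages(string $text): ?array
{
    $toInterIndic = \Transliterator::create('Devanagari-InterIndic');
    $toLatin = \Transliterator::create('InterIndic-Latin');
    if ($toInterIndic === null || $toLatin === null) {
        return null; // internal transforms are not available in this ICU build
    }

    $intermediate = $toInterIndic->transliterate($text);
    if ($intermediate === false) {
        return null;
    }
    $latin = $toLatin->transliterate($intermediate);

    // Split the intermediate string into characters and format each as U+XXXX.
    $chars = preg_split('//u', $intermediate, -1, PREG_SPLIT_NO_EMPTY);
    $codePoints = array_map(
        fn (string $c): string => sprintf('U+%04X', \IntlChar::ord($c)),
        $chars
    );

    return ['interIndic' => $codePoints, 'latin' => $latin];
}

$stages = transliterateInStages('अब');
if ($stages === null) {
    echo "Internal transforms could not be created\n";
} else {
    echo implode(' ', $stages['interIndic']), ' -> ', $stages['latin'], "\n";
}
```

If the internal IDs are rejected, the function returns null and only the combined `Deva-Latn` transform can be inspected.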
As I have no knowledge of Devanagari, I can’t spot at which point the transformations go wrong.
It would be great if you could file a ticket directly with the CLDR project: http://cldr.unicode.org/index/bug-reports
You can reproduce the issue with a single line of PHP code:
echo \Transliterator::create('Deva-Latn')->transliterate('अब');
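To check all the words reported in this issue at once, a small script along these lines could be attached to the CLDR ticket. It is a sketch, not a definitive test: the function name `transliterateReportedWords` is hypothetical, the expected spellings are the ones given by the reporter, and the raw `Deva-Latn` output may carry diacritics, so the strict comparison is only a rough indicator.

```php
<?php
// Sketch: compare the Deva-Latn output against the spellings the reporter expects.
function transliterateReportedWords(array $expected): array
{
    $transliterator = \Transliterator::create('Deva-Latn');
    if ($transliterator === null) {
        // create() returns null on failure; surface the intl error message.
        throw new \RuntimeException(intl_get_error_message());
    }

    $rows = [];
    foreach ($expected as $word => $wanted) {
        $actual = $transliterator->transliterate($word);
        $rows[] = [
            'word' => $word,
            'expected' => $wanted,
            'actual' => $actual,
            'matches' => $actual === $wanted,
        ];
    }
    return $rows;
}

// Word => expected Latin spelling, taken from the report.
$expected = [
    'मिठाई' => 'mithai',
    'खुशबू' => 'khushbu',
    'लेना' => 'lena',
    'पैसे' => 'paise',
    'अब' => 'ab',
];

foreach (transliterateReportedWords($expected) as $row) {
    printf(
        "%s: expected %s, got %s%s\n",
        $row['word'],
        $row['expected'],
        var_export($row['actual'], true),
        $row['matches'] ? '' : ' (differs)'
    );
}
```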
मिठाई - mithai (coming up as mitha-i)
खुशबू - khushbu (coming up as khasaba)
लेना - lena (coming up as lana)
पैसे - paise (coming up as pasa)
अब - aba (must be ab)