Closed markalex2209 closed 3 weeks ago
Let's wait a bit. I'm also checking combinations of consonant + j + vowel, like mju. I think they all need to be transliterated with soft sign between consonant and vowel, but I need to check.
Checked, confirmed. Happy with results. Ready for review
Just as a reflection. I originally wanted to add an existing package for transliteration, and I tried a couple. But I immediately realized that none of them handles Latvian: letters with diacritics such as š and ž don't get transliterated at all. So I started making my own. But when I got to the point of having to think about consonants, vowels, and their combinations, I just stopped caring... So that's why I also added the approximate comparison.
I get you. And I think not using a library here is wise, since without it we are able to tweak the process as we like (for example, the combination jo might require attention in the future).
And I get that since you are not a native speaker, this might be a bit too much. I'm glad to help where I can.
Regarding word distance - this is the correct call, because manually comparing strings would be a lot of useless headache. And I personally don't care if the most appropriate transliteration would be Саиета but now it is Сайета (or vice versa), etc.
So, I'd say your choices played out very well here.
A bunch of minor changes to ru transliteration checker:
- х/г and а/я
- Эй an expected transliteration for Ei in the beginning of word
- šja/ā to actually work ņņ)
- j + vowel: mju -> мью not мю
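For reviewers who want to see the j + vowel item in isolation, here is a minimal sketch of the rule in Python. It is not the code from this PR: the function name `transliterate` and the tiny `CONSONANTS`, `J_VOWEL`, and `VOWELS` tables are hypothetical and cover only a handful of letters, just enough to show where the soft sign goes.

```python
# Hypothetical, deliberately tiny tables for illustration only;
# the checker's real mapping is larger and is not reproduced here.
CONSONANTS = {"m": "м", "p": "п", "b": "б", "v": "в", "r": "р"}
J_VOWEL = {"ja": "я", "ju": "ю"}  # j + vowel -> iotated Cyrillic vowel
VOWELS = {"a": "а", "u": "у", "i": "и", "e": "е", "o": "о"}


def transliterate(word: str) -> str:
    """Transliterate a lowercase Latin word, inserting a soft sign
    between a consonant and a following j + vowel pair."""
    word = word.lower()
    out = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in J_VOWEL:
            # Soft sign only when the pair directly follows a consonant.
            if i > 0 and word[i - 1] in CONSONANTS:
                out.append("ь")
            out.append(J_VOWEL[pair])
            i += 2
            continue
        ch = word[i]
        # Letters outside the sketch tables are passed through unchanged.
        out.append(CONSONANTS.get(ch) or VOWELS.get(ch) or ch)
        i += 1
    return "".join(out)


print(transliterate("mju"))  # consonant + j + vowel gets the soft sign
print(transliterate("ju"))   # word-initial j + vowel does not
```

Keeping the rule in a plain loop rather than in a third-party package is what makes later tweaks, such as special handling for jo, easy to add.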
Additionally, lowered "good enough" threshold to 0.5: I'm working my way through transliterations, and I hope to address most of them that actually differ in at least a letter.
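To make the "good enough" threshold concrete, here is a rough sketch of an approximate comparison. It assumes a similarity ratio in the range 0 to 1 and uses `difflib.SequenceMatcher` from the Python standard library as a stand-in; the metric actually used by the checker may differ, and the names `GOOD_ENOUGH` and `is_good_enough` are hypothetical.

```python
from difflib import SequenceMatcher

# Threshold taken from the comment above; the metric itself is an assumption.
GOOD_ENOUGH = 0.5


def is_good_enough(expected: str, actual: str,
                   threshold: float = GOOD_ENOUGH) -> bool:
    """Return True when two transliterations are similar enough
    that a one-letter difference should not be flagged."""
    ratio = SequenceMatcher(None, expected.lower(), actual.lower()).ratio()
    return ratio >= threshold


# Variants that differ by one letter pass the check.
print(is_good_enough("Саиета", "Сайета"))
# An unrelated word does not.
print(is_good_enough("Саиета", "Рига"))
```

With a check like this, Саиета and Сайета are treated as equivalent, so only transliterations that diverge substantially need manual attention.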