davidpomerenke / alphabetify

Learn a new alphabet by reading a good text in your native alphabet with more and more foreign letters.
https://alphabetify.js.org
GNU General Public License v3.0

Use pronunciation for transliteration #8

Open davidpomerenke opened 4 years ago

davidpomerenke commented 4 years ago

While transliterating letter by letter works nicely for German → *, most users appear to find it unintuitive for English → *.

There is a tool for retrieving the International Phonetic Alphabet (IPA) transcription of an English word: https://github.com/shukriadams/node-text-to-ipa

The main work would be to rewrite the transliteration rules for English → * using IPA characters as source characters. There are 107 characters plus diacritics, so this will get really complex. I don't know whether regexes work well with IPA characters.
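On the regex question, a minimal sketch suggests that JavaScript regular expressions cope with IPA characters as long as the `u` flag is set, which makes matching work on Unicode code points and enables property escapes such as `\p{M}` for combining diacritics. The rule table below (IPA → Cyrillic) and the function names are made up for illustration; they are not alphabetify's actual rules or API.

```javascript
// Hypothetical IPA → Cyrillic rules, for illustration only.
const ipaRules = new Map([
  ["tʃ", "ч"],
  ["ʃ", "ш"],
  ["ɪ", "и"],
  ["p", "п"],
]);

// Escape regex metacharacters so rule sources are matched literally.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Build one alternation; longer sources come first so "tʃ" wins over "ʃ".
const ipaPattern = new RegExp(
  [...ipaRules.keys()]
    .sort((a, b) => b.length - a.length)
    .map(escapeRegExp)
    .join("|"),
  "gu"
);

// Replace every IPA sequence that has a rule; other characters pass through.
function transliterateIpa(ipa) {
  return ipa.normalize("NFC").replace(ipaPattern, (match) => ipaRules.get(match));
}

// Split an IPA string into base characters, each with its combining diacritics.
function splitIpaSegments(ipa) {
  return ipa.normalize("NFD").match(/\P{M}\p{M}*/gu) || [];
}

console.log(transliterateIpa("tʃɪp"));
console.log(splitIpaSegments("\u00e6\u0303n"));
```

Normalizing first matters because the same diacritic can be encoded either as a precomposed character or as a base character plus a combining mark.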

davidpomerenke commented 4 years ago

An advantage of using IPA characters as the source would be that different source languages would no longer need to be distinguished. The source part of a rule would probably no longer need to contain multiple characters, so rules would no longer need to be prioritized. (This would bring no performance improvement, however, as the prioritization happens during preprocessing.)
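To make the idea concrete, here is a rough sketch of a single symbol table shared by all source languages. The pronunciation entries, the table and the function name are invented for illustration, and the sketch assumes every rule source is exactly one IPA character, so no rule ordering is needed.

```javascript
// Hypothetical per-language pronunciation dictionaries (word → IPA).
const pronunciations = {
  en: { ship: "ʃɪp" },
  de: { Schiff: "ʃɪf" },
};

// Hypothetical shared table: one IPA character → one target character.
const symbolTable = { "ʃ": "ш", "ɪ": "и", "p": "п", "f": "ф" };

// Look up the word's IPA form, then map it character by character.
// Words without a dictionary entry are returned unchanged.
function transliterateWord(word, lang) {
  const dictionary = pronunciations[lang] || {};
  const ipa = dictionary[word];
  if (ipa === undefined) return word;
  return Array.from(ipa, (symbol) => symbolTable[symbol] || symbol).join("");
}

console.log(transliterateWord("ship", "en"));
console.log(transliterateWord("Schiff", "de"));
```

Sounds that the IPA writes with more than one character, such as affricates or long vowels, would break the one-character assumption and might still need multi-character sources.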

However, this presupposes that suitable IPA dictionaries are available for all relevant languages (besides English, only German so far). The package mentioned above only includes an IPA dictionary for American English, and https://github.com/surrsurus/text-to-ipa mentions that even this one was hard to find.