dmort27 / epitran

A tool for transcribing orthographic text as IPA (International Phonetic Alphabet)
MIT License

Add Japanese (Katakana, Hiragana) #142

Closed lart-rt closed 1 year ago

lart-rt commented 1 year ago

Added “map” and “post” files for the Japanese Katakana and Hiragana scripts.

We first created the files for Katakana. We then produced the Hiragana files by replacing each Katakana character in those files with its Hiragana counterpart, since the two writing systems are in one-to-one correspondence.
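For readers who want to reproduce the Katakana-to-Hiragana substitution, here is a minimal sketch. It is not necessarily how the files in this pull request were generated. It relies on the Unicode layout, where the Katakana letters U+30A1 to U+30F6 sit exactly 0x60 code points above their Hiragana counterparts U+3041 to U+3096. The function name `katakana_to_hiragana` and the sample line `カ,ka` are hypothetical and used only for illustration.

```python
# Sketch only: convert Katakana to Hiragana via the fixed Unicode offset.
# The function name and sample input are illustrative assumptions,
# not taken from the pull request.

KATAKANA_START = 0x30A1  # ァ (small a)
KATAKANA_END = 0x30F6    # ヶ (small ke)
OFFSET = 0x60            # distance between the Katakana and Hiragana blocks


def katakana_to_hiragana(text: str) -> str:
    """Replace each Katakana letter in the mapped range with its Hiragana
    counterpart; every other character is returned unchanged."""
    return "".join(
        chr(ord(ch) - OFFSET) if KATAKANA_START <= ord(ch) <= KATAKANA_END else ch
        for ch in text
    )


if __name__ == "__main__":
    # A hypothetical "orthography,phoneme" line from a Katakana file.
    print(katakana_to_hiragana("カ,ka"))  # か,ka
```

Characters outside that range, such as the prolonged sound mark ー (U+30FC) and the ASCII transcription column, pass through untouched, so any converted file would still need a manual check.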

The files without the “-red” suffix are based on the descriptions in the following literature on Japanese linguistics and phonetics:

The files with the “-red” suffix are a more “reduced” version, based on the descriptions in the following Wikipedia articles.

This commit DOESN’T support Kanji (Chinese characters), because their readings depend heavily on context and there are far more Kanji than Katakana or Hiragana characters.

lart-rt commented 1 year ago

Some commits are behind the master branch, so I will resubmit the pull request after resolving this.