rhdunn / cainteoir-engine

The Cainteoir Text-to-Speech core engine
http://reecedunn.co.uk/cainteoir/
GNU General Public License v3.0

Support transliteration of Chinese characters (pinyin) #58

Open rhdunn opened 10 years ago

rhdunn commented 10 years ago

Chinese character transliteration should be based on the pinyin romanization system. The readings are available in the Unicode Character Database (the Unihan data files), and extracting the pinyin transcriptions from them should be handled by the ucd-tools project.
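As a rough illustration of the extraction step, here is a minimal Python sketch that reads lines in the tab-separated Unihan layout (`U+XXXX`, field name, value). The `parse_readings` function and the `SAMPLE` text are hypothetical and written for this example; they are not part of ucd-tools, and the sample lines only illustrate the layout rather than quoting the database.

```python
from collections import defaultdict

# Illustrative lines in the Unihan tab-separated layout (assumed sample data,
# not copied verbatim from the database).
SAMPLE = """\
# comment lines start with '#'
U+4E2D\tkCantonese\tzung1 zung3
U+4E2D\tkMandarin\tzhōng
U+597D\tkMandarin\thǎo
"""

# Reading fields of interest for Mandarin, Cantonese and Japanese.
READING_FIELDS = ("kMandarin", "kCantonese", "kJapaneseOn", "kJapaneseKun")


def parse_readings(text, fields=READING_FIELDS):
    """Return {character: {field: [readings...]}} for the requested fields."""
    readings = defaultdict(dict)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        codepoint, field, value = line.split("\t", 2)
        if field not in fields:
            continue
        char = chr(int(codepoint[2:], 16))  # "U+4E2D" -> "中"
        readings[char][field] = value.split()  # readings are space-separated
    return dict(readings)


readings = parse_readings(SAMPLE)
print(readings["中"]["kMandarin"])
```

A real extractor would read the data file from disk and would probably emit a compact lookup table rather than a Python dictionary.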

Specifically, transcriptions for Mandarin, Cantonese and Japanese pronunciations of the Chinese characters should be supported.

In addition, the pinyin pronunciations should support two modes:

  1. phonetic/IPA -- an accurate IPA-based phonetic transcription;
  2. Latin/English -- an English approximation of the Chinese.
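One way to picture the two modes is a lookup keyed by toneless pinyin syllable, with one output per mode. The sketch below is only an assumption about how this could be shaped: `Mode`, `TABLE` and `transcribe` are hypothetical names, and the table entries (especially the English approximations) are illustrative rather than a vetted mapping.

```python
from enum import Enum


class Mode(Enum):
    IPA = "ipa"          # accurate IPA-based phonetic transcription
    ENGLISH = "english"  # English approximation of the Chinese


# Hypothetical per-syllable table; the values are illustrative only.
TABLE = {
    "ma": {Mode.IPA: "ma", Mode.ENGLISH: "mah"},
    "hao": {Mode.IPA: "xɑʊ", Mode.ENGLISH: "how"},
    "zhong": {Mode.IPA: "ʈʂʊŋ", Mode.ENGLISH: "jong"},
}


def transcribe(syllables, mode):
    """Join the per-syllable transcriptions for the chosen mode."""
    return " ".join(TABLE[s][mode] for s in syllables)


print(transcribe(["zhong", "hao"], Mode.IPA))
print(transcribe(["zhong", "hao"], Mode.ENGLISH))
```

Keeping both outputs in one table means the mode is only a selection at lookup time, so the character-to-pinyin step stays the same for both.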

To be complete, this requires improving the phoneme model to support tone markers.
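To show what tone support might involve, here is a small Python sketch that separates the tone from a pinyin syllable by decomposing it with Unicode normalization, then attaches the conventional Chao tone letters for the four Mandarin tones. The names `split_tone`, `with_tone_letters` and `CHAO_TONE_LETTERS` are hypothetical; how the engine's phoneme model would actually represent tones is left open.

```python
import unicodedata

# Combining marks used by pinyin, mapped to Mandarin tone numbers.
TONE_MARKS = {
    "\u0304": 1,  # macron, e.g. mā
    "\u0301": 2,  # acute, e.g. má
    "\u030c": 3,  # caron, e.g. mǎ
    "\u0300": 4,  # grave, e.g. mà
}

# Conventional Chao tone letters; the neutral tone (5) is left unmarked here.
CHAO_TONE_LETTERS = {1: "˥", 2: "˧˥", 3: "˨˩˦", 4: "˥˩", 5: ""}


def split_tone(syllable):
    """Return (toneless syllable, tone number); tone 5 means no tone mark."""
    tone = 5
    kept = []
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            kept.append(ch)  # keeps the diaeresis of ü
    return unicodedata.normalize("NFC", "".join(kept)), tone


def with_tone_letters(syllable):
    """Rewrite a tone-marked syllable as base letters plus Chao tone letters."""
    base, tone = split_tone(syllable)
    return base + CHAO_TONE_LETTERS[tone]


print(split_tone("zhōng"))
print(with_tone_letters("hǎo"))
```

Decomposing to NFD first means the diaeresis on ü survives while the tone mark is removed, and recomposing with NFC restores the precomposed letter.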
