rhdunn / cainteoir-engine

The Cainteoir Text-to-Speech core engine
http://reecedunn.co.uk/cainteoir/
GNU General Public License v3.0

Support script (Cyrillic, Greek, etc) spelling and transliteration #56

Open rhdunn opened 10 years ago

rhdunn commented 10 years ago

For scripts like Cyrillic and Greek, there are two processing modes:

  1. spelling -- each character is pronounced as the character name (e.g. in mathematics, Greek characters are spelled out);
  2. transliteration -- sequences of script characters are converted to the native script according to transliteration rules.

As a general rule, if the native/dictionary script is Latin:

  1. if there is a single non-Latin character, then that character is spelled.
  2. spelling of single Latin characters is context/language dependent (e.g. 'y' in Spanish).
  3. sequences of non-Latin characters (e.g. Russian/Bosnian/Serbian names) are transliterated.
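
The rule above could be sketched roughly as follows. This is only an illustration in Python, not Cainteoir code: `script_of`, `plan`, `spell`, `transliterate` and the `TRANSLIT` table are all hypothetical names, the script is guessed from the first word of the Unicode character name (a real implementation would use proper script data), and the context-dependent spelling of single Latin characters (rule 2) is left out.

```python
import unicodedata
from itertools import groupby

# Hypothetical, deliberately tiny transliteration table (illustration only).
TRANSLIT = {"д": "d", "у": "u", "н": "n", "а": "a", "в": "v"}


def script_of(ch):
    """Crude script guess: the first word of the Unicode character name."""
    if not ch.isalpha():
        return "OTHER"
    return unicodedata.name(ch, "UNKNOWN").split(" ")[0]


def plan(text, native="LATIN"):
    """Split text into runs of one script and pick a processing mode per run."""
    out = []
    for script, group in groupby(text, key=script_of):
        run = "".join(group)
        if script == "OTHER":
            continue  # whitespace, digits, punctuation are ignored here
        if script == native:
            out.append(("native", run))
        elif len(run) == 1:
            out.append(("spell", run))  # single foreign character
        else:
            out.append(("transliterate", run))  # sequence of foreign characters
    return out


def spell(ch):
    """Approximate the character name by the last word of its Unicode name."""
    return unicodedata.name(ch).split(" ")[-1].lower()


def transliterate(run):
    """Map each character through the table, leaving unknown characters as-is."""
    return "".join(TRANSLIT.get(c.lower(), c) for c in run)
```

For example, `plan("the angle α")` marks `α` for spelling, and `spell("α")` gives `alpha`, while `plan("Дунав river")` marks `Дунав` for transliteration.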
