rhdunn / cainteoir-engine

The Cainteoir Text-to-Speech core engine
http://reecedunn.co.uk/cainteoir/
GNU General Public License v3.0

Support reading currencies #42

Open rhdunn opened 11 years ago

rhdunn commented 11 years ago

Currency symbols in Unicode belong to the Sc (Symbol, Currency) general category. They should be split out of the generic punctuation event type and given their own currency event type.
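A minimal sketch of how that split could look. The `event_type` enumeration and the function names are hypothetical and are not taken from the engine; the Sc check is a small hard-coded subset for illustration, whereas a real implementation would look the category up in the Unicode Character Database.

```cpp
// Hypothetical event types; the engine's real enumeration may differ.
enum class event_type { punctuation, currency, other };

// Partial, hard-coded subset of the Unicode Sc (Symbol, Currency) category.
// This is NOT exhaustive; it only covers a few common ranges.
inline bool is_currency_symbol(char32_t c)
{
    return c == 0x0024                      // $ DOLLAR SIGN
        || (c >= 0x00A2 && c <= 0x00A5)     // CENT, POUND, CURRENCY, YEN signs
        || c == 0x0E3F                      // THAI CURRENCY SYMBOL BAHT
        || (c >= 0x20A0 && c <= 0x20BF);    // Currency Symbols block (EURO SIGN etc.)
}

// A handful of ASCII punctuation marks, enough to show the split.
inline bool is_basic_punctuation(char32_t c)
{
    switch (c)
    {
    case U'!': case U',': case U'.': case U':': case U';': case U'?':
        return true;
    default:
        return false;
    }
}

// Currency is tested first so that Sc characters never fall through
// to the generic punctuation event.
inline event_type classify(char32_t c)
{
    if (is_currency_symbol(c)) return event_type::currency;
    if (is_basic_punctuation(c)) return event_type::punctuation;
    return event_type::other;
}
```

Checking the currency category before punctuation keeps the two event types disjoint, which is the point of the proposed split.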

Each currency symbol has a singular and plural form (for n=1 and n>1 respectively, e.g. _£ = "pound", _£s = "pounds"). NOTE: Need to investigate whether other languages have other forms.
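As a rough illustration of the singular/plural selection, the sketch below uses a hypothetical `currency_name` struct and `select_form` function. It assumes the English rule only; treating n=0 as plural is an assumption, and languages with more plural categories would need a richer rule than this two-way choice.

```cpp
#include <string>

// Hypothetical pairing of the two spoken forms of a currency name.
struct currency_name
{
    std::string singular; // used when n == 1, e.g. "pound"
    std::string plural;   // used otherwise, e.g. "pounds"
};

// English-only rule: singular for exactly one, plural for everything else.
// n == 0 falling into the plural branch is an assumption.
inline const std::string &select_form(const currency_name &name, unsigned long n)
{
    return n == 1 ? name.singular : name.plural;
}
```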

NOTE: Some currency markers are ordinary letters in the Ll (Letter, lowercase) category rather than Sc (e.g. the "p" in "25p").

Currencies have a whole unit (e.g. dollars) and a corresponding fractional unit (e.g. cents). Both are used when reading fractional amounts (e.g. $2.55 is read as "two dollars and fifty-five cents").
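The composition of the two units could be sketched as follows. All names here (`unit_name`, `currency`, `read_amount`) are hypothetical. The sketch leaves the numbers as digits on the assumption that the engine's existing number reading would turn "2" into "two" later, and it assumes the amount has already been split into its whole and fractional parts.

```cpp
#include <string>

// Hypothetical description of one unit of a currency.
struct unit_name
{
    std::string singular;
    std::string plural;
};

// Hypothetical pairing of the whole unit with its fractional unit.
struct currency
{
    unit_name whole;    // e.g. dollar / dollars
    unit_name fraction; // e.g. cent / cents
};

// Formats "<n> <unit>" using the English singular/plural rule.
static std::string with_unit(unsigned long n, const unit_name &unit)
{
    return std::to_string(n) + " " + (n == 1 ? unit.singular : unit.plural);
}

// Builds the text to be spoken for an amount such as $2.55,
// passed in as whole = 2 and fraction = 55.
std::string read_amount(const currency &c, unsigned long whole, unsigned long fraction)
{
    if (fraction == 0) return with_unit(whole, c.whole);
    if (whole == 0) return with_unit(fraction, c.fraction);
    return with_unit(whole, c.whole) + " and " + with_unit(fraction, c.fraction);
}
```

Dropping the zero part (so $5.00 gives "5 dollars" and $0.62 gives "62 cents") is a design choice of this sketch, not something the issue specifies.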

The currency symbol can occur before (e.g. £6) or after (e.g. 62p) the number. Usually it is the fractional currency that occurs after, but many European countries also place the euro sign after the number (e.g. 6€).
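One way to handle both positions is to split a token around its run of digits. The sketch below is an assumption about how this might be done, with hypothetical names (`symbol_position`, `currency_token`, `split_currency`). It works on UTF-8 bytes, assumes the token contains no whitespace, and does not check that the non-digit part really is a currency marker.

```cpp
#include <cctype>
#include <string>

enum class symbol_position { none, before, after };

// Hypothetical result of splitting a token such as "62p".
struct currency_token
{
    std::string symbol;        // bytes before or after the number
    std::string number;        // digits and '.', e.g. "2.55"
    symbol_position position;  // where the symbol was found
};

currency_token split_currency(const std::string &text)
{
    auto is_num = [](unsigned char ch) { return std::isdigit(ch) != 0 || ch == '.'; };

    // Skip everything up to the first digit, then consume the number.
    std::size_t first = 0;
    while (first < text.size() && !is_num(text[first])) ++first;
    std::size_t last = first;
    while (last < text.size() && is_num(text[last])) ++last;

    const std::string prefix = text.substr(0, first);
    const std::string number = text.substr(first, last - first);
    const std::string suffix = text.substr(last);

    if (number.empty()) return { "", "", symbol_position::none };
    if (!prefix.empty()) return { prefix, number, symbol_position::before };
    if (!suffix.empty()) return { suffix, number, symbol_position::after };
    return { "", number, symbol_position::none };
}
```

Because ASCII digits never appear inside a multi-byte UTF-8 sequence, scanning bytes is enough to keep a symbol such as £ intact. When a token has markers on both sides, this sketch keeps only the leading one.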

Some texts write both the currency symbol and its name (e.g. $4 dollars). Most text-to-speech programs then read the currency name twice. The code in tts/context_analysis.cpp should detect this and avoid duplicating the currency name.
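A sketch of the duplicate check, under the assumption that the analysis step can see the word following the amount. This is not the code in tts/context_analysis.cpp; `unit_name`, `repeats_currency_name` and `read_with_following_word` are hypothetical names, and the comparison is ASCII case-insensitive only.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical pairing of the spoken forms of a currency name.
struct unit_name
{
    std::string singular;
    std::string plural;
};

// ASCII-only lowercasing, enough for names such as "Dollars".
static std::string to_lower_ascii(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// True when the word after the amount just repeats the currency name.
bool repeats_currency_name(const unit_name &name, const std::string &next_word)
{
    const std::string word = to_lower_ascii(next_word);
    return word == name.singular || word == name.plural;
}

// Reads "<n> <unit>" and appends the following word only if it is not
// a repeat of the unit that was already spoken for the symbol.
std::string read_with_following_word(const unit_name &name, unsigned long n,
                                     const std::string &next_word)
{
    std::string out = std::to_string(n) + " " + (n == 1 ? name.singular : name.plural);
    if (!next_word.empty() && !repeats_currency_name(name, next_word))
        out += " " + next_word;
    return out;
}
```

With this check, "$4 dollars" is read with the currency name once, while an unrelated following word is left in place.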
