w3c / pronunciation

Pronunciation Task Force deliverables
https://www.w3.org/WAI/APA/task-forces/pronunciation/

Roman numerals #103

Open michael-roe opened 2 years ago

michael-roe commented 2 years ago

Documents often contain numbers written in Roman numerals (e.g. V for 5). Without additional markup, a string such as "LIV" is ambiguous as to whether it is the number 54, the name LIV, or the sequence of characters L I V. Speech synthesisers currently use heuristics to guess, and often get it wrong.

The current spec isn't clear on whether data-ssml-say-as-format="cardinal" can be applied to Roman numerals, or whether we need a new format to handle them.
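
To make the ambiguity concrete, here is a minimal Python sketch of the guess a synthesiser has to make. It is not taken from any real speech engine; the names `roman_to_int` and `candidate_readings` are hypothetical, and the strict 1 to 3999 pattern is an assumption about what counts as a well-formed numeral.

```python
import re

# Assumption: only standard upper-case numerals from 1 to 3999 are accepted.
_ROMAN = re.compile(r"M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})")
_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(text):
    """Return the value of a well-formed Roman numeral, or None otherwise."""
    if not text or not _ROMAN.fullmatch(text):
        return None
    total = 0
    # Pair each symbol with its successor; the last symbol is paired with a space.
    for symbol, following in zip(text, text[1:] + " "):
        value = _VALUES[symbol]
        # A smaller symbol placed before a larger one is subtracted (IV = 4).
        if _VALUES.get(following, 0) > value:
            total -= value
        else:
            total += value
    return total


def candidate_readings(text):
    """List every plausible spoken reading of the string (hypothetical helper)."""
    readings = {"characters": " ".join(text), "word": text}
    number = roman_to_int(text)
    if number is not None:
        readings["number"] = number
    return readings


print(candidate_readings("LIV"))
```

For "LIV" all three readings are valid, so nothing in the string itself tells the synthesiser which one the author meant.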

alia11y commented 2 years ago

Thank you for your post, @michael-roe. Roman numerals are certainly tricky when it comes to pronunciation. However, we can use properties such as 'sub' or 'say-as' to address this. Perhaps we could add some examples of Roman numerals to our document.

brennanyoung commented 2 years ago

> Perhaps we could add some examples of Roman numerals to our document.

Yes please!

We keep running into this problem, although usually because we need a non-numeric interpretation of something that happens to be a Roman numeral. The most common case we struggle with is "IV" being announced as "four" when "intravenous" is what we expect.

Some examples which show how to handle both numeric and non-numeric cases of such strings would be extremely helpful.
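
As a starting point, both cases could be marked up with the 'sub' property mentioned above. The Python sketch below generates such markup. The helper name `ssml_span` is hypothetical, and the single-attribute form `data-ssml='{"sub": {"alias": ...}}'` is an assumption about the attribute syntax that should be checked against the current draft.

```python
import html
import json


def ssml_span(text, alias):
    """Wrap text in a span whose data-ssml attribute carries a spoken alias.

    Assumption: the JSON shape {"sub": {"alias": ...}} matches the draft's
    single-attribute approach; verify it against the spec before relying on it.
    """
    payload = json.dumps({"sub": {"alias": alias}})
    # The attribute is single-quoted, so escape &, <, > and any apostrophes.
    attribute = html.escape(payload, quote=False).replace("'", "&#x27;")
    return "<span data-ssml='{}'>{}</span>".format(attribute, html.escape(text))


# Non-numeric case: "IV" should be spoken as a word, not as a number.
print(ssml_span("IV", "intravenous"))

# Numeric case: the author states the intended number explicitly.
print(ssml_span("LIV", "fifty-four"))
```

Supplying the alias by hand avoids any guessing by the synthesiser, at the cost of the author spelling out every reading. Whether 'say-as' could express the numeric case directly is the open question in this issue.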