Speech-Rule-Engine / speech-rule-engine

Generating speech descriptions for XML structures
https://zorkow.github.io/speech-rule-engine/
Apache License 2.0

Space as thousands separator in numbers #724

Open limefrogyank opened 12 months ago

limefrogyank commented 12 months ago

I've done some testing and thought I would leave this issue here.

First, this is from the International System of Units spec concerning separating digits with spaces (emphasis mine):

"The practice of grouping digits in this way is a matter of choice; it is not always followed in certain specialized applications such as engineering drawings, financial statements and scripts to be read by a computer."

The International System of Units (PDF) (9th ed.). International Bureau of Weights and Measures. 2019. p. 150. ISBN 978-92-822-2272-0.

I take this to mean that the spacing has no meaning and is only to make it easier to read at a glance. Spoken numbers have natural separators like "thousand" and "million".

However, grouping the digits with the Unicode character U+2009 (thin space) causes speech-rule-engine to keep the spaces in the number, so 12345 (displayed as 12 345) is read as "twelve three hundred and forty-five" instead of "twelve thousand three hundred and forty-five". Using literal commas in place of the spaces generates the correct reading, but commas are not correct according to the SI rules.

This happens when:

Pure MathML using <mn>12\u2009345</mn> is the best because this does not generate any <mo> multiplication in MathSpeak. However, it is still not read properly in ClearSpeak.

I'm not sure what else I can try, but it seems that ideally we would have spaces between numbers in plain <mn> tags be treated the same way commas are treated.
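Until something like that exists, one possible stopgap is to strip the grouping spaces from a copy of the MathML before it is handed to the speech generator, while the displayed copy keeps them. The following is only a minimal sketch under some assumptions: `stripDigitGroupSpaces` is a hypothetical helper name and not part of speech-rule-engine, the MathML is available as a string, the thin space appears as the literal character (not as an entity such as `&#x2009;`), and each `<mn>` contains plain text only.

```typescript
// Hypothetical caller-side helper; not part of speech-rule-engine.
// Matches a plain <mn> element: optional attributes, text-only content.
const MN_PATTERN = /<mn(\s[^>]*)?>([^<]*)<\/mn>/g;

function stripDigitGroupSpaces(mathml: string): string {
  return mathml.replace(
    MN_PATTERN,
    (_match: string, attrs: string | undefined, text: string) => {
      // Remove a thin space (U+2009) only when it sits between two digits,
      // so other uses of the character inside <mn> are left alone.
      const cleaned = text.replace(/(\d)\u2009(?=\d)/g, "$1");
      return `<mn${attrs ?? ""}>${cleaned}</mn>`;
    }
  );
}

// Usage: build the speech copy from the display copy.
const display = "<math><mn>12\u2009345</mn></math>";
const forSpeech = stripDigitGroupSpaces(display);
// forSpeech is "<math><mn>12345</mn></math>"
```

Only thin spaces inside `<mn>` elements are touched, so spacing in `<mtext>` or elsewhere is preserved.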

As for a solution, I am generating the MathML directly and can explicitly add an attribute (data-number-separator = "\u2009" ??) easily. If there's a better way that's less of a bandaid, I'm happy to implement it on my side.
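To make the attribute idea concrete, here is a rough sketch of how a consumer could honor it. Everything in it is an assumption for illustration: `data-number-separator` is just the name floated above, nothing in speech-rule-engine is known to read it, and `removeDeclaredSeparator` is a hypothetical function. It removes the declared separator only from `<mn>` elements that opt in through the attribute.

```typescript
// Hypothetical attribute name taken from the suggestion above.
const SEPARATOR_ATTR = "data-number-separator";

function removeDeclaredSeparator(mathml: string): string {
  // Matches <mn ... data-number-separator="SEP" ...>text</mn> with text-only content.
  const pattern = new RegExp(
    `<mn(\\s[^>]*?${SEPARATOR_ATTR}\\s*=\\s*"([^"]*)"[^>]*)>([^<]*)</mn>`,
    "g"
  );
  return mathml.replace(
    pattern,
    (_match: string, attrs: string, sep: string, text: string) => {
      // An empty separator declares nothing to remove.
      const cleaned = sep === "" ? text : text.split(sep).join("");
      return `<mn${attrs}>${cleaned}</mn>`;
    }
  );
}
```

Because the element declares its own separator, numbers without the attribute are left exactly as written, which avoids guessing whether a space inside `<mn>` is a digit-group separator.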

limefrogyank commented 12 months ago

Relevant: https://github.com/mathjax/MathJax/issues/2772