MycroftAI / lingua-franca

Mycroft's multilingual text parsing and formatting library
Apache License 2.0

Pronounce units #50

Open danielsjf opened 4 years ago

danielsjf commented 4 years ago

Units are often abbreviated, such as '4 h' for 'four hours' or '3 kWh' for 'three kilowatt hours'. Are these kinds of conversions also on the roadmap?

It would be nice to have a function such as pronounce_unit or extract_unit.

Kaligule commented 4 years ago

I guess extract_amount would be a more accurate name, where an amount is a quantifier (4) plus a unit ('hours').
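
To make the idea concrete, here is a minimal sketch of what such an extract_amount could look like. It is not part of lingua-franca: the function name is the one proposed above, and the `UNIT_NAMES` table, the regular expression, and the return shape are all assumptions made for illustration.

```python
import re

# Hypothetical abbreviation table (singular, plural); not lingua-franca data.
UNIT_NAMES = {
    "h": ("hour", "hours"),
    "min": ("minute", "minutes"),
    "s": ("second", "seconds"),
    "km": ("kilometer", "kilometers"),
    "kWh": ("kilowatt hour", "kilowatt hours"),
}

# A number (optionally with a decimal part) followed by a run of letters.
_AMOUNT_RE = re.compile(r"(?<![\w.])(\d+(?:\.\d+)?)\s*([A-Za-z]+)\b")


def extract_amount(text):
    """Return (quantity, unit_name) for the first known 'number unit' pair.

    Returns None when no known unit abbreviation follows a number.
    """
    for match in _AMOUNT_RE.finditer(text):
        number, abbreviation = match.groups()
        if abbreviation in UNIT_NAMES:
            quantity = float(number)
            singular, plural = UNIT_NAMES[abbreviation]
            return (quantity, singular if quantity == 1 else plural)
    return None


print(extract_amount("the trip took 4 h"))
print(extract_amount("we used 3 kWh today"))
```

Abbreviations are matched case-sensitively here, since 'kWh' and 's' would otherwise collide with ordinary words more easily. Unknown words after a number (as in "4 apples") are simply skipped.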

penrods commented 4 years ago

Originally this was aimed primarily at voice, back when it lived inside Mycroft Core. You don't say "four h", you say "four hours".

However, as a stand-alone library I can imagine text communication being consumed. On the output side I have flags to indicate if the output is going to be spoken or written (e.g. "four oh three" versus "4:03"). Maybe it would be useful to support this with hints?

There are other inherent issues with typed vs spoken, for example typos. Should we attempt to handle "four hoors"? "for hours"? "four ours"?

JarbasAl commented 4 years ago

https://github.com/MycroftAI/lingua-franca/pull/2 handles this

@penrods I do not think we should handle typos. That will quickly get out of hand, and I wouldn't spend any effort on it. If someone needs it for a typed system, they can use an autocorrect library. What I am saying is that it's out of scope.

ChanceNCounter commented 4 years ago

I think we should keep in mind that, even if this library is never used outside the Mycroft community, skill developers will inevitably want to parse numeric information for pronunciation. I don't know if LF needs to handle "four kWh," though it'd be nice. I do think LF should handle "4 kWh" --> "four kilowatt hours".
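
As a rough sketch of the "4 kWh" --> "four kilowatt hours" case, the snippet below expands abbreviated units in a string. Everything in it is an assumption for illustration: `pronounce_units`, `small_number_to_words`, and the `UNIT_NAMES` table are hypothetical names, not lingua-franca functions. The tiny number speller only covers whole numbers from 0 to 99. A real implementation would presumably delegate that part to the library's own number formatting.

```python
import re

# Hypothetical abbreviation table (singular, plural); not lingua-franca data.
UNIT_NAMES = {
    "h": ("hour", "hours"),
    "km": ("kilometer", "kilometers"),
    "kWh": ("kilowatt hour", "kilowatt hours"),
}

_ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
_TENS = ["", "", "twenty", "thirty", "forty", "fifty",
         "sixty", "seventy", "eighty", "ninety"]


def small_number_to_words(n):
    """Spell out an integer in the range 0..99 (illustration only)."""
    if not 0 <= n < 100:
        raise ValueError("only 0..99 is supported in this sketch")
    if n < 20:
        return _ONES[n]
    tens, ones = divmod(n, 10)
    return _TENS[tens] if ones == 0 else f"{_TENS[tens]} {_ONES[ones]}"


def pronounce_units(text):
    """Replace 'N unit' pairs with spoken forms; leave everything else as-is."""
    def replace(match):
        number, abbreviation = int(match.group(1)), match.group(2)
        # Unknown units and numbers outside the sketch's range stay untouched.
        if abbreviation not in UNIT_NAMES or number >= 100:
            return match.group(0)
        singular, plural = UNIT_NAMES[abbreviation]
        unit = singular if number == 1 else plural
        return f"{small_number_to_words(number)} {unit}"

    # Whole numbers only; decimals such as '4.5 kWh' are deliberately skipped.
    return re.sub(r"(?<![\w.])(\d+)\s*([A-Za-z]+)\b", replace, text)


print(pronounce_units("4 kWh"))
```

Input that already spells out the number ("four kWh") is left unchanged by this sketch, which matches the distinction drawn above between the "nice to have" case and the one that should be handled.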