osmandapp / OsmAnd


TTS - special treatment of road numbers #3274

Closed gjedeer closed 6 years ago

gjedeer commented 7 years ago

It's a repost of this mailing list post: https://groups.google.com/forum/#!searchin/osmand/translation|sort:date/osmand/l6xcyqODvOs/CFm8XJhqBgAJ

I've been using OsmAnd with Polish TTS quite a lot recently. While the quality is usually good, there's one annoying feature that sounded ridiculous to bystanders.

It's how it reads "turn left into 297th" - which obviously means road #297. The Polish translation is incomprehensible ("skręć w lewo w dwieście dziewięćdziesiąty siódmy").

My question is: is it possible to distinguish road numbers in ttsconfig.p so that it reads "turn left into road number 297", like all other navigation apps do? I can contribute the translation and figure out the .p file syntax and make a pull request if that's possible.

I believe that this can be changed in assemble_street_name in https://github.com/osmandapp/OsmAnd-resources/blob/master/voice/pl/config.p#L165 or in the CommandBuilder https://github.com/osmandapp/Osmand/blob/73dcab7d0b944e54a7fb08ee0f323e9588a2dfdb/OsmAnd/src/net/osmand/plus/voice/CommandBuilder.java#L157 if the Prolog engine is not smart enough to detect a numeric string.

It seems that, at the TTS level, the TTSSpan class would handle this properly.

sonora commented 7 years ago

It sounds like a translation issue; no ordinal numbers should be used in this case. This needs to be corrected in the Polish ttsconfig.p file; the file you quote is automatically generated. Can you please reference a road in OSM so we can check how it is tagged?

vshcherb commented 7 years ago

TTSSpan is a rather new API, but it's good that it has been introduced, even though adopting it will take some effort later.

gjedeer commented 7 years ago

> Can you please reference a road in OSM so we can check how it is tagged?

There doesn't seem to be a way to trigger the navigation voice without leaving home (actually travelling that route), so give me a few days please. I'll find an example next time I travel.

> TTSSpan is rather new API

I hadn't noticed that. It requires API 21, which is way higher than what OsmAnd is aiming at :/

sonora commented 7 years ago

Enable our development plugin. Then in its settings you will find a "Test voice prompts" button. Also: You can calculate a route from anywhere to anywhere, then select "Start simulation" (in settings or at the bottom of the dashboard), then hit the route config's GO button, and OsmAnd will voice-navigate the selected route.

gjedeer commented 7 years ago

I did try the second option but it did not speak to me.

In the debug plugin, "3.4 SR 80 toward Rome" exhibits the issue. 3.5 "Route 23" sounds correct.

It says "sr osiemdziesięciu", which isn't even an ordinal number. "Osiemdziesięciu" is how you'd say "80 people" in Polish, totally inappropriate for a road number.

sonora commented 7 years ago

@vshcherb Yes, I guess something like TTSSpan may offer ways to improve this. But it may still remain very tricky, because the habits of how you would pronounce things are very region specific, and, frankly, I would not know how to correctly parameterize things generally.

Here are some examples:

So in a nutshell I think we cannot rush into implementation here because I am not sure we would have a valid strategy ready to improve things at all...

gjedeer commented 7 years ago

This is of course up to you, but I think just giving the translators the ability to treat road numbers differently would solve the majority of cases. Expanding the street name term into a tuple like (stuff_before_number, number, stuff_after_number) would:

gjedeer commented 7 years ago

But then we have regular street names containing numbers like "26 kwietnia" "1 mai" etc...
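To make the tuple idea concrete, here is a minimal Python sketch of the (stuff_before_number, number, stuff_after_number) split. The helper name `split_street_name` and the regular expression are assumptions for illustration only; nothing like this is confirmed to exist in OsmAnd's config.p or CommandBuilder. It also shows the difficulty with names such as "26 kwietnia": the split alone yields the same shape for a road ref and for a street named after a date.

```python
import re

# Hypothetical helper for illustration; not part of OsmAnd or its voice configs.
# Lazy prefix, then the first run of digits, then whatever follows.
_NUMBER = re.compile(r"^(.*?)(\d+)(.*)$")


def split_street_name(name):
    """Split a name into (before, number, after) around its first run of digits.

    Returns (name, "", "") when the name contains no digits at all.
    """
    match = _NUMBER.match(name)
    if match is None:
        return (name, "", "")
    before, number, after = match.groups()
    return (before.strip(), number, after.strip())


if __name__ == "__main__":
    for example in ("297", "SR 80 toward Rome", "26 kwietnia", "Main Street"):
        print(example, "->", split_street_name(example))
```

With this split, a translator-facing rule could choose a phrase such as "road number N" only when the parts around the number look like a road ref, but deciding that reliably would need more information than the string itself (for example, whether the value came from a ref tag or a name tag).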

sonora commented 7 years ago

Yes, I think it is not as easy as saying "before" or "after" ...

From the languages I speak, it seems that a rule could be that spoken practice prefers the "shorter" expressions, i.e. "eighty" is shorter than saying "eight - zero", but "four o five" is shorter than saying "four hundred and five".

But then, I suspect this "rule" alone will also not fix much globally in all languages ...

vshcherb commented 7 years ago

My biggest concern is the balance between the time this feature would take to implement and the benefit we would achieve. Unfortunately, I see more important issues today, and the biggest drawback is that it looks like it would require revisiting all the Prolog files.

polarbearing commented 7 years ago

As a side note, this summer I was driving in Bavaria and found "S 60" was expanded to "Staatsstraße 60". Which was correct but annoying. However the guilty component was the Google TTS engine (de), not OsmAnd. Anything fed into the TTS engine as "S nn" was expanded this way, related to a road or not.

sonora commented 7 years ago

Yes, in the medium term this issue may actually be solved by TTS engines becoming more intelligent, so there is an overlap here between functionality OsmAnd ultimately will have to provide and what TTS engines will pronounce intelligently anyhow.

gjedeer commented 7 years ago

I just wanted to point out that it's plain incomprehensible in Polish right now; it doesn't just sound odd. People who heard OsmAnd's TTS thought it went nuts and suggested taking exit #47 from a non-existent roundabout, while it was telling you to turn into road #47.

So while "four o five" freeway is a valid problem too, I don't think anyone would misunderstand the directions saying "four hundred five". It's not a matter of convenience, the TTS is unusable for regular users in Poland at this point.

sonora commented 7 years ago

I once posted an idea elsewhere that might solve many ref-related issues: insert a blank after each character in a ref string before passing it to the voice engine.

While in many languages or countries blanks after each character might work well, in others you may restrict this to only around digits if there are 3 or more in succession.

Needs investigation, but may be a valid and not too intrusive fix?
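The two variants of this idea could be sketched roughly as follows in Python. Both function names and the `min_digits` parameter are hypothetical, and this is not OsmAnd code; it only illustrates the string preprocessing, not how any particular TTS engine would pronounce the result.

```python
import re


def space_all_characters(ref):
    """Separate every non-blank character of a ref with a single blank."""
    return " ".join(ch for ch in ref if not ch.isspace())


def space_long_digit_runs(ref, min_digits=3):
    """Insert blanks only inside runs of at least `min_digits` digits.

    Shorter digit runs and all letters are left untouched.
    """
    pattern = re.compile(r"\d{%d,}" % min_digits)
    return pattern.sub(lambda m: " ".join(m.group(0)), ref)


if __name__ == "__main__":
    for ref in ("SR 80", "I 405", "A4"):
        print(repr(ref), "->", repr(space_all_characters(ref)),
              "|", repr(space_long_digit_runs(ref)))
```

The second variant leaves "SR 80" unchanged and turns "I 405" into "I 4 0 5", which matches the "four o five" style of reading. Whether either form is read acceptably would still have to be checked per language and per engine.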

gjedeer commented 7 years ago

That would not solve the original problem, possibly even making it worse (it's not how you read road numbers in Polish)

sonora commented 7 years ago

If this is an issue specific to the Polish language, I would think that it actually needs to be fixed in the TTS engine... have you tried several different ones to see if they all behave so badly?

gjedeer commented 7 years ago

I have not. I'm using a Google-free phone myself, and this limits my ability to try software from Google Play. My experiences described in this issue are from someone else's phone; the owner was kind enough to let me do some tests. I use recorded voices myself.

On December 19, 2016 5:08:57 PM CET, Hardy notifications@github.com wrote:

> If this is an issue specific to the Polish language, I would think that it actually needs to be fixed in the TTS engine... have you tried several different ones to see if they all behave so badly?


sonora commented 7 years ago

Maybe you can try to sideload different TTS engines as apk files and install them to test.

It will be very hard to do anything about this issue at all if neither of us can reproduce the issue readily, or provide systematic test results on different engines (and preferably also devices) to pinpoint the issue in the first place.

sonora commented 6 years ago

Similar to #4217, hard to do something generally useful for now.