M4rtinK / modrana

ModRana is a flexible GPS navigation system for mobile devices. This is the main upstream modRana source code repository - waiting for your pull requests & patches! :)
www.modrana.org
GNU General Public License v3.0

Turn-by-turn voice output on Sailfish OS #200

Open M4rtinK opened 7 years ago

M4rtinK commented 7 years ago

While voice sample based output for navigation messages would be nice to have & the voice samples produced for Marble should be usable, there are some issues:

So let's go with text to speech, at least for now. :)

What needs to be done:

bonanza123 commented 7 years ago

Is there any progress (or rough ETA) regarding the TTS based turn-by-turn navigation?

M4rtinK commented 7 years ago

I'll be traveling for the next three weeks (and I of course plan to use modRana during my travels :) ), so likely not sooner than that. On the other hand, quite a few of the building blocks are already in place, and if espeak still works correctly it should not be that hard to get it all working once I'm back. :)

rinigus commented 7 years ago

@M4rtinK, I worked on turn-by-turn voice prompts for Poor Maps and it turned out to be a rather large patch (still waiting for review and merge in Poor Maps). I am sure we can reuse large sections of the code here as well. In particular, it supports several TTS engines (mimic and picotts are much better than espeak), generation of voice commands in advance, and playing them when needed.

To work on this, you would probably need to enable navigation layout first. Then the voice can be integrated into it as well. At least, that's the way I implemented it for Poor Maps.

M4rtinK commented 7 years ago

> @M4rtinK, I worked on turn-by-turn voice prompts for Poor Maps and it turned out to be a rather large patch (still waiting for review and merge in Poor Maps). I am sure we can reuse large sections of the code here as well. In particular, it supports several TTS engines (mimic and picotts are much better than espeak), generation of voice commands in advance, and playing them when needed.

That sounds very cool & much more advanced than the simple espeak based TTS support in modRana! :) I'm definitely interested in integrating this into modRana as well. :)

Also thanks a lot for all the work on porting these nice TTS tools. I guess the quality of espeak output, and not actually being able to use it on Sailfish OS without playing stuff over GStreamer (it was like that last time I checked), is likely one of the reasons I have not worked on this earlier, even though espeak based TTS has been present on desktop & the N900 for quite some time already.

> To work on this, you would probably need to enable navigation layout first. Then the voice can be integrated into it as well. At least, that's the way I implemented it for Poor Maps.

I did some preliminary work on the navigation overlay support in the Qt 5 GUI - that's basically the only missing piece for rudimentary turn by turn navigation on Sailfish OS with modRana as the rest (direction generation and triggering) has been part of the multi-platform application core for ages. So I'll try to get that working as a priority so that we can continue forward from that. :)

M4rtinK commented 6 years ago

So there is now some placeholder TTS support via espeak, which I've added together with the navigation overlay. Next is making use of the advanced TTS handling code rinigus created for Poor Maps. :)

rinigus commented 6 years ago

Great! Osmo is working on polishing the voice support for Poor Maps under the corresponding PR in its source tree. I would expect the code to improve after he has worked on it.

TTS, by itself, is supported via voice.py (https://github.com/rinigus/poor-maps/blob/voice/poor/voice.py), which is a wrapper around mimic, picoTTS, flite, and espeak. In this implementation, VoiceDirection generates commands into WAV files stored in /tmp and removes the files when they are no longer needed. To support voices that are computationally intensive, the voice prompt generation is done in a separate thread. It's expected that you would call the voice generator in advance (pre-generate a few upcoming commands before they are due) and fetch each one (as a filename) when you need it.
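
To make the pre-generation idea concrete, here is a minimal sketch of the pattern described above: prompts are queued ahead of time, rendered to WAV files by a worker thread, and later fetched by filename. This is not the actual voice.py API. The names `PromptCache`, `make`, `get`, `wait`, `remove`, `close` and `silent_wav_synth` are all hypothetical, and the synthesizer here just writes a short silent WAV as a stand-in for a real engine such as mimic, picoTTS, flite or espeak. Files go to the system temporary directory (usually /tmp).

```python
import os
import queue
import shutil
import tempfile
import threading
import wave


def silent_wav_synth(text, path):
    """Stand-in for a real TTS engine (assumption): writes 0.1 s of silence."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)          # 16-bit samples
        wav.setframerate(16000)
        wav.writeframes(b"\x00\x00" * 1600)


class PromptCache:
    """Pre-generates voice prompts in a worker thread (hypothetical API)."""

    def __init__(self, synth=silent_wav_synth):
        self._synth = synth
        self._dir = tempfile.mkdtemp(prefix="voice-")
        self._files = {}             # text -> WAV path, or None while pending
        self._lock = threading.Lock()
        self._tasks = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        count = 0
        while True:
            text = self._tasks.get()
            try:
                if text is None:     # sentinel: stop the worker
                    return
                path = os.path.join(self._dir, "prompt-%d.wav" % count)
                count += 1
                self._synth(text, path)
                with self._lock:
                    if text in self._files:
                        self._files[text] = path
                    else:            # prompt was removed while rendering
                        os.remove(path)
            except Exception:
                with self._lock:     # synthesis failed: forget the request
                    self._files.pop(text, None)
            finally:
                self._tasks.task_done()

    def make(self, text):
        """Queue a prompt for generation and return immediately."""
        with self._lock:
            if text in self._files:
                return
            self._files[text] = None
        self._tasks.put(text)

    def get(self, text):
        """Return the WAV filename if the prompt is ready, else None."""
        with self._lock:
            return self._files.get(text)

    def wait(self):
        """Block until every queued prompt has been processed."""
        self._tasks.join()

    def remove(self, text):
        """Drop a prompt that is no longer needed and delete its file."""
        with self._lock:
            path = self._files.pop(text, None)
        if path and os.path.exists(path):
            os.remove(path)

    def close(self):
        """Stop the worker and delete all generated files."""
        self._tasks.put(None)
        self._worker.join()
        shutil.rmtree(self._dir, ignore_errors=True)
```

In a navigation loop one would call `make()` for the next few maneuvers as soon as the route is known, call `get()` when a maneuver is due (skipping the prompt if it returns None), and call `remove()` once the maneuver has been passed. A real synthesizer could be plugged in through the `synth` argument, for example a function that runs a TTS command line tool via `subprocess`.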