google / aiyprojects-raspbian

API libraries, samples, and system images for AIY Projects (Voice Kit and Vision Kit)
https://aiyprojects.withgoogle.com/
Apache License 2.0

Make TTS Match Assistant Quality/Tone #562

Open LizMyers opened 5 years ago

LizMyers commented 5 years ago

The current TTS voice is very muffled and sometimes difficult to understand. For further context, please see this 30-second demo.

I'd like to be able to use the TTS API with the same audio quality as the Assistant or the other gRPC demos included with the AIY Projects image.

BTW - I find it slightly confusing that the latest image includes demos with different agents and mixed audio quality. Sometimes I hear a US male voice, sometimes a UK female voice, and when using TTS, the muffled robotic voice.

Thanks in advance for your time and attention.

PS: The full project is currently featured on Hackster.io - so if it's possible to improve the audio quality (programmatically) - I'm eager to do it ASAP.

dmitriykovalev commented 5 years ago

We currently use the Pico TTS engine. It runs on-device and doesn't require an internet connection. There are other on-device TTS Python libraries available.

As an option, you can try the Google Cloud Text-to-Speech API, which requires an internet connection. We are going to provide a Python code example with the next SD card image release.

LizMyers commented 5 years ago

Thanks Dmitriy - I'm looking forward to it. My project works with Google Cloud, so it requires internet anyway. Does cloud TTS suffer from latency problems? Why the emphasis on on-device TTS?

dmitriykovalev commented 5 years ago

@LizMyers There are no latency issues; it's up to you which TTS engine to use. If you are already using an internet connection, then the cloud option is perfectly fine!

olafthiele commented 5 years ago

Is there any Python code already available for using the cloud TTS? I couldn't find any.
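
Until an official example ships, here is a minimal sketch of what calling the Google Cloud Text-to-Speech REST endpoint from Python could look like, using only the standard library. It is not the AIY team's code. It assumes the Text-to-Speech API is enabled for your Cloud project and that you authenticate with an API key passed as the `key` query parameter; the helper names (`build_request`, `decode_audio`, `synthesize_to_file`) and the default output path are made up for illustration.

```python
import base64
import json
import urllib.parse
import urllib.request

# Public REST endpoint of the Cloud Text-to-Speech v1 API.
TTS_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_request(text, api_key, language_code="en-US"):
    """Build (but do not send) a text:synthesize request.

    Assumption: authentication uses an API key in the `key` query parameter.
    """
    body = {
        "input": {"text": text},
        "voice": {"languageCode": language_code},
        # LINEAR16 is returned as 16-bit PCM with a WAV header.
        "audioConfig": {"audioEncoding": "LINEAR16"},
    }
    url = TTS_URL + "?" + urllib.parse.urlencode({"key": api_key})
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decode_audio(response_json):
    """Extract raw audio bytes from the API's JSON response.

    The response carries the audio as base64 in the `audioContent` field.
    """
    return base64.b64decode(response_json["audioContent"])


def synthesize_to_file(text, api_key, path="speech.wav"):
    """Send the request over the network and write the audio to `path`."""
    with urllib.request.urlopen(build_request(text, api_key)) as resp:
        audio = decode_audio(json.load(resp))
    with open(path, "wb") as f:
        f.write(audio)
    return path
```

Calling `synthesize_to_file("Hello from the Voice Kit", "YOUR_API_KEY")` should leave a WAV file that can be played on the Pi with `aplay speech.wav`. A specific voice can be requested by adding a `"name"` entry to the `"voice"` object; the available voice names are listed in the Cloud Text-to-Speech documentation.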