DigitalPhonetics / IMS-Toucan

Controllable and fast Text-to-Speech for over 7000 languages!
Apache License 2.0

How to improve unnatural German voice? #157

Closed by TechInterMezzo 1 year ago

TechInterMezzo commented 1 year ago

I tried run_interactive_demo.py with LANGUAGE set to de, and it sounds like any old TTS: a bit robotic, monotone, and unnatural. Can this be improved with further training? Should I look for a better German dataset? Should I train from scratch or finetune the existing model?

Flux9665 commented 1 year ago

The biggest problem right now is the data. For English we have a couple of good datasets, but for German, all of the open datasets are somewhat flawed. Depending on the speaker embedding used, however, you can get pretty decent German speech out of the model.

If you want to train something yourself, I give the same recommendation as always: if you have more than 5 hours of high-quality data, train from scratch. If you have less than that, finetune from the pretrained model.
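
To apply the 5-hour rule of thumb to a concrete dataset, something like the following might help. It is only a rough sketch and not part of IMS-Toucan: the function names `total_hours` and `recommend_strategy` are made up for illustration, and it assumes the recordings are uncompressed PCM .wav files that Python's standard `wave` module can read. It also only measures quantity; whether the data is high quality still has to be judged by ear.

```python
import wave
from pathlib import Path

# Threshold taken from the recommendation above (hours of high-quality data).
HOURS_THRESHOLD = 5.0


def total_hours(wav_dir):
    """Sum the duration of all .wav files under wav_dir, in hours.

    Assumption: files are PCM WAV, which the stdlib wave module can open.
    """
    seconds = 0.0
    for path in Path(wav_dir).rglob("*.wav"):
        with wave.open(str(path), "rb") as wav:
            # duration = number of frames / frames per second
            seconds += wav.getnframes() / wav.getframerate()
    return seconds / 3600.0


def recommend_strategy(hours, threshold=HOURS_THRESHOLD):
    """Map an amount of data (in hours) to the suggested training strategy.

    Exactly `threshold` hours is treated as "not more than", i.e. finetune;
    the recommendation itself does not cover that edge case.
    """
    if hours > threshold:
        return "train from scratch"
    return "finetune from the pretrained model"
```

For example, `recommend_strategy(total_hours("path/to/german_corpus"))` would return one of the two strategies as a string, where `path/to/german_corpus` is a placeholder for wherever the recordings live.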