coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
http://coqui.ai
Mozilla Public License 2.0
33.25k stars 4.02k forks

Support for zero-shot multi-speaker TTS? #1352

Closed captanlevi closed 2 years ago

captanlevi commented 2 years ago

🚀 Feature Description

First of all, thank you for all the great work. I am looking to fine-tune a TTS model for Chinese-accented English. I have collected a dataset of 80 hours of speech with annotations from 8 to 10 speakers. I want to fine-tune a zero-shot TTS model (end-to-end or not), but I cannot find anything in the documentation on how to do so.

Solution

I would really appreciate it if someone could point me in the right direction to achieve this with this repo.

PS - it's a bit urgent

erogol commented 2 years ago

You can mostly follow the fine-tuning guide: https://tts.readthedocs.io/en/latest/finetuning.html