mozilla / TTS

Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Mozilla Public License 2.0

Readme shows how to get program, but not which models correspond to advertisement at top #738

Status: Closed (deliciouslytyped closed 2 years ago)

deliciouslytyped commented 2 years ago

I'm interested in using TTS for voice synthesis for reading text.

Maybe I just can't find my way around here. The readme gives a clear and concise way to install the text-to-speech library/program, but I can't find anything that says which of the well-performing models advertised at the top of the readme correspond to the models the program lists:

$ tts --list_models
 Name format: type/language/dataset/model
 1: tts_models/en/ek1/tacotron2
 2: tts_models/en/ljspeech/tacotron2-DDC
 3: tts_models/en/ljspeech/tacotron2-DDC_ph
 4: tts_models/en/ljspeech/glow-tts
 5: tts_models/en/ljspeech/speedy-speech
 6: tts_models/en/ljspeech/tacotron2-DCA
 7: tts_models/en/ljspeech/vits
 8: tts_models/en/ljspeech/fast_pitch
 9: tts_models/en/vctk/sc-glow-tts
 10: tts_models/en/vctk/vits
 11: tts_models/en/vctk/fast_pitch
 12: tts_models/en/sam/tacotron-DDC
 13: tts_models/es/mai/tacotron2-DDC
 14: tts_models/fr/mai/tacotron2-DDC
 15: tts_models/uk/mai/glow-tts
 16: tts_models/zh-CN/baker/tacotron2-DDC-GST
 17: tts_models/nl/mai/tacotron2-DDC
 18: tts_models/de/thorsten/tacotron2-DCA
 19: tts_models/ja/kokoro/tacotron2-DDC
 20: vocoder_models/universal/libri-tts/wavegrad
 21: vocoder_models/universal/libri-tts/fullband-melgan
 22: vocoder_models/en/ek1/wavegrad
 23: vocoder_models/en/ljspeech/multiband-melgan
 24: vocoder_models/en/ljspeech/hifigan_v2
 25: vocoder_models/en/ljspeech/univnet
 26: vocoder_models/en/vctk/hifigan_v2
 27: vocoder_models/en/sam/hifigan_v2
 28: vocoder_models/nl/mai/parallel-wavegan
 29: vocoder_models/de/thorsten/wavegrad
 30: vocoder_models/de/thorsten/fullband-melgan
 31: vocoder_models/ja/kokoro/hifigan_v1
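
The listing follows the `type/language/dataset/model` format printed on its first line, so it can be split mechanically. Below is a minimal Python sketch that parses a few of the lines above and groups TTS models and vocoders by language and dataset. The names `ModelName`, `parse_listing`, and `pair_by_dataset` are hypothetical helpers written for this illustration, not part of the TTS package, and grouping by matching dataset is only an assumption about which pairs might go together. It says nothing about which model sounds best.

```python
from collections import defaultdict
from typing import NamedTuple

KINDS = ("tts_models", "vocoder_models")

# A few lines copied from the `tts --list_models` output above.
SAMPLE = """\
 Name format: type/language/dataset/model
 7: tts_models/en/ljspeech/vits
 10: tts_models/en/vctk/vits
 21: vocoder_models/universal/libri-tts/fullband-melgan
 24: vocoder_models/en/ljspeech/hifigan_v2
 26: vocoder_models/en/vctk/hifigan_v2
"""


class ModelName(NamedTuple):
    kind: str      # "tts_models" or "vocoder_models"
    language: str
    dataset: str
    model: str


def parse_listing(text):
    """Parse lines such as ' 7: tts_models/en/ljspeech/vits'.

    Lines that do not contain a four-part name starting with a known
    kind (for example the 'Name format' header) are skipped.
    """
    names = []
    for line in text.splitlines():
        _, sep, name = line.partition(":")
        parts = name.strip().split("/")
        if sep and len(parts) == 4 and parts[0] in KINDS:
            names.append(ModelName(*parts))
    return names


def pair_by_dataset(names):
    """Group model names by (language, dataset), split by kind.

    Assumption: a vocoder trained on the same dataset is a plausible
    partner for a TTS model. This is a heuristic, not documented guidance.
    """
    groups = defaultdict(lambda: {kind: [] for kind in KINDS})
    for n in names:
        groups[(n.language, n.dataset)][n.kind].append(n.model)
    return dict(groups)


if __name__ == "__main__":
    for key, models in pair_by_dataset(parse_listing(SAMPLE)).items():
        print(key, models)
```

Feeding the full listing through `parse_listing` instead of `SAMPLE` should work the same way, since every entry uses the same four-part format.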

These names differ from the ones shown in the graph at the top of the readme.

TL;DR: how do I get the best-sounding model? (WaveNet?)

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our Discourse page for further help: https://discourse.mozilla.org/c/tts