MycroftAI / mimic2

Text to Speech engine based on the Tacotron architecture, initially implemented by Keith Ito.
Apache License 2.0

Use Keithito's or Zamia's models? #40

Open dpny518 opened 5 years ago

dpny518 commented 5 years ago

Do you have any pretrained models? Besides that one, can we use Keith Ito's models?

http://data.keithito.com/data/speech/tacotron-20180906.tar.gz

https://drive.google.com/file/d/1c_O-Gha03_erKbilsFCvs9QJ8faJ7ou8/view?usp=sharing

Or Zamia's models? https://goofy.zamia.org/zamia-speech/tts/tacotron/

NightMachinery commented 5 years ago

I looked at the open-source TTS solutions about a month ago, and I did not find any that had a Getting Started guide with (good) pretrained models. I ended up buying Voice Dream.

dpny518 commented 5 years ago

This one is really good: it has a fast API and pretrained models from NVIDIA that work and sound good.

https://github.com/Verssae/flask-tacotron2-tts-web-app

NightMachinery commented 5 years ago

@yondu22 Thanks! Two points: Does it work without a GPU? What’s the bash code that gets some text and outputs audio? The README only hints at a human web interface.

The quality of samples at https://nv-adlr.github.io/WaveGlow is fascinating!

dpny518 commented 5 years ago
  1. It is an API: https://github.com/Verssae/flask-tacotron2-tts-web-app/blob/master/app.py#L21 You can modify that or use console_test.py, but app.py is better because it keeps the model loaded in memory. Basically, run python app.py and then write your own bash script that sends a POST request with the text to it.
  2. Does it work without a GPU? No. You need to replace the corresponding files in that repo with the files from this one: https://github.com/shoegazerstella/tacotron2_cpu
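
For the "send a POST with text" step, here is a rough sketch of a client, written in Python rather than bash so it needs only the standard library. Everything about the server interface here is an assumption, not something taken from that repo: the URL `http://localhost:5000/` (Flask's default port), the form field name `text`, and the idea that the response body is the audio itself. Check app.py for the real route, field name, and response format before relying on it.

```python
import urllib.parse
import urllib.request

# Assumptions (hypothetical, not read from the repository): the Flask app
# listens on localhost:5000, accepts a form-encoded POST at "/", reads the
# input from a field named "text", and replies with raw audio bytes.
TTS_URL = "http://localhost:5000/"
TEXT_FIELD = "text"


def build_request(text, url=TTS_URL, field=TEXT_FIELD):
    """Build a form-encoded POST request carrying the text to synthesize."""
    body = urllib.parse.urlencode({field: text}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def synthesize(text, out_path="out.wav"):
    """Send the text to the running server and save whatever it returns."""
    with urllib.request.urlopen(build_request(text)) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

With the server already started via python app.py, calling `synthesize("Hello world")` would write the response to out.wav, assuming the guesses above match the actual app. If the server returns a path or an HTML page instead of audio, the saving step would need to change accordingly.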