mozilla / TTS

:robot: :speech_balloon: Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Mozilla Public License 2.0

High Quality Audio Samples? #6

Closed PetrochukM closed 6 years ago

PetrochukM commented 6 years ago

Do you have audio samples of similar quality to: https://google.github.io/tacotron/publications/tacotron/index.html

Thanks!

erogol commented 6 years ago

In progress. I had the model but removed it accidentally, so I am retraining it. After that, I plan to release some samples.

PetrochukM commented 6 years ago

@erogol Thank you!

claytonblythe commented 6 years ago

I am interested in this as well!

erogol commented 6 years ago

Here is a simple example: https://soundcloud.com/user-565970875/tts-ljspeech-val-13585 It is generated from a text in the validation set, so it is identical to the test-time experience. It is also from iteration 13585, so training is not finished yet. In general, other parties share their results after >100K iterations.

Hope that gives an initial proof of work.

The model is trained on the branch 'normal-attention+masked-loss'.

claytonblythe commented 6 years ago

Great, thank you for adding the sample! Great work. This is pretty decent. I'm excited to follow this repo and contribute if I can find time/alignment with work.

erogol commented 6 years ago

One more here, after more training:

https://soundcloud.com/user-565970875/tts-ljspeech-val-35000

erogol commented 6 years ago

I added things to the README, so I am closing this.

PetrochukM commented 6 years ago

@erogol Hi there! Was this generated with a WaveNet vocoder? I'm assuming not.

erogol commented 6 years ago

@PetrochukM No, it is Griffin-Lim (if I wrote it right) :)
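For readers unfamiliar with the term: Griffin-Lim recovers a waveform from a magnitude spectrogram by starting from random phase and alternating between inverse and forward STFTs, keeping the known magnitudes each round. Below is a minimal NumPy sketch of that idea. It is not the code used in this repository; the function names, `n_fft`, `hop`, and `n_iter` values are all assumptions chosen for illustration.

```python
import numpy as np


def stft(signal, n_fft=256, hop=64):
    """Forward STFT with a Hann window; returns shape (n_frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def istft(spec, n_fft=256, hop=64):
    """Least-squares inverse STFT: windowed overlap-add divided by summed window energy."""
    window = np.hanning(n_fft)
    n_frames = spec.shape[0]
    length = n_fft + hop * (n_frames - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    for i in range(n_frames):
        out[i * hop : i * hop + n_fft] += frames[i] * window
        norm[i * hop : i * hop + n_fft] += window ** 2
    # Guard against division by zero where the window is zero (signal edges).
    return out / np.maximum(norm, 1e-8)


def griffin_lim(magnitude, n_iter=50, n_fft=256, hop=64, seed=0):
    """Estimate a waveform whose STFT magnitude approximates `magnitude`."""
    rng = np.random.default_rng(seed)
    # Start from uniformly random phase (hypothetical initialization choice).
    angles = np.exp(2j * np.pi * rng.random(magnitude.shape))
    for _ in range(n_iter):
        signal = istft(magnitude * angles, n_fft, hop)
        rebuilt = stft(signal, n_fft, hop)
        # Keep only the phase of the re-analyzed signal; magnitudes stay fixed.
        angles = np.exp(1j * np.angle(rebuilt))
    return istft(magnitude * angles, n_fft, hop)
```

In a TTS pipeline the `magnitude` array would come from the model's predicted linear spectrogram. Because the phase is only estimated, the output tends to sound more artificial than a neural vocoder such as WaveNet, which is the difference the question above is getting at.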