TensorSpeech / TensorFlowTTS

:stuck_out_tongue_closed_eyes: TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supported languages include English, French, Korean, Chinese, and German; easy to adapt to other languages)
https://tensorspeech.github.io/TensorFlowTTS/
Apache License 2.0
3.84k stars 815 forks

Multi-speaker audio samples for FastSpeech/FastSpeech 2 #250

ming024 closed 4 years ago

ming024 commented 4 years ago

This is a really astonishingly large project.

I have seen that there is multi-speaker support in the preprocessing scripts and model configs. It would be great if anyone could share multi-speaker audio samples generated with the non-autoregressive models such as FastSpeech and FastSpeech2.

dathudeptrai commented 4 years ago

multi-lingual-multi-speaker.zip

This is a result from my model, trained on part of the M-AILABS dataset (whose quality is not good) together with other datasets. In other words, this audio was generated by a model trained on a multilingual, multi-speaker dataset. English is not the main language, so if you train the model on English only, the results should be better. The model was modified slightly to work with multiple languages. I also tested the public code here for multi-speaker training, and it works fine.

dathudeptrai commented 4 years ago

@ming024 if you don't have any more questions, please close the issue :D

ming024 commented 4 years ago

I think the audio quality is very good. How many speakers are there in your datasets?

dathudeptrai commented 4 years ago

> I think the audio quality is very good. How many speakers are there in your datasets?

Sorry, that is private information :D.

ZDisket commented 4 years ago

I have some too, from an MFA-aligned FS2 model. These are samples from 11 speakers. The dataset totals 18 hours of audio distributed over 136 speakers (the distribution is not uniform: some speakers have as little as 20 seconds, while others have 50 minutes). 11samples-50ks.zip

ming024 commented 4 years ago

@dathudeptrai @ZDisket Thanks a lot. I will close this issue (#250).