Open LEEYOONHYUNG opened 3 years ago
@LEEYOONHYUNG Oh, I just forgot to post the audio samples. I'll update the demo page some other day. Honestly speaking, the quality of the synthesized LibriTTS samples is not as good as the results on the single-speaker dataset. I guess it is because the environmental noise in the LibriTTS dataset is much more severe than in the LJSpeech dataset. It might be a good idea to apply some data cleaning tricks before training the TTS model.
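For anyone trying this, a minimal sketch of what such cleaning might look like: trimming leading/trailing silence with a simple RMS gate and dropping clips whose crude SNR estimate falls below a cutoff. This is not part of the repo; the helper names, thresholds, and the percentile-based noise-floor estimate are all assumptions for illustration, and real pipelines often use `librosa.effects.trim` or a proper VAD instead.

```python
import numpy as np

def trim_silence(wav, threshold=0.01, frame_len=1024):
    """Drop leading/trailing frames whose RMS is below a fixed gate.

    `threshold` and `frame_len` are illustrative values, not tuned.
    """
    frames = [wav[i:i + frame_len] for i in range(0, len(wav), frame_len)]
    rms = np.array([np.sqrt(np.mean(f ** 2)) for f in frames])
    keep = np.where(rms > threshold)[0]
    if len(keep) == 0:          # entirely silent clip
        return wav[:0]
    start = keep[0] * frame_len
    end = min((keep[-1] + 1) * frame_len, len(wav))
    return wav[start:end]

def estimate_snr_db(wav, frame_len=1024, noise_percentile=10):
    """Crude SNR proxy: treat the quietest frames as the noise floor."""
    frames = [wav[i:i + frame_len]
              for i in range(0, len(wav) - frame_len + 1, frame_len)]
    energies = np.array([np.mean(f ** 2) for f in frames]) + 1e-12
    noise_floor = np.percentile(energies, noise_percentile)
    return 10.0 * np.log10(np.mean(energies) / noise_floor)
```

A corpus pass would then keep only clips with, say, `estimate_snr_db(wav) > 20` after trimming; the cutoff would need tuning per dataset.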
I think it is quite natural that learning multi-speaker TTS is more difficult. Thank you for your reply :D
@ming024 any chance you could post a pretrained model for the multi-speaker English dataset, LibriTTS?
Great work with this repo, and thanks in advance!
Hi, my name is Yoonhyung Lee, and I am studying text-to-speech. Thank you for your nice implementation of FastSpeech2. It helped me a lot in studying it, but a question occurred to me.
According to the README.md, it seems that you have trained FastSpeech2 on the LibriTTS dataset, but I cannot see the audio samples. Did you use all 585 hours of the dataset for training? How well does FastSpeech2 work on a multi-speaker dataset?