keonlee9420 / StyleSpeech

PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation
MIT License
190 stars, 23 forks

What is the performance compared with AdaSpeech? #2

Closed Liujingxiu23 closed 3 years ago

Liujingxiu23 commented 3 years ago

Thank you for your great work and for sharing it. Your approach looks different from AdaSpeech and NAUTILUS. You use GANs, which I have not seen in other papers on adaptive TTS. Have you compared this method with AdaSpeech 1/2? How do the MOS and similarity compare?

keonlee9420 commented 3 years ago

Hi @Liujingxiu23, thank you for your interest. Unfortunately, I haven't compared this model with AdaSpeech 1/2. If you get a chance to do so, please share the results here. They would be helpful for others as well.

Liujingxiu23 commented 3 years ago

@keonlee9420 Thank you for your reply! AdaSpeech 2 follows the structure of NAUTILUS, which is friendly to untranscribed speech data. The results are fluent and of relatively high quality, but the similarity is not very high.

keonlee9420 commented 3 years ago

I see. Thanks for sharing. What about AdaSpeech 3? Do you have any insight into how it compares? I plan to implement it soon and want to see how it differs from its predecessors and other TTS models.

Liujingxiu23 commented 3 years ago

@keonlee9420 I have not run any experiments with AdaSpeech 3.

keonlee9420 commented 3 years ago

Got it. It would be interesting to compare it afterwards, too.

keonlee9420 commented 3 years ago

Closing due to inactivity.

Pydataman commented 2 years ago

> @keonlee9420 Thank you for your reply! AdaSpeech 2 follows the structure of NAUTILUS, which is friendly to untranscribed speech data. The results are fluent and of relatively high quality, but the similarity is not very high.

@Liujingxiu23 Did you test on Chinese (zh)? How many speakers were in your data? I used my own data with 2 speakers, but the results were bad. Do you have WeChat (Weixin)?

Pydataman commented 2 years ago

@Liujingxiu23 Hi, how can I contact you?

Liujingxiu23 commented 2 years ago

@Pydataman 2 speakers are too few. I guess 500~2000+ would be more suitable.

Pydataman commented 2 years ago

> @Pydataman 2 speakers are too few. I guess 500~2000+ would be more suitable.

@Liujingxiu23 How many speakers did you test with?