winddori2002 / TriAAN-VC

TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion
MIT License
129 stars 12 forks

Mixed datasets and calculate the threshold value #12

Closed Blakey-Gavin closed 1 year ago

Blakey-Gavin commented 1 year ago

What may happen if I mix the VCTK dataset with datasets in other languages? For example, I am using a dataset that mixes VCTK with Chinese speech. How should I calculate the test threshold value in "config/base.yaml" in that case?

winddori2002 commented 1 year ago

Hi,

In that case, it would be better to recalculate the threshold (see #9). However, the voice encoder (speaker verification) was not trained on a Chinese dataset, so I guess it may not work well.

Thanks.
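
For readers wondering what recalculating such a threshold can look like: a common choice for a speaker verification threshold is the equal error rate (EER) point, where the false accept rate and the false reject rate are closest. The sketch below assumes that approach; the thread does not confirm that this is how the repository or #9 computes the value. The function name `eer_threshold` and the inputs are hypothetical: `scores` are similarity scores for pairs of utterances (for example, cosine similarity between speaker embeddings), and `labels` mark whether each pair comes from the same speaker. Both classes must be present in `labels`.

```python
import numpy as np

def eer_threshold(scores, labels):
    """Return (threshold, eer) where false accept and false reject rates are closest.

    Hypothetical helper, not part of TriAAN-VC.
    scores: similarity score per utterance pair (higher = more similar).
    labels: 1/True for same-speaker pairs, 0/False for different-speaker pairs.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best_thr, best_gap, best_eer = None, np.inf, None
    # Try every observed score as a candidate threshold.
    for thr in np.unique(scores):
        accept = scores >= thr
        far = np.mean(accept[~labels])   # different-speaker pairs accepted
        frr = np.mean(~accept[labels])   # same-speaker pairs rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_thr, best_gap, best_eer = thr, gap, (far + frr) / 2
    return float(best_thr), float(best_eer)

# Toy example with made-up scores, only to show the call.
thr, eer = eer_threshold([0.9, 0.6, 0.5, 0.7, 0.3, 0.2], [1, 1, 1, 0, 0, 0])
```

With a mixed VCTK and Chinese dataset, the scores would need to come from pairs drawn from that mixed test set, and the caveat above still applies: if the voice encoder has not seen Chinese speech, the scores themselves may be unreliable.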

Blakey-Gavin commented 1 year ago

Ok, thanks for your reply.

So do I need to retrain the CPC and the vocoder (Parallel WaveGAN) parts?

winddori2002 commented 1 year ago

Yes. Different language datasets will affect the model performance a lot.