auspicious3000 / autovc

AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
https://arxiv.org/abs/1905.05879
MIT License
1.01k stars, 207 forks

How to reproduce the result on VCTK dataset? #71

Open liangshuang1993 opened 3 years ago

liangshuang1993 commented 3 years ago

I ran make_spect.py and make_metadata.py to preprocess the dataset (I used all speakers in VCTK). I then used the pretrained Speaker Encoder model to extract speaker embeddings and trained the model. The final loss is about 0.03. Has anyone reproduced the result successfully? Could you help me? Thanks!

ghost commented 3 years ago

I have spent days trying to reproduce the results, but all I get is silence, no voice.