auspicious3000 / autovc

AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
https://arxiv.org/abs/1905.05879
MIT License
983 stars, 207 forks

confusion about model update #55

Open jayzhu02 opened 3 years ago

jayzhu02 commented 3 years ago

Hello! I was quite excited after reading your paper and running your pretrained model. Now I want to train on my own dataset, and I have successfully generated train.pkl and the spectrograms. But from reading the code, it seems that training never saves the new model. Also, conversion.ipynb loads autovc.ckpt, which does not seem to be generated by the PyTorch code. I'm new to PyTorch, so I hope you can give me some guidance on how to train and test on my own dataset. Thanks!

ngulya commented 3 years ago

@zj19980122 Insert this line as the last line of solver_encoder.py: torch.save({'model': self.G.state_dict(), 'optimizer': self.g_optimizer.state_dict()}, '1autovc.ckpt')
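
A rough sketch of the save/load round trip that this one-liner implies, for illustration only and not taken from the repository. The helper names save_checkpoint and load_checkpoint, the toy nn.Linear standing in for self.G, and the file name autovc_demo.ckpt are all hypothetical; only the 'model' and 'optimizer' dictionary keys come from the line above.

```python
import os
import tempfile

import torch
import torch.nn as nn


def save_checkpoint(model, optimizer, path):
    """Write model and optimizer state under the keys used in the line above."""
    torch.save({'model': model.state_dict(),
                'optimizer': optimizer.state_dict()}, path)


def load_checkpoint(model, path, optimizer=None, device='cpu'):
    """Restore a checkpoint written by save_checkpoint into an existing model."""
    ckpt = torch.load(path, map_location=device)
    model.load_state_dict(ckpt['model'])
    # The optimizer state only matters when resuming training.
    if optimizer is not None and 'optimizer' in ckpt:
        optimizer.load_state_dict(ckpt['optimizer'])
    return model


# Toy stand-in for the generator (hypothetical; the real model is self.G).
net = nn.Linear(4, 2)
opt = torch.optim.Adam(net.parameters(), lr=1e-4)

ckpt_path = os.path.join(tempfile.mkdtemp(), 'autovc_demo.ckpt')
save_checkpoint(net, opt, ckpt_path)

# A freshly constructed model picks up the saved weights.
restored = load_checkpoint(nn.Linear(4, 2), ckpt_path)
```

In a real training loop it is probably safer to call something like save_checkpoint every few thousand steps rather than only once at the end, so an interrupted run still leaves a usable file. Whether conversion.ipynb reads the weights from the 'model' key is worth checking before relying on this layout.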

jayzhu02 commented 3 years ago

Thanks a lot! I've already figured out how to build my model. But I still have a question: when training on my own dataset (40 randomly chosen speakers from VCTK), loss_id and loss_id_psnt keep fluctuating between 0.01 and 0.02 after 15.5k steps, and the output quality is poor. I'm wondering whether it still needs more training or whether the parameters need to change. My parameters are all default except that dim_neck and freq are both set to 32.

ngulya commented 3 years ago

I'm not sure, but as far as I remember, random noise is added in the data preparation step. Maybe that answers your question?

jayzhu02 commented 3 years ago

Thx! I'll have a try.

Tinglok commented 3 years ago

Hi @zj19980122, I have this problem too. Have you solved it?

xelsa commented 3 years ago

@zj19980122 I also have the exact same problem. I used the default parameters except dim_neck and freq changed to 32. The output is literally just noise.

GreatDarrenSun commented 3 years ago

@zj19980122 I have the same problem. How did you solve it?

ghost commented 3 years ago

Same here.