liusongxiang / StarGAN-Voice-Conversion

This is a PyTorch implementation of the paper "StarGAN-VC: Non-parallel many-to-many voice conversion with star generative adversarial networks".
https://arxiv.org/abs/1806.02169

Error in training with more than 4 speakers #20

Open sainishalini opened 2 years ago

sainishalini commented 2 years ago

Hi there, thanks for the code. It works for me with 4 speakers in the Speaker and Speaker_test folders. If I increase the number of speakers to 6, preprocess.py runs fine, but training through main.py throws an error saying "target x is out of bounds". I am keeping the data very small (100 samples for each of the 6 speakers) to get a successful run first, and I plan to increase the sample size afterwards. I have tried reading the code but have not been able to debug it thoroughly yet.

UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
  input = module(input)
Traceback (most recent call last):
  File "main.py", line 92, in <module>
    main(config)
  File "main.py", line 34, in main
    solver.train()
  File "/Users/shalinisaini/pytorch-StarGAN-VCtk-oct-6small/solver.py", line 160, in train
    cls_loss_real = CELoss(input=cls_real, target=speaker_idx_org)
  File "/Users/shalinisaini/opt/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/shalinisaini/opt/anaconda3/lib/python3.8/site-packages/torch/nn/modules/loss.py", line 931, in forward
    return F.cross_entropy(input, target, weight=self.weight,
  File "/Users/shalinisaini/opt/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 2317, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
  File "/Users/shalinisaini/opt/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 2115, in nll_loss
    ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
IndexError: Target 4 is out of bounds.

Can anyone tell me whether this code can be extended to more than 4 speakers? I may be missing something obvious, but I would appreciate a pointer in the right direction. I would like to run the training with more speakers.

Thanks for your help.
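
A note on the error itself: a cross-entropy loss over C classes only accepts integer targets from 0 to C-1. "Target 4 is out of bounds" therefore suggests that the classifier output still has 4 columns while the speaker labels now run from 0 to 5. The plain-Python sketch below mimics that bounds check without PyTorch. The helper `check_targets` and the speaker names are hypothetical and are not taken from the repository. Where the repository fixes the number of speakers is not shown here, so treat this as an illustration of the constraint rather than a patch.

```python
def check_targets(num_classes, targets):
    """Raise IndexError for the first label outside [0, num_classes - 1].

    Hypothetical helper that mimics the bounds check a cross-entropy
    loss applies to its integer targets.
    """
    for t in targets:
        if not 0 <= t < num_classes:
            raise IndexError(f"Target {t} is out of bounds.")
    return True


# Hypothetical speaker names: six speakers are indexed 0..5.
speakers = ["spk_a", "spk_b", "spk_c", "spk_d", "spk_e", "spk_f"]
speaker_to_idx = {name: i for i, name in enumerate(speakers)}
labels = list(speaker_to_idx.values())

# A classifier head with 4 outputs rejects the first label >= 4.
try:
    check_targets(4, labels)
    message = None
except IndexError as err:
    message = str(err)

# A head sized to the speaker list accepts every label.
ok = check_targets(len(speakers), labels)
```

If this reading is right, the speaker list used to build the labels and the number of outputs of the domain classifier (and the size of the speaker code fed to the generator) would all need to be raised to 6 together.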