KimythAnly / AGAIN-VC

This is the official implementation of the paper AGAIN-VC: A One-shot Voice Conversion using Activation Guidance and Adaptive Instance Normalization.
https://kimythanly.github.io/AGAIN-VC-demo/index
MIT License

about the training process #7

Closed (chikiuso closed this 3 years ago)

chikiuso commented 3 years ago

Hi, I ran the training and it halts here:

      0%|          | 0/3 [00:00<?, ?it/s]
    > /root/againvc/dataloader/vctk.py(48)__getitem__()
    -> neg_i = np.random.choice(len(candidates) - 1)
    (Pdb)

Do you have any idea what I did wrong? Thanks!
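For context, a `(Pdb)` prompt preceded by a `> file(line)function()` line is what Python's debugger prints when it pauses execution, so one plausible cause (an assumption, not something confirmed in this thread) is a debugger call left in the data loader. A minimal sketch that scans a source tree for such calls is below; `find_breakpoints` is a hypothetical helper, not part of AGAIN-VC.

```python
import re
from pathlib import Path

# Matches common ways of dropping into the debugger:
# pdb.set_trace(), ipdb.set_trace(), and the built-in breakpoint().
BREAKPOINT_RE = re.compile(r"\b(?:i?pdb\.set_trace|breakpoint)\s*\(")


def find_breakpoints(root):
    """Return (path, line_number, line) for each debugger call under root.

    Hypothetical helper for illustration only.
    """
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            code = line.split("#", 1)[0]  # ignore anything after a '#'
            if BREAKPOINT_RE.search(code):
                hits.append((str(path), lineno, line.strip()))
    return hits


# Example usage (the path is an assumption; point it at your checkout):
# for path, lineno, line in find_breakpoints("againvc"):
#     print(f"{path}:{lineno}: {line}")
```

If this reports a hit near the line shown in the prompt, typing `c` at the `(Pdb)` prompt continues execution, and removing the call avoids the pause altogether.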

MaxGodTier commented 3 years ago

I have faced this issue before. Renaming my dataset to follow the same structure as the VCTK corpus solved it; probably something goes wrong when building the indexes from different filenames under the default settings.

I would be interested to know more about the training procedure as well. Currently it's not clear what the dataset requirements and rules are (the dos and don'ts). In a previous issue you mentioned that VCTK is a non-parallel dataset. That is technically true, as the spoken sentences don't overlap perfectly. However, the sentences are the same for all speakers (i.e. pXXX_001.wav will be "Please call Stella" for all 110 speakers), making it quasi-parallel. At this point I'm not sure whether it's possible to have p001/p001_001.wav and p002/p002_001.wav say something completely different without dramatically affecting the training procedure.
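To make the renaming suggestion concrete, here is a minimal sketch that checks whether a dataset folder follows the `pXXX/pXXX_NNN.wav` layout mentioned above. The exact pattern the AGAIN-VC loader expects is not stated in this thread, so the regular expression and the `check_layout` helper are assumptions for illustration.

```python
import re
from pathlib import Path

# Assumed pattern: <speaker>/<speaker>_<utterance>.wav, e.g. p001/p001_001.wav.
# This is a guess based on the filenames quoted in this thread.
UTTERANCE_RE = re.compile(r"^(p\d+)_(\d+)\.wav$")


def check_layout(root):
    """Return a list of human-readable problems found under root.

    Hypothetical helper: an empty list means every .wav file matched
    the assumed VCTK-style naming and sits in its speaker's folder.
    """
    problems = []
    for wav in sorted(Path(root).rglob("*.wav")):
        match = UTTERANCE_RE.match(wav.name)
        if match is None:
            problems.append(f"{wav}: name does not match pXXX_NNN.wav")
        elif wav.parent.name != match.group(1):
            problems.append(
                f"{wav}: folder '{wav.parent.name}' != speaker '{match.group(1)}'"
            )
    return problems


# Example usage (the path is an assumption):
# for problem in check_layout("data/my_dataset"):
#     print(problem)
```

This only validates names and folders; it says nothing about whether the spoken content has to match across speakers, which is the open question here.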

I trained with a custom dataset containing only vocals from different singers and got bad results, such as speakers not matching the target voice or sounding like they were angry/screaming/growling (even if the target file was soft-spoken). However, the dataset contained high-pitched screams/growls, so I suspect that could have affected the training negatively, as it's harder to associate a scream/growl with a specific person. Could setting a higher "f_max" during preprocessing help?

chikiuso commented 3 years ago

Hi @MaxGodTier, I tried it with the VCTK dataset but had no luck. I hope I can figure it out in the future.

MaxGodTier commented 3 years ago

Try reinstalling all dependencies in a fresh environment; that's the first thing I did.

KimythAnly commented 3 years ago

Sorry for the delayed response. This part of the code is redundant for training AGAIN-VC. I removed it in the recent commit.