powei-C closed this issue 2 years ago
Hi @powei-C, thanks for your interest in our work. Actually, we used the Resnet-9blocked generator in CycleGAN-VC3 for a fair comparison in our paper, since we wanted to demonstrate that the cycle-consistency loss can be replaced by the contrastive loss.
Thank you for your explanation!
I still have one question. When listening to the demo audio, I found that it is slightly slower than the source audio. I see the same thing after retraining the model and testing it myself. Is this expected?
This may be because the hyperparameters (win_length, hop_size, etc.) of the PWG vocoder do not match those in my Mel-spectrogram extraction code here. Making them consistent may solve the problem. Please keep me posted if it works.
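To illustrate why a hop-size mismatch would change the playback speed, here is a minimal sketch. It assumes the vocoder emits a fixed number of samples per Mel frame equal to its own hop size; the function name and all numeric values (sample rate, hop sizes) are hypothetical and not taken from this repository.

```python
def output_duration_seconds(n_samples, extract_hop, vocoder_hop, sample_rate):
    """Estimate the duration of vocoded audio when the two hop sizes differ.

    Assumption: the extractor produces roughly one frame per `extract_hop`
    input samples, and the vocoder generates `vocoder_hop` samples per frame.
    """
    n_frames = n_samples // extract_hop  # approximate frame count, padding ignored
    return n_frames * vocoder_hop / sample_rate


sample_rate = 22050          # hypothetical value
n_samples = 2 * sample_rate  # a 2-second source clip

# Hypothetical hop sizes: 256 in the extraction code, 300 in the vocoder config.
matched = output_duration_seconds(n_samples, 256, 256, sample_rate)
mismatched = output_duration_seconds(n_samples, 256, 300, sample_rate)

print(f"matched hop sizes:    {matched:.3f} s")
print(f"mismatched hop sizes: {mismatched:.3f} s")
print(f"stretch factor:       {mismatched / matched:.3f}")
```

Under these assumptions the output is stretched by the ratio of the two hop sizes, so a vocoder hop size larger than the extraction hop size would produce audio that sounds slower than the source.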
Thank you for your quick reply! That makes sense! Since I have run out of GPU resources for now, I may test it in a few days.
Best Regards
It worked! Thank you for your help!
Best Regards
@powei-C How did you do the settings?
@Quetzalquatl It's been a long time, so I am not 100% sure anymore. I remember that I just adjusted the parameter values in CVC/data/wav_folder.py to match the ones in "/CVC/checkpoints/vocoder/config.yml", such as hop size, FFT size, and win_length. Hope this helps~
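A quick way to check that the two sets of parameters agree is sketched below. This is only an illustration: the key names and all values are assumptions, not the actual contents of wav_folder.py or config.yml, and the helper `find_mismatches` is hypothetical.

```python
def find_mismatches(extract_cfg, vocoder_cfg,
                    keys=("sampling_rate", "fft_size", "hop_size", "win_length")):
    """Return {key: (extraction_value, vocoder_value)} for every key that differs."""
    return {
        key: (extract_cfg.get(key), vocoder_cfg.get(key))
        for key in keys
        if extract_cfg.get(key) != vocoder_cfg.get(key)
    }


# Hypothetical settings copied by hand from the extraction code.
extract_cfg = {"sampling_rate": 22050, "fft_size": 1024,
               "hop_size": 256, "win_length": 1024}

# Hypothetical settings copied by hand from the vocoder config file.
vocoder_cfg = {"sampling_rate": 22050, "fft_size": 1024,
               "hop_size": 300, "win_length": 1200}

for key, (ours, theirs) in find_mismatches(extract_cfg, vocoder_cfg).items():
    print(f"{key}: extraction={ours}, vocoder={theirs}")
```

Any key that is printed would need to be changed on the extraction side until nothing is reported.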
@powei-C I changed the settings in wav_folder.py to match the parameters in vocoder/config.yml, but the result is very bad. No matter what I tried, I couldn't get it to work well. If you still have your settings, could you paste them here?
Hi, thank you for sharing your work! I wonder whether you trained the Resnet-9blocked generator in CycleGAN-VC3 for the evaluation in your paper. I ask because when I followed your instructions for training CycleGAN-VC3, it used the same "netG" as the CVC framework.