Open gandolfxu opened 1 year ago
As I mentioned in the recent README update, since our method applies a GAN for pitch-shifted synthesis, a single-speaker or small dataset is not very appropriate for adversarial training. Thus, I recommend using more speakers or fine-tuning from a pretrained model.
I have trained a Mandarin model using 3 hours of data. The dataset is very clean.
I have trained for about 182k steps (3132 epochs). I can clearly hear artificial noise. Why?