anonymous-pits / pits

PITS: Variational Pitch Inference for End-to-end Pitch-controllable TTS without External Pitch Predictor
https://anonymous-pits.github.io/pits/
MIT License

Artificial noise #24

Open gandolfxu opened 1 year ago

gandolfxu commented 1 year ago

I have trained a Mandarin model using 3 hours of data. The dataset is very clean.

I have trained for about 182k steps (3132 epochs), and I can clearly hear artificial noise. Why?

anonymous-pits commented 1 year ago

As I mentioned in the recent README update, our method applies a GAN for pitch-shifted synthesis, so a single-speaker or small dataset is not well suited to adversarial training. I therefore recommend adding more speakers or fine-tuning from a pretrained model.
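
For a rough sense of how small this run is, the figures quoted in the question can be turned into a quick sanity check. This is only a sketch: it takes "182k" as 182,000 steps, assumes each epoch is one full pass over the 3 hours of audio, and uses hypothetical helper names that are not part of the PITS code.

```python
# Back-of-the-envelope check using the numbers quoted in the question.
# Assumptions: "182k" means 182,000 optimizer steps, and one epoch is one
# full pass over the 3-hour dataset. Helper names are hypothetical.

def steps_per_epoch(total_steps: int, epochs: int) -> float:
    """Average number of optimizer steps in one epoch."""
    return total_steps / epochs


def audio_seconds_per_step(dataset_hours: float, steps_per_ep: float) -> float:
    """Average amount of audio (in seconds) consumed by one step."""
    return dataset_hours * 3600 / steps_per_ep


spe = steps_per_epoch(182_000, 3132)
sec = audio_seconds_per_step(3.0, spe)
print(f"{spe:.1f} steps per epoch, about {sec:.0f} s of audio per step")
```

With so few steps per epoch, the discriminator sees the same small set of utterances thousands of times, which is consistent with the advice to add more speakers or start from a pretrained model.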