Voices are either too squeaky or too deep in the output

yxlllc / ReFlow-VAE-SVC



Open Kamii-Sam opened 2 months ago

Kamii-Sam commented 2 months ago

I trained a model, but in the output some parts sound squeaky and other parts sound deeper than they should.

python main.py -i input.wav -m exp/reflowvae-wavenet-attention/model_3400.pt -o output.wav -k 0 -f 0 -tid 1 -step 50

Above is the command I'm using. I can't work out why the output sounds like this.

yxlllc commented 2 months ago

Did you use a pretrained model? If not, 3400 training steps is not enough for the model to converge, and training from scratch requires at least several hours of audio data.

Kamii-Sam commented 2 months ago

Yes, I did use a pretrained model. I actually managed to fix the issue by turning on pitch augmentation in the config settings.
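
For anyone landing here later, a rough sketch of what pitch augmentation generally means: during training, each clip's pitch is shifted by a random number of semitones, so the model sees a wider pitch range than the raw dataset covers. This is only an illustration of the idea, not the code ReFlow-VAE-SVC actually runs. The function names and the 5-semitone range are hypothetical, and the thread does not name the exact config key.

```python
import random

def shift_f0(f0_hz, semitones):
    """Shift a pitch contour (in Hz) by a number of semitones.

    Unvoiced frames stored as 0 Hz stay at 0 because the shift is a multiplication.
    """
    ratio = 2.0 ** (semitones / 12.0)  # 12 semitones = one octave = 2x frequency
    return [f * ratio for f in f0_hz]

def augment_f0(f0_hz, max_shift=5.0, rng=random):
    """Hypothetical augmentation step: apply a random shift within +/- max_shift semitones."""
    shift = rng.uniform(-max_shift, max_shift)
    return shift_f0(f0_hz, shift), shift

# Example contour: one unvoiced frame followed by three voiced frames.
f0 = [0.0, 220.0, 246.9, 261.6]
shifted, used_shift = augment_f0(f0, max_shift=5.0, rng=random.Random(0))
print(used_shift, shifted)
```

The same semitone-to-ratio relation is the standard meaning of a key change in semitones, so a shift of 0 leaves the pitch unchanged.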