jaywalnut310 / vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
https://jaywalnut310.github.io/vits-demo/index.html
MIT License

Issue with training at 8000Hz #176

Open athenasaurav opened 10 months ago

athenasaurav commented 10 months ago

Hello Everyone,

I am trying to train VITS on 8000 Hz audio, but the generated voice is still not clear after 180k steps. It sounds like mumbling.

I have my own dataset recorded at 8000Hz.

Here is a sample of the original recording:

And the generated audio sounds like this:

Do I need to change something in the training config? Can someone suggest which settings need to change for 8000 Hz?
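
For reference, here is a minimal sketch of a sanity check for the audio-related settings when moving to a new sampling rate. It assumes the config uses the key names from the repository's `configs/ljs_base.json` (`data.sampling_rate`, `data.hop_length`, `model.upsample_rates`, `train.segment_size`, and so on). The `cfg_8k` values are hypothetical placeholders, not a recommended setting, and both function names are made up for illustration. The check only catches structural inconsistencies (for example, the product of `upsample_rates` not matching `hop_length`); it does not say which values give good audio quality or explain the mumbling.

```python
import math


def check_audio_config(cfg):
    """Return a list of problems found in a VITS-style config dict.

    Key names are assumed to match configs/ljs_base.json.
    """
    data, model, train = cfg["data"], cfg["model"], cfg["train"]
    problems = []

    sr = data["sampling_rate"]
    hop = data["hop_length"]

    # The decoder upsamples one latent frame to hop_length waveform samples,
    # so the product of the upsampling factors has to equal hop_length.
    up = math.prod(model["upsample_rates"])
    if up != hop:
        problems.append(f"prod(upsample_rates)={up} != hop_length={hop}")

    if len(model["upsample_rates"]) != len(model["upsample_kernel_sizes"]):
        problems.append("upsample_rates and upsample_kernel_sizes differ in length")

    # The STFT window cannot be longer than the FFT size.
    if data["win_length"] > data["filter_length"]:
        problems.append("win_length exceeds filter_length")

    # Mel filters above the Nyquist frequency carry no information.
    fmax = data.get("mel_fmax")
    if fmax is not None and fmax > sr / 2:
        problems.append(f"mel_fmax={fmax} is above Nyquist ({sr / 2})")

    # Assumption: training slices are taken in whole frames.
    if train["segment_size"] % hop != 0:
        problems.append("segment_size is not a multiple of hop_length")

    return problems


def frame_shift_ms(sampling_rate, hop_length):
    """Time covered by one hop, in milliseconds."""
    return 1000.0 * hop_length / sampling_rate


# Hypothetical config: LJSpeech-style values with only sampling_rate changed.
cfg_8k = {
    "data": {
        "sampling_rate": 8000,
        "filter_length": 1024,
        "hop_length": 256,
        "win_length": 1024,
        "mel_fmax": None,
    },
    "model": {
        "upsample_rates": [8, 8, 2, 2],
        "upsample_kernel_sizes": [16, 16, 4, 4],
    },
    "train": {"segment_size": 8192},
}

print(check_audio_config(cfg_8k))
print(frame_shift_ms(8000, 256), "ms per frame at 8000 Hz")
print(round(frame_shift_ms(22050, 256), 2), "ms per frame at 22050 Hz")
```

Note that keeping `hop_length` at 256 while dropping the sampling rate to 8000 Hz makes each frame cover 32 ms instead of roughly 11.6 ms at 22050 Hz. Whether that coarser time resolution contributes to the unclear output is not established here; it is only one thing that might be worth experimenting with.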