NVIDIA / flowtron

Flowtron is an auto-regressive flow-based generative network for text to speech synthesis with control over speech variation and style transfer
https://nv-adlr.github.io/Flowtron
Apache License 2.0
887 stars, 177 forks

Custom model resumed from pre-trained model has a stuttering problem. #142

Closed. Jcwscience closed this issue 2 years ago.

Jcwscience commented 2 years ago

I am attempting to train a model on my own voice, using the training script to warm-start from the ljs model. However, the loss readout hovers around -1.0 and doesn’t converge to a number closer to zero. After about 10 epochs of training, the output begins to take on some of the timbre of my voice, but letting training continue invariably produces a stutter that grows worse until the output is nothing but static or a scream.

I am relatively new to this process, so any troubleshooting suggestions would really help. For instance, what loss value should I be looking for? Do I need to adjust the training parameters when starting from a pre-trained model?

The data is 90 English sentences with the corresponding WAV files. The audio has a sample rate of 22050 Hz and is formatted as 16-bit PCM.
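
For reference, here is a minimal sketch of how the audio format described above could be double-checked before training. It uses only Python's standard `wave` module. The function names `check_wav` and `check_directory` are made up for illustration and are not part of the Flowtron repository, and the mono check is an assumption, since the channel count is not stated above.

```python
import wave
from pathlib import Path

EXPECTED_RATE = 22050      # Hz, as stated above
EXPECTED_SAMPLE_WIDTH = 2  # bytes per sample, i.e. 16-bit PCM


def check_wav(path):
    """Return a list of human-readable problems found in one WAV file."""
    problems = []
    try:
        with wave.open(str(path), "rb") as wav:
            if wav.getframerate() != EXPECTED_RATE:
                problems.append(f"sample rate is {wav.getframerate()} Hz")
            if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
                problems.append(f"sample width is {8 * wav.getsampwidth()} bits")
            if wav.getnchannels() != 1:  # assumption: mono audio is expected
                problems.append(f"{wav.getnchannels()} channels")
    except wave.Error as err:
        # The wave module rejects files it cannot parse as plain PCM.
        problems.append(f"not readable as PCM WAV: {err}")
    return problems


def check_directory(wav_dir):
    """Map each offending file name to its problems; an empty dict means all files pass."""
    report = {}
    for path in sorted(Path(wav_dir).glob("*.wav")):
        problems = check_wav(path)
        if problems:
            report[path.name] = problems
    return report
```

Calling `check_directory("wavs")` on the folder holding the recordings (the folder name is a placeholder) should return an empty dict if every file matches the format given above.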

Any help or suggestions are much appreciated!