JarodMica / ai-voice-cloning

GNU General Public License v3.0
Suddenly I can only create 'Ultra Fast' level output, even when set to 'High Quality' #37

Open andmillward opened 9 months ago

andmillward commented 9 months ago

I downloaded this yesterday and followed the video tutorial. I trained a model on janky audio I had clipped quickly, just to practice. On the 'High Quality' setting I was very impressed: there were artifacts, but it sounded very accurate, although it took an hour to generate one sentence. So I figured I'd train a new model with better audio to try for perfect results.

After doing so, TTS generation with any model, voice, or quality preset takes only a few seconds and sounds awful. I noticed it's using much less VRAM as well. I made sure to restart TTS with the correct model selected in the settings. What could be the issue here?

Windows, RTX 4080

JarodMica commented 9 months ago

By chance, did you turn on hifigan? If you have hifigan turned on, the diffusion model settings (iterations and samples) don't matter.
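
To illustrate the point, here is a rough sketch of why every preset would sound and run the same with hifigan on. It assumes that hifigan stands in for the diffusion decoding step, which is my reading and not confirmed from the code. The function name, the dictionary keys, and the numbers are all hypothetical, not the project's actual settings or preset values.

```python
def effective_settings(samples, iterations, use_hifigan):
    """Return the settings that would actually influence generation.

    Hypothetical helper for illustration only; it does not mirror
    the real ai-voice-cloning code or its settings keys.
    """
    if use_hifigan:
        # Assumption: the hifigan path does not run diffusion, so the
        # diffusion settings (samples, iterations) are ignored.
        return {"decoder": "hifigan", "samples": None, "iterations": None}
    # Without hifigan, the preset's diffusion settings are used as given.
    return {"decoder": "diffusion", "samples": samples, "iterations": iterations}


# Illustrative numbers only, not the real preset values.
fast = effective_settings(samples=2, iterations=30, use_hifigan=True)
slow = effective_settings(samples=256, iterations=400, use_hifigan=True)

# With hifigan on, both "presets" collapse to the same effective settings.
print(fast == slow)
```

If that matches what you're seeing, turning hifigan off in the settings should make the quality presets matter again.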