I downloaded this yesterday and followed the video tutorial. I trained a model on janky audio I'd clipped quickly, just to practice. On the high-quality setting I was very impressed: there were artifacts, but it sounded very accurate, although it took an hour to generate one sentence. So I figured I'd train a new model with better audio to try for perfect results.
After doing so, TTS generation with any model, voice, or quality preset takes only a few seconds and sounds awful. I noticed it's using much less VRAM as well. I made sure to restart TTS with the correct model selected in the settings. What could be the issue here?
Windows, RTX 4080