Stoxis opened this issue 1 year ago
I wrote the script myself. There was a VRAM leak, but I was able to fix it by calling torch.cuda.empty_cache() at the end of the script.
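A minimal sketch of the cleanup described above, assuming a PyTorch-based script: at the end of a run, `torch.cuda.empty_cache()` returns cached-but-unused CUDA allocations to the driver. The guard clauses are just so the sketch also runs on machines without torch or a GPU; `free_cached_vram` is an illustrative helper name, not part of the project.

```python
def free_cached_vram():
    """Release PyTorch's cached CUDA blocks back to the driver.

    Returns True if a cache flush was actually performed.
    """
    try:
        import torch
    except ImportError:
        # torch not installed; nothing to free in this sketch.
        return False
    if torch.cuda.is_available():
        # Frees *cached* allocations only; tensors still referenced
        # by Python objects keep their VRAM.
        torch.cuda.empty_cache()
        return True
    return False

if __name__ == "__main__":
    # ... generation work would happen here ...
    free_cached_vram()
```

Note that `empty_cache()` does not fix leaks caused by live references; it only hands back memory PyTorch's allocator is caching, which is why calling it at the very end of the script is the natural spot.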
@Stoxis Any chance you can share the script you made? It would be highly appreciated!
I want to be able to generate voice on the fly, so I'd like to know how to keep the model loaded in VRAM so that tortoise_tts.py can start generating autoregressive samples instantly.
The reason is that I timed it: loading the model into VRAM takes about a minute, while the voice generation afterwards takes only 16 seconds. I'm willing to give up that VRAM while I'm using it for the convenience and speed gained.
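The timing above suggests a "load once, serve many" pattern: pay the ~1 minute load cost a single time, then keep the process (and the model) alive between requests. A minimal sketch, where `load_model` and `generate` are hypothetical stand-ins for the real tortoise-tts calls:

```python
# Sketch of keeping the model resident across requests. The two
# functions below are illustrative placeholders, not the real API.
def load_model():
    # Imagine ~1 minute of loading weights into VRAM here.
    return {"weights": "resident in VRAM"}

def generate(model, text):
    # Imagine ~16 seconds of autoregressive sampling here.
    return f"audio for: {text!r}"

class TTSServer:
    def __init__(self):
        self.model = load_model()          # one-time load cost

    def say(self, text):
        return generate(self.model, text)  # per-request cost only

if __name__ == "__main__":
    server = TTSServer()
    print(server.say("first request"))
    print(server.say("second request"))    # no reload between calls
```

In practice this means wrapping the generation in a long-lived process (a simple loop, a local HTTP endpoint, or a REPL) instead of invoking tortoise_tts.py once per utterance, since each fresh invocation reloads the weights from scratch.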