152334H / tortoise-tts-fast

Fast TorToiSe inference (5x or your money back!)
GNU Affero General Public License v3.0

Keeping tortoise loaded into VRAM #101

Open Stoxis opened 1 year ago

Stoxis commented 1 year ago

I want to generate speech on the fly, so I'd like to know how to keep the model loaded in VRAM so that tortoise_tts.py can start generating autoregressive samples instantly.

I timed it: loading the model into VRAM takes about a minute, while the voice generation afterwards takes only 16 seconds. I'm willing to give up that VRAM while I'm using it, for the convenience and speed gained.

Stoxis commented 1 year ago

I wrote the script myself. There was a VRAM leak, but I was able to fix it by calling torch.cuda.empty_cache() at the end of the script.
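Since the script itself wasn't posted, here is a minimal sketch of the idea: load the model into VRAM once, then loop over prompts without reloading, clearing the CUDA cache after each generation as described above. The `TextToSpeech`, `load_voice`, and `tts_with_preset` calls follow the upstream tortoise-tts README; the `serve` helper, the voice name `"tom"`, and the output filenames are illustrative assumptions, not part of the original script.

```python
def serve(generate, prompts):
    """Run generate(text) for each prompt against an already-loaded model.

    The point is that `generate` closes over the loaded model, so the
    per-prompt cost is only inference, not the ~1 minute model load.
    """
    return [generate(text) for text in prompts]


def main():
    # Heavy imports and model load happen once, at startup.
    import torch
    import torchaudio
    from tortoise.api import TextToSpeech
    from tortoise.utils.audio import load_voice

    tts = TextToSpeech()  # loads the model weights into VRAM once
    # "tom" is a placeholder voice; substitute any voice folder you have.
    voice_samples, conditioning_latents = load_voice("tom")

    def generate(text):
        gen = tts.tts_with_preset(
            text,
            voice_samples=voice_samples,
            conditioning_latents=conditioning_latents,
            preset="fast",
        )
        # The leak fix described in this thread: release cached
        # allocations after each generation so VRAM use stays flat.
        torch.cuda.empty_cache()
        return gen

    # Read prompts from stdin until a blank line; the model stays loaded
    # between iterations.
    for i, clip in enumerate(serve(generate, iter(input, ""))):
        # Tortoise outputs 24 kHz audio per the upstream README.
        torchaudio.save(f"out_{i}.wav", clip.squeeze(0).cpu(), 24000)


if __name__ == "__main__":
    main()
```

The design point is simply that the expensive load lives outside the loop; any long-running process (a REPL, a small HTTP server, a queue worker) that holds the `tts` object achieves the same thing.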

seanthegoudarzi commented 1 year ago

@Stoxis Any chance you can share the script you made? It would be highly appreciated!