cd /path/to/tortoise-tts/scripts/
python tortoise_tts.py -p "fast" -v deniro -O /path/to/output/ "[poem] Love, me tender... Love me sweet... Never let me go... For this makes, my life complete... Never let me go?"
Censored logs
Rendering deniro_00 (1 of 1)...
[poem] Love, me tender... Love me sweet... Never let me go... For this makes, my life complete... Never let me go?
Generating autoregressive samples..
100%|██████████| 96/96 [15:21<00:00, 9.60s/it]
Computing best candidates using CLVP
100%|██████████| 96/96 [00:07<00:00, 13.33it/s]
Transforming autoregressive outputs into audio..
100%|██████████| 80/80 [00:45<00:00, 1.75it/s]
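Adding up the elapsed times from the three progress bars above gives the total wall-clock time for this run. This is only a minimal sketch, assuming the bars use tqdm's default `[elapsed<remaining, rate]` layout; `elapsed_seconds` and `total_elapsed` are helper names made up for this illustration, not part of tortoise-tts.

```python
import re

# Matches the "count/total [elapsed<" part of a finished tqdm bar,
# e.g. "96/96 [15:21<00:00, 9.60s/it]" (assumes tqdm's default layout).
TQDM_RE = re.compile(r"\d+/\d+ \[(\d+(?::\d+)+)<")


def elapsed_seconds(line):
    """Return the elapsed time shown on a tqdm line in seconds, or None."""
    match = TQDM_RE.search(line)
    if match is None:
        return None
    seconds = 0
    # Handles both MM:SS and HH:MM:SS by folding left to right.
    for part in match.group(1).split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def total_elapsed(log):
    """Sum the elapsed times of every tqdm line found in a log string."""
    times = (elapsed_seconds(line) for line in log.splitlines())
    return sum(t for t in times if t is not None)


# The three progress lines copied from the log in this report.
LOG = """\
100%|██████████| 96/96 [15:21<00:00, 9.60s/it]
100%|██████████| 96/96 [00:07<00:00, 13.33it/s]
100%|██████████| 80/80 [00:45<00:00, 1.75it/s]
"""

total = total_elapsed(LOG)
print(f"{total // 60}:{total % 60:02d}")  # prints 16:13
```

For this log the three stages add up to 16 minutes 13 seconds, almost all of it in the autoregressive sampling step.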
Setup (note: I installed 11.8 rather than 11.7, because 11.7 did not work for me):
tortoise-tts Version
https://github.com/neonbjb/tortoise-tts/tree/1e061bc6752f05bccb59748c8bd7c7fc85d54988
Command:
Censored logs
Samples
deniro_00_00.webm deniro_combined.webm
System:
OS: Linux; Ubuntu 22.04
CPU: Ryzen 4800H
GPU: NVIDIA TU106M [GeForce RTX 2060.2 Mobile]
Notes:
Generation seems to use about 8 GB of RAM, 100% of a single CPU core, and about 3.5 GB of VRAM (the GPU has 6 GB).
It also takes a long time: about 5-10 minutes per generation of that size. That seems like too much, even though the dGPU is being used, at about half of its processing capacity, to generate the file.
It works here, though: https://huggingface.co/spaces/Manmay/tortoise-tts
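
In case it helps anyone reproduce the GPU load and VRAM figures in the notes, here is a rough sketch that takes one reading from `nvidia-smi` while a generation is running. It assumes `nvidia-smi` is on the PATH and that the driver reports plain integers for all three queried fields. `parse_gpu_csv` and `sample_gpu` are hypothetical helper names for this illustration, not part of tortoise-tts.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi: GPU load in percent, memory in MiB.
QUERY = "utilization.gpu,memory.used,memory.total"


def parse_gpu_csv(text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output.

    Returns one (utilization_percent, used_mib, total_mib) tuple per GPU.
    Raises ValueError if a field is not a plain integer (e.g. "[N/A]").
    """
    rows = []
    for line in text.strip().splitlines():
        util, used, total = (int(field.strip()) for field in line.split(","))
        rows.append((util, used, total))
    return rows


def sample_gpu():
    """Return current readings, or None if they cannot be obtained."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        result = subprocess.run(
            ["nvidia-smi", f"--query-gpu={QUERY}",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        )
        return parse_gpu_csv(result.stdout)
    except (subprocess.CalledProcessError, ValueError):
        # Driver not loaded, or a field was reported as unsupported.
        return None


if __name__ == "__main__":
    readings = sample_gpu()
    if readings is None:
        print("nvidia-smi reading not available")
    else:
        for index, (util, used, total) in enumerate(readings):
            print(f"GPU {index}: {util}% load, {used}/{total} MiB VRAM")
```

Running it in a second terminal a few times during the "Generating autoregressive samples" stage should show whether the load really stays around half and how close the VRAM use gets to the 6 GB limit.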