152334H / tortoise-tts-fast

Fast TorToiSe inference (5x or your money back!)
GNU Affero General Public License v3.0

Is there really a speedup? #50

Closed C00reNUT closed 1 year ago

C00reNUT commented 1 year ago

Hello, thank you for working on optimizing the tortoise library. I am trying to compare the speed of the two implementations, but so far I am getting very similar results in both quality and speed. I am using an NVIDIA 3060 with 12 GB of VRAM.

Running the command below takes about 2m 14s.

python scripts/tortoise_tts.py -p ultra_fast -O results/best_short_15/ultra_fast -v best_short_15 <text_short.txt --sampler dpm++2m --diffusion_iterations 30 --vocoder Univnet


Using the same settings, but with the --original_tortoise flag in the CLI, it takes about 2m 2s.

python scripts/tortoise_tts.py -p ultra_fast -O results/best_short_15/ultra_fast_original -v best_short_15 <text_short.txt --original_tortoise


Am I missing something? Are there perhaps some flags I should add to speed up generation?
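
(For reference, a minimal sketch for wall-clocking the two runs head-to-head; it assumes the repo root as working directory and the same text_short.txt and voice as in the commands above.)

```python
# Time the fast and --original_tortoise invocations back to back.
# A sketch, not a rigorous benchmark: first-run model loading and
# warm-up will inflate whichever command runs first.
import subprocess
import time

COMMANDS = {
    "fast": (
        "python scripts/tortoise_tts.py -p ultra_fast "
        "-O results/best_short_15/ultra_fast -v best_short_15 "
        "--sampler dpm++2m --diffusion_iterations 30 --vocoder Univnet"
    ),
    "original_flag": (
        "python scripts/tortoise_tts.py -p ultra_fast "
        "-O results/best_short_15/ultra_fast_original -v best_short_15 "
        "--original_tortoise"
    ),
}

for name, cmd in COMMANDS.items():
    with open("text_short.txt") as stdin:  # same stdin redirect as the CLI runs
        start = time.perf_counter()
        subprocess.run(cmd.split(), stdin=stdin, check=True)
        print(f"{name}: {time.perf_counter() - start:.1f}s")
```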

darkconsole commented 1 year ago

For me there was exactly 0% speed change, no matter whether I followed the README verbatim or did my own experimentation. Also, a lot of my voices that were fine in the old one didn't work in this one. The ones that did work were better/clearer, though, so I just kind of shrugged that off and treated it as an alternate choice.

(RTX 3060 here)

152334H commented 1 year ago

with the --original_tortoise flag in the CLI

The --original_tortoise flag still uses the kv_cache speedup features. Try the actual original tortoise repo instead; it will be much slower.
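
(For intuition, a toy sketch of what a kv_cache buys in autoregressive decoding; this is not tortoise's actual code, just the general idea.)

```python
# Toy illustration of the kv_cache idea (not tortoise's actual code).
import torch

d_model, n_tokens = 64, 8
w_k = torch.randn(d_model, d_model)
w_v = torch.randn(d_model, d_model)
tokens = [torch.randn(1, d_model) for _ in range(n_tokens)]

# Without a cache: step t re-projects all t prefix tokens, O(N^2) total work.
for t in range(1, n_tokens + 1):
    prefix = torch.cat(tokens[:t])        # (t, d_model)
    k, v = prefix @ w_k, prefix @ w_v     # recomputed from scratch every step

# With a cache: project only the newest token and append, O(N) total work.
k_cache, v_cache = [], []
for x in tokens:
    k_cache.append(x @ w_k)               # one projection per step
    v_cache.append(x @ w_v)
    k = torch.cat(k_cache)                # earlier keys/values are reused
    v = torch.cat(v_cache)
```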

rikabi89 commented 1 year ago

Check that your GPU VRAM is actually being utilized. This could be a PyTorch issue.
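
(A minimal sketch for checking this from the same Python environment as tortoise; watching nvidia-smi during generation works too.)

```python
# Sanity-check that inference is actually landing on the GPU.
import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
    # Nonzero values here during generation mean the GPU is being used.
    print(f"Allocated: {torch.cuda.memory_allocated() / 1e9:.2f} GB")
    print(f"Reserved:  {torch.cuda.memory_reserved() / 1e9:.2f} GB")
```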

C00reNUT commented 1 year ago

Check that your GPU VRAM is actually being utilized. This could be a PyTorch issue.

Yeah, it could be part of the problem. But I tried the original repo, and the average generation time with the same sampling method and vocoder is about the same.

Anyway, I will try PyTorch 2.0 (https://pytorch.org/blog/accelerated-diffusers-pt-20/); there seem to be interesting speedup possibilities out of the box.
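
(A minimal sketch of the PyTorch 2.0 primitive that post is about: F.scaled_dot_product_attention, which dispatches to fused kernels such as FlashAttention when the hardware supports them.)

```python
# Fused attention in one call; replaces the manual softmax(QK^T/sqrt(d))V chain.
import torch
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"
q = torch.randn(1, 8, 128, 64, device=device)  # (batch, heads, seq, head_dim)
k = torch.randn(1, 8, 128, 64, device=device)
v = torch.randn(1, 8, 128, 64, device=device)

out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 8, 128, 64])
```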

Also, https://github.com/nebuly-ai/nebullvm looks promising to me.