neonbjb / tortoise-tts

A multi-voice TTS system trained with an emphasis on quality
Apache License 2.0

Did anyone get the tortoise fast fork to run? #382

Open havok2-htwo opened 1 year ago

havok2-htwo commented 1 year ago

Hey, there is a fork: https://github.com/152334H/tortoise-tts-fast

that is ~5-10x faster, but I am not able to get it to run. Anyone else?

XOCODE-OP commented 1 year ago

Yeah, same troubles as installing base tortoise-tts, but nothing more than that. Not sure if it's actually that much faster, but it is faster.

mechanicalsnowman commented 1 year ago

Try these instructions for tortoise-fast. https://github.com/152334H/tortoise-tts-fast/issues/60#issuecomment-1486913090

It's not 5-10x faster on my machine, maybe 10% faster. It has a nice UI though so kinda worth it. I may be taking this quote out of context as it's a bit vague to me - https://github.com/152334H/tortoise-tts-fast/issues/50#issuecomment-1473705807

the original repo tag uses the kv_cache speedup features ...... try using the actual original tortoise repo, it will be much slower

(The owner of the tortoise-tts-fast fork.)

Not sure if he means there that the very first repo of tortoise-tts was much slower than it is now. Maybe tortoise-tts-fast was once 5-10x faster than tortoise-tts, but since then the tortoise-tts author has made his own optimizations to close the gap? But anyway, a lot of comments reflect my personal experience: slightly faster but not the "advertised" speed. Still a nice fork regardless imo
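For context, the kv_cache being discussed is standard key/value caching in autoregressive transformer decoding: without it, every generation step re-projects keys and values for the entire prefix, so work grows quadratically with sequence length. A toy numpy illustration of the idea (not tortoise's actual code; all names here are made up):

```python
import numpy as np

# Toy single-head attention decode loop showing why KV caching helps.
rng = np.random.default_rng(0)
d = 8
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
tokens = rng.standard_normal((5, d))  # embeddings of 5 generated tokens

def attend(q, K, V):
    scores = q @ K.T
    w = np.exp(scores - scores.max())  # softmax over the prefix
    w /= w.sum()
    return w @ V

proj_calls = {"no_cache": 0, "kv_cache": 0}

# --- no cache: re-project K/V for the whole prefix at every step ---
out_no_cache = []
for t in range(len(tokens)):
    K = tokens[: t + 1] @ Wk
    V = tokens[: t + 1] @ Wv
    proj_calls["no_cache"] += t + 1  # projections done this step
    out_no_cache.append(attend(tokens[t] @ Wq, K, V))

# --- with cache: project each token exactly once and append ---
K_cache, V_cache, out_cached = [], [], []
for t in range(len(tokens)):
    K_cache.append(tokens[t] @ Wk)
    V_cache.append(tokens[t] @ Wv)
    proj_calls["kv_cache"] += 1
    out_cached.append(attend(tokens[t] @ Wq, np.array(K_cache), np.array(V_cache)))

assert np.allclose(out_no_cache, out_cached)  # identical results
print(proj_calls)  # {'no_cache': 15, 'kv_cache': 5}
```

Same outputs, a third of the projection work for just 5 tokens; the gap widens with longer sequences, which is why enabling it upstream closed much of the fork's advantage.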

louispaulet commented 1 year ago

I have made a simple Docker project, want to try it?

My Dockerfile is directly inspired by your comment https://github.com/152334H/tortoise-tts-fast/issues/60#issuecomment-1486913090. It lets you run the fast version in a single command. ⚠️ You will need 32 GB of RAM to run Docker in WSL like me.
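For reference, the memory available to WSL 2 (and thus to Docker Desktop's WSL backend) is capped by a `.wslconfig` file on the Windows side; a minimal fragment raising the ceiling might look like this (assuming WSL 2 and that the host actually has the RAM; run `wsl --shutdown` afterwards for it to take effect):

```ini
# %USERPROFILE%\.wslconfig  (on the Windows host)
[wsl2]
memory=32GB   # RAM ceiling for the WSL 2 VM
```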

bitnom commented 1 year ago

Why did you archive it? Just curious.

louispaulet commented 1 year ago

@bitnom Sorry about that. I did end up using a Google Colab version.
My version could process a single audio clip, but then wouldn't clear the memory and I found myself out of RAM after 2-3 batches.
I never found a fix, moved to colab, and then I got asked for support.
I chose to close the project to show that it is a dead-end in its current state.
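The symptom described (RAM not released between batches) is usually the generic Python pattern of output buffers staying referenced after they've been written out; a toy sketch of the general fix, not this project's actual code:

```python
import gc

class Clip:
    """Stand-in for a large decoded-audio buffer."""
    live = 0  # count of buffers currently alive

    def __init__(self):
        Clip.live += 1

    def __del__(self):
        Clip.live -= 1

results = []
for _ in range(3):
    clip = Clip()          # e.g. one synthesized batch
    results.append(clip)   # keeping every result pins all of them in memory

assert Clip.live == 3      # nothing was freed between batches

# Fix: drop references once each batch is saved, then force collection.
results.clear()
del clip
gc.collect()
assert Clip.live == 0      # buffers actually released
```

In a CUDA pipeline the equivalent step would also involve releasing the framework's cached GPU memory after dropping the references, but the root cause is the same: something still points at the old outputs.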

havok2-htwo commented 1 year ago

I got it to work following this tutorial: https://medium.com/@martin-thissen/5x-faster-voice-cloning-tortoise-tts-fast-tutorial-5b8c1d4de975