erew123 / alltalk_tts

AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, but supports a variety of advanced features, such as a settings page, low VRAM support, DeepSpeed, narrator, model finetuning, custom models, and wav file maintenance. It can also be used with 3rd party software via JSON calls.
GNU Affero General Public License v3.0
1.17k stars, 124 forks

Multi-GPU setup using the slowest available GPU. #213

Closed Urammar closed 6 months ago

Urammar commented 6 months ago

As described in the title, I have a 1080 Ti and a 3090 dedicated to language models. Since I installed the 1080 Ti as the daily driver, AllTalk has been using it to generate, which is significantly slower than with the 3090 alone.

Can the GPU device ID be forced?

erew123 commented 6 months ago

@Urammar You can force Python to use a specific GPU, but it's a system-wide setting:

https://github.com/erew123/alltalk_tts?tab=readme-ov-file#startup-performance-and-compatibility-issues

See the section "I have multiple GPU's and I have problems running Finetuning" for details of the setting.

I don't know of a way to bind just AllTalk (when running as part of Text-gen-webui) to a specific card, as the Coqui scripts don't have that feature/ability.

So applying the above setting would also restrict text-gen-webui to that one card (I believe) if both are running at the same time.

For the next version of AllTalk, I am building a remote extension for text-gen-webui. It will allow you to run AllTalk as a completely separate instance, with separate environment variables for AllTalk and Text-gen-webui, so you could lock one terminal/command prompt/Python environment to one GPU while the other uses both (or is set to the other GPU).

However, that version of AllTalk is a while off yet, as it's still in development.
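To illustrate the idea of separate instances with separate environment variables, here is a minimal Python sketch. It is not part of AllTalk or text-gen-webui: the helper names `env_with_gpus` and `launch` are hypothetical, and the child command is a stand-in that only prints what it can see. The only real mechanism used is the CUDA_VISIBLE_DEVICES environment variable, applied per process.

```python
import os
import subprocess
import sys


def env_with_gpus(gpu_ids, base_env=None):
    """Return a copy of the environment restricted to the given GPU indices."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


def launch(command, gpu_ids):
    """Start `command` as a child process that can only see `gpu_ids`."""
    return subprocess.Popen(command, env=env_with_gpus(gpu_ids))


if __name__ == "__main__":
    # Stand-in child: a real setup would start each application's own
    # launcher here instead (hypothetical, not shown).
    child = [sys.executable, "-c",
             "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"]
    launch(child, [1]).wait()     # this process sees only GPU 1
    launch(child, [0, 1]).wait()  # this process sees both GPUs
```

Because each child gets its own copy of the environment, the restriction on one process does not affect the other, and the parent's own environment is left unchanged.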


Thanks

Nrgte commented 2 months ago

Since it's been quite a while, I wanted to ask whether it's now possible with v2beta (standalone) to load the TTS model onto a specific GPU?

erew123 commented 2 months ago

@Nrgte

Only by setting the environment variable in your command prompt/console, e.g. you would open a command prompt or terminal and set the CUDA_VISIBLE_DEVICES variable for that command prompt/terminal session (it lasts until you close the window). CUDA_VISIBLE_DEVICES applies to anything that uses CUDA and is started from that command prompt/terminal window.

See "I have multiple GPU's and I have problems running Finetuning": https://github.com/erew123/alltalk_tts?tab=readme-ov-file#startup-performance-and-compatibility-issues

NVIDIA's explanation: https://developer.nvidia.com/blog/cuda-pro-tip-control-gpu-visibility-cuda_visible_devices/
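The same variable can also be set from inside a Python script, as long as that happens before anything initializes CUDA. The sketch below is only an illustration under that assumption: `pin_cuda_device` is a hypothetical helper, and the index 1 is an example, not a statement about which card is which on any given machine. Setting CUDA_DEVICE_ORDER to PCI_BUS_ID is optional; it makes the indices follow PCI bus order, which is the order nvidia-smi lists cards in.

```python
import os


def pin_cuda_device(index, env=None):
    """Restrict CUDA to a single GPU by editing environment variables.

    Must run before any library initializes CUDA in this process;
    changing the variables afterwards has no effect on that process.
    """
    env = os.environ if env is None else env
    # Optional: number GPUs by PCI bus order (the order nvidia-smi shows).
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    env["CUDA_VISIBLE_DEVICES"] = str(index)
    return env


if __name__ == "__main__":
    pin_cuda_device(1)  # example index only
    # Import and load CUDA-using libraries only after this point.
    print(os.environ["CUDA_VISIBLE_DEVICES"])
```

Once the mask is in place, the selected card is the only one the process can see, so inside that process it is addressed as device 0.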


Nrgte commented 2 months ago

It worked. After adding SET CUDA_VISIBLE_DEVICES=1 to start_alltalk.bat, it now loads the model onto the correct GPU.

Thanks!