mudler / LocalAI

:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many other model architectures. Features: text, audio, video, and image generation, voice cloning, and distributed inference.
https://localai.io
MIT License

feat: cuda support in transformers backend #1344

Open · kno10 opened 9 months ago

kno10 commented 9 months ago

The transformers backend does not appear to use the GPU. I could not get any of the settings such as cuda: true or device: cuda to work. In the code, I could not see any of these options being used in the backend either. Maybe copy the relevant lines from the diffusers backend? https://github.com/mudler/LocalAI/blob/9482acfdfc502fcbf29efa7a4feda045d768f95a/backend/python/diffusers/backend_diffusers.py#L223-L233
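For context, a minimal sketch of what hooking this up in the transformers backend could look like, assuming a `request.CUDA` flag like the one the diffusers backend reads (names are illustrative, not the actual upstream code):

```python
import torch
from transformers import AutoModelForCausalLM

# Hedged sketch: mirror the diffusers backend's device placement in the
# transformers backend. `request.CUDA` and `request.Model` are assumed to
# come from LocalAI's gRPC LoadModel request; exact field names may differ.
def load_model(request):
    device = "cuda" if (request.CUDA and torch.cuda.is_available()) else "cpu"
    model = AutoModelForCausalLM.from_pretrained(request.Model)
    return model.to(device), device
```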

mudler commented 9 months ago

> The transformers backend does not appear to use the GPU. I could not get any of the settings such as cuda: true or device: cuda to work. In the code, I could not see any of these options being used in the backend either. Maybe copy the relevant lines from the diffusers backend?
>
> https://github.com/mudler/LocalAI/blob/9482acfdfc502fcbf29efa7a4feda045d768f95a/backend/python/diffusers/backend_diffusers.py#L223-L233

Good point - it was not hooked up, since I could not test it either. Please feel free to take a stab at it.

localai-bot commented 9 months ago

:warning::warning::warning::warning::warning:

Hi! I'm a bot running with LocalAI (a crazy experiment of @mudler) - please beware that I might hallucinate sometimes!

But... I can also be funny or helpful :smile_cat: and I can generally provide good tips on where to look in the documentation or in the code, based on what you wrote in the issue.

Don't engage in conversation with me; I don't support replying (yet)!

:warning::warning::warning::warning::warning:

I see that the issue is related to the transformers backend not using the GPU. To enable GPU usage, you can create a model configuration file in the models directory that sets the cuda parameter to true. Here's an example of such a configuration file:

```yaml
name: transformers_gpu
backend: diffusers
cuda: true
parameters:
  model: transformers
diffusers:
  pipeline_type: StableDiffusionPipeline
  enable_parameters: "negative_prompt,num_inference_steps,clip_skip"
  scheduler_type: "k_dpmpp_sde"
  cfg_scale: 8
  clip_skip: 11
```

This configuration sets the cuda parameter to true for the diffusers backend, enabling GPU usage for the model. You can adjust other parameters such as step, f16, or scheduler_type to suit your needs.
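As a quick sanity check, independent of LocalAI and using plain transformers, you can confirm that a model's weights actually land on the GPU (the model name below is illustrative):

```python
import torch
from transformers import AutoModelForCausalLM

# Standalone sanity check: load a small model and verify its parameters are
# resident on the GPU when CUDA is available.
model = AutoModelForCausalLM.from_pretrained("gpt2")
model = model.to("cuda" if torch.cuda.is_available() else "cpu")
print(next(model.parameters()).device)  # expect "cuda:0" on a GPU machine
```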


LeonSijiaLu commented 4 months ago

I think this issue has been addressed in https://github.com/mudler/LocalAI/blob/master/backend/python/transformers/transformers_server.py#L91, so it looks like we can close it, but please correct me if I missed something @mudler
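For anyone cross-checking: the linked line resolves the device from the load request, and the idea, paraphrased rather than quoted from the upstream source, is roughly:

```python
import torch

# Paraphrase of the idea behind the linked fix (not the exact upstream code):
# honor the CUDA flag from the LoadModel request, falling back to CPU when
# no GPU is available.
def resolve_device(request):
    if request.CUDA and torch.cuda.is_available():
        return "cuda:0"
    return "cpu"
```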