mudler / LocalAI

:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed inference
https://localai.io
MIT License
23.37k stars 1.78k forks

Support DeepSpeed FastGen #1538

Open thiner opened 8 months ago

thiner commented 8 months ago

Is your feature request related to a problem? Please describe.

No.

Describe the solution you'd like

DeepSpeed FastGen is an inference framework developed by Microsoft. They claim that it is two times faster than vLLM. https://github.com/microsoft/DeepSpeed/tree/master/blogs/deepspeed-fastgen

Describe alternatives you've considered

No.

Additional context

I haven't tested FastGen; I was just drawn in by their blog post. I searched this repo and it seems no one has mentioned this framework yet, so I'd like to bring it to the community's attention.

thiner commented 8 months ago

Glad to see you have added it to the roadmap.

mudler commented 8 months ago

Sounds like a solid backend to have, thanks for the tip :+1: Good to see that there is interest in this backend being added. Definitely a good addition for LocalAI.