bentoml / OpenLLM

Run any open-source LLM, such as Llama or Mistral, as an OpenAI-compatible API endpoint in the cloud.
https://bentoml.com
Apache License 2.0
10.12k stars, 641 forks

How to deploy a model using a single machine multi card approach? #1026

Open ttaop opened 5 months ago

bojiang commented 3 months ago

For now, this requires following the developer guide (https://github.com/bentoml/OpenLLM/blob/main/DEVELOPMENT.md), but that process is complex.

We are designing an easier way to achieve that.