bentoml / OpenLLM

Run any open-source LLM, such as Llama 3.1 or Gemma, as an OpenAI-compatible API endpoint in the cloud.
https://bentoml.com
Apache License 2.0
9.39k stars 597 forks

How to deploy a model using a single-machine, multi-card approach? #1026

Open ttaop opened 4 weeks ago