bentoml / OpenLLM

Run any open-source LLM, such as Llama 3.1 or Gemma, as an OpenAI-compatible API endpoint in the cloud.
https://bentoml.com
Apache License 2.0
9.39k stars, 597 forks

feat(API): add light support for batch inference #1004

Closed: aarnphm closed this 1 month ago

aarnphm commented 1 month ago

Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>