sleepwalker2017 opened this issue 7 months ago
Please use `vllm.entrypoints.openai.api_server` instead, which has similar functionality and an OpenAI-compatible API.
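For reference, a minimal launch sketch (untested). It assumes a recent vLLM build where `--enable-lora` and `--lora-modules` are available; the base model and adapter path are placeholders:

```shell
# Serve a base model plus a named LoRA adapter via the OpenAI-compatible server.
# The adapter name left of '=' is what clients pass as "model";
# the filesystem path on the right stays server-side.
python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Llama-2-7b-hf \
    --enable-lora \
    --lora-modules sql-lora=/path/to/sql_lora_adapter
```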
Thank you. Is there any script to benchmark multi-LoRA serving?
I don't believe we have a full benchmark for LoRA. Please check out what we have in `benchmarks/` in the repo; contributions are welcome!
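For anyone picking this up, here is a rough sketch of what such a benchmark could look like, untested. The endpoint URL, adapter names, prompt, and request counts are all assumptions; the adapters must already be registered at server startup via `--lora-modules`:

```python
# Minimal multi-LoRA benchmark sketch (untested). Assumes an OpenAI-compatible
# vLLM server on localhost:8000 with the adapter names below registered via
# --lora-modules; adjust names, prompt, and counts to your setup.
import asyncio
import time

import aiohttp

URL = "http://localhost:8000/v1/completions"
ADAPTERS = ["lora-a", "lora-b"]  # hypothetical adapter names
PROMPT = "Hello, my name is"
REQUESTS_PER_ADAPTER = 16

async def one_request(session, model):
    # Time a single completion request against one adapter.
    t0 = time.perf_counter()
    payload = {"model": model, "prompt": PROMPT, "max_tokens": 64}
    async with session.post(URL, json=payload) as resp:
        await resp.json()
    return time.perf_counter() - t0

async def main():
    async with aiohttp.ClientSession() as session:
        # Interleave requests across all adapters to exercise multi-LoRA serving.
        tasks = [
            one_request(session, name)
            for name in ADAPTERS
            for _ in range(REQUESTS_PER_ADAPTER)
        ]
        t0 = time.perf_counter()
        latencies = await asyncio.gather(*tasks)
        wall = time.perf_counter() - t0
    n = len(latencies)
    print(f"{n} requests in {wall:.2f}s ({n / wall:.1f} req/s), "
          f"mean latency {sum(latencies) / n:.2f}s")

if __name__ == "__main__":
    asyncio.run(main())
```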
OK, I found some performance issues and posted them here: https://github.com/vllm-project/vllm/issues/3219.
I notice that there is no `--lora-modules` argument in `vllm.entrypoints.api_server`, which means I must include the local LoRA path when sending a request. That's unrealistic, because the client doesn't know the LoRA path.
Any plan to fix it?
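For context: with the OpenAI-compatible server, the client selects an adapter by the name registered at startup and never needs the local path. A rough sketch, untested; the URL and the hypothetical adapter name `sql-lora` match the launch example above:

```python
# Sketch of a completion request that selects a LoRA adapter by name
# (untested; "sql-lora" must match a name registered via --lora-modules).
import requests

resp = requests.post(
    "http://localhost:8000/v1/completions",
    json={
        "model": "sql-lora",        # adapter name, not a filesystem path
        "prompt": "List all users:",
        "max_tokens": 32,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["text"])
```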