vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[S-LoRA] Something is wrong with the s-lora arguments in server_api #3174

Open sleepwalker2017 opened 7 months ago

sleepwalker2017 commented 7 months ago

I notice that there is no --lora-modules argument in vllm.entrypoints.api_server, which means I must include the LoRA adapter's local path when sending a request.

That's unrealistic, because the client doesn't know the LoRA path.

Any plan to fix this?

simon-mo commented 7 months ago

Please use vllm.entrypoints.openai.api_server instead, which has similar functionality and an OpenAI-compatible API.
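For illustration, a minimal sketch of this approach: the adapters are registered by name at server startup via --lora-modules, so the client only ever refers to the adapter name, never the local path. The model, adapter names, and paths below are placeholders, not values from this issue.

```python
# Server side (shell), registering adapters by name so clients never see local paths
# (model, names, and paths are placeholders):
#   python -m vllm.entrypoints.openai.api_server \
#       --model meta-llama/Llama-2-7b-hf \
#       --enable-lora \
#       --lora-modules sql-lora=/path/to/sql_lora chat-lora=/path/to/chat_lora

# Client side: request a specific adapter by the name it was registered under.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
completion = client.completions.create(
    model="sql-lora",  # LoRA adapter name, not a local path
    prompt="SELECT the capital of France:",
    max_tokens=32,
)
print(completion.choices[0].text)
```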

sleepwalker2017 commented 7 months ago

> Please use vllm.entrypoints.openai.api_server instead, which has similar functionality and an OpenAI-compatible API.

Thank you. Is there any script to benchmark multi-LoRA serving?

simon-mo commented 7 months ago

I don't believe we have a full benchmark on LoRA. Please check out what we have in benchmarks/ in the repo; contributions are welcome!
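As a rough starting point (not part of benchmarks/), a minimal sketch that round-robins concurrent requests over the registered adapter names against the OpenAI-compatible endpoint and reports output-token throughput. The adapter names, prompt, and request count are placeholders and must match whatever was passed to --lora-modules.

```python
# Minimal multi-LoRA serving benchmark sketch against the OpenAI-compatible server.
# Adapter names below are placeholders; they must match the --lora-modules registrations.
import asyncio
import time

from openai import AsyncOpenAI

ADAPTERS = ["sql-lora", "chat-lora"]  # hypothetical adapter names
NUM_REQUESTS = 64
PROMPT = "Write one sentence about LoRA serving."

client = AsyncOpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")


async def one_request(i: int) -> int:
    # Round-robin over adapters so the server has to serve multiple LoRAs concurrently.
    resp = await client.completions.create(
        model=ADAPTERS[i % len(ADAPTERS)],
        prompt=PROMPT,
        max_tokens=64,
    )
    return resp.usage.completion_tokens


async def main() -> None:
    start = time.perf_counter()
    tokens = await asyncio.gather(*(one_request(i) for i in range(NUM_REQUESTS)))
    elapsed = time.perf_counter() - start
    print(f"{NUM_REQUESTS} requests, {sum(tokens)} output tokens "
          f"in {elapsed:.1f}s -> {sum(tokens) / elapsed:.1f} tok/s")


asyncio.run(main())
```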

sleepwalker2017 commented 7 months ago

> I don't believe we have a full benchmark on LoRA. Please check out what we have in benchmarks/ in the repo; contributions are welcome!

OK, I found some performance issues and posted them here: https://github.com/vllm-project/vllm/issues/3219.