aws / deep-learning-containers

AWS Deep Learning Containers are pre-built Docker images that make it easier to run popular deep learning frameworks and tools on AWS.
https://docs.aws.amazon.com/deep-learning-containers/latest/devguide/what-is-dlc.html

[feature-request] Update vLLM library in LMI containers to v0.6.0 #4240

Open CoolFish88 opened 1 month ago

CoolFish88 commented 1 month ago

Concise Description:

vLLM v0.6.0 provides a 2.7x throughput improvement and a 5x latency reduction over the previous version (v0.5.3).

DLC image/dockerfile:

763104351884.dkr.ecr.us-west-2.amazonaws.com/djl-inference:0.29.0-lmi11.0.0-cu124
763104351884.dkr.ecr.us-west-2.amazonaws.com/djl-inference:0.29.0-neuronx-sdk2.19.1

Is your feature request related to a problem? Please describe.

Improve the performance of LMI containers.

Describe the solution you'd like

Update the vLLM library in LMI containers to v0.6.0.

siddvenk commented 1 month ago

We are planning a release that will include vllm 0.6.2 within the next 2 weeks. In the meantime, you can provide a requirements.txt with vllm==0.6.x to pick up a later version of vllm that way. If you go this route, you should also set the OPTION_ROLLING_BATCH=vllm environment variable to force the use of vllm.
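
The workaround in the comment above could be prepared roughly as follows. This is a minimal sketch, not an official procedure. It assumes the container installs a requirements.txt placed alongside the model artifacts. The helper name `write_vllm_override`, the `model_dir` layout, and the concrete pin `vllm==0.6.0` are illustrative choices, not taken from the thread. Only the requirements.txt file and the OPTION_ROLLING_BATCH=vllm setting come from the comment itself.

```python
from pathlib import Path


def write_vllm_override(model_dir: Path, vllm_spec: str = "vllm==0.6.0") -> dict:
    """Write a requirements.txt pinning vLLM and return the container env vars.

    Assumption: the LMI container picks up requirements.txt from the model
    directory. The default pin is only an example of a 0.6.x release.
    """
    model_dir.mkdir(parents=True, exist_ok=True)
    # One requirement per line, terminated by a newline.
    (model_dir / "requirements.txt").write_text(vllm_spec + "\n")
    # Environment variable named in the comment to force the vLLM backend.
    return {"OPTION_ROLLING_BATCH": "vllm"}


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        env = write_vllm_override(Path(tmp) / "model")
        print((Path(tmp) / "model" / "requirements.txt").read_text().strip())
        print(env)
```

The returned dictionary would then be passed as the container environment by whichever deployment tool is in use, and the model directory packaged with the rest of the model artifacts.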