runpod-workers / worker-vllm

The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
MIT License

Docker image is taking too much time to build #37

Closed: hiennef closed this issue 7 months ago

hiennef commented 7 months ago

Hi guys, I am trying to build the Docker image, but it is taking too long during the 'install vllm' step. Is this normal? Is there any solution to resolve it?

python3.11 -m pip install -e git+https://github.com/runpod/vllm-fork-for-sls-worker.git@cuda-11.8#egg=vllm

alpayariyak commented 7 months ago

Just pushed an update to address this, let me know if you’re still facing the issue

hiennef commented 7 months ago

> Just pushed an update to address this, let me know if you’re still facing the issue

https://github.com/runpod-workers/worker-vllm/commit/5aecb628702777356778dbb871d87d790279200d

It seems the branch name is wrong.

Do you mean old-11.8 and old-12.1?

alpayariyak commented 7 months ago

Yes, that’s the one. My sincere apologies!

hiennef commented 7 months ago

It still takes too much time :(((

alpayariyak commented 7 months ago

Try adding something like this before the build and adjust the values based on your system:

# The number of CPU cores you have
ENV MAX_JOBS=48
# Changes depending on GPU
ENV NVCC_THREADS=1024
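
To pick a MAX_JOBS value for the build machine, one option is to read its CPU core count first. The following is only a minimal sketch and is not part of the worker-vllm repository; the helper name detect_max_jobs is hypothetical, and it assumes a system where nproc (GNU coreutils) or getconf is available.

```shell
#!/bin/sh
# Hypothetical helper (not from the repository): print the number of
# CPU cores available on this machine.
# nproc ships with GNU coreutils; getconf is used as a fallback.
detect_max_jobs() {
    nproc 2>/dev/null || getconf _NPROCESSORS_ONLN
}

MAX_JOBS="$(detect_max_jobs)"

# Print the line that could be placed in the Dockerfile before the build step.
echo "ENV MAX_JOBS=${MAX_JOBS}"
```

The printed line can then be copied into the Dockerfile in place of the example value 48. NVCC_THREADS is left out on purpose, since a suitable value for it is not something this sketch can detect.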

alpayariyak commented 7 months ago

Try out 0.2.0. You no longer need to compile vLLM, so the build should be very fast.

hiennef commented 7 months ago

I will try! Thanks for your help!