runpod-workers / worker-vllm

The RunPod worker template for serving our large language model endpoints. Powered by vLLM.

HF Model Download gets stuck #21

Closed satpalsr closed 8 months ago

satpalsr commented 9 months ago

The model download gets stuck at around 1-3% while building the Docker image and doesn't move forward. This happens with different models too and wasn't happening earlier. Outside of the Docker image build, I am able to download the models.

sudo docker build -t username/image:tag --build-arg MODEL_NAME="openchat/openchat_3.5" --build-arg MODEL_BASE_PATH="/models" .
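
For comparison, this is roughly how I verify that the same weights download fine outside the build (a minimal sketch using huggingface_hub; the target directory is just an example):

# verify_download.py - standalone check that the HF download itself works
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="openchat/openchat_3.5",
    local_dir="/models/openchat_3.5",  # example target directory
)
print(f"Model files downloaded to {local_path}")
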
alpayariyak commented 8 months ago

Could you try again please?

satpalsr commented 8 months ago

The vllm installation is having problems, which causes the model download to fail.

Traceback (most recent call last):
0.390   File "/download_model.py", line 3, in <module>
0.390     from vllm.model_executor.weight_utils import prepare_hf_model_weights
0.390 ModuleNotFoundError: No module named 'vllm'
------
Dockerfile:40
--------------------
  39 |     ARG QUANTIZATION=""
  40 | >>> RUN if [ -n "$MODEL_NAME" ]; then \
  41 | >>>         export MODEL_BASE_PATH=$MODEL_BASE_PATH && \
  42 | >>>         export MODEL_NAME=$MODEL_NAME && \
  43 | >>>         python3.11 /download_model.py --model $MODEL_NAME; \
  44 | >>>     fi && \
  45 | >>>     if [ -n "$QUANTIZATION" ]; then \
  46 | >>>         export QUANTIZATION=$QUANTIZATION; \
  47 | >>>     fi
  48 |     

This line executes pip install -e git+https://github.com/alpayariyak/vllm.git@cuda-11.8#egg=vllm but exits without actually installing vllm.
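
To make the missing install surface right away rather than at the download step, a tiny check can be run in the image immediately after the pip install (a hypothetical check_vllm.py, not part of this repo):

# check_vllm.py - hypothetical sanity check, not part of the repo:
# fail fast if the editable vllm install did not actually succeed.
import importlib.util

spec = importlib.util.find_spec("vllm")
if spec is None:
    raise SystemExit("vllm is not importable; the pip install -e step failed")
print(f"vllm found at {spec.origin}")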

alpayariyak commented 8 months ago

Fixed: we had just migrated our repo to https://github.com/runpod/vllm-fork-for-sls-worker. Try now.

satpalsr commented 8 months ago

The same issue persists with the vllm installation.

sudo docker build --no-cache -t username/tag:v0.1 --build-arg MODEL_NAME="openchat/openchat_3.5" --build-arg MODEL_BASE_PATH="/models" .
Obtaining vllm from git+https://github.com/runpod/vllm-fork-for-sls-worker.git@cuda-11.8#egg=vllm
Cloning https://github.com/runpod/vllm-fork-for-sls-worker.git (to revision cuda-11.8) to /src/vllm
Running command git clone --filter=blob:none --quiet https://github.com/runpod/vllm-fork-for-sls-worker.git /src/vllm
Running command git checkout -b cuda-11.8 --track origin/cuda-11.8
...
# note: This error originates from a subprocess, and is likely not a problem with pip.
# error: metadata-generation-failed
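
For reference, the same editable install can be reproduced outside Docker to surface the full metadata-generation error (a rough sketch; the repo URL and branch are copied from the log above):

# repro_pip_install.py - rough sketch: rerun the failing editable install
# outside Docker and print its full output.
import subprocess
import sys

cmd = [
    sys.executable, "-m", "pip", "install", "-e",
    "git+https://github.com/runpod/vllm-fork-for-sls-worker.git@cuda-11.8#egg=vllm",
]
result = subprocess.run(cmd, capture_output=True, text=True)
print(result.stdout)
print(result.stderr, file=sys.stderr)
print("exit code:", result.returncode)
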
alpayariyak commented 8 months ago

Works for me with the exact same command, with cache off and everything

satpalsr commented 8 months ago

I created an AWS instance and ran this; I get the same issues listed above as I do locally.

alpayariyak commented 8 months ago

Fixed with the latest version.