runpod-workers / worker-vllm

The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
MIT License

Error after tokenizer commit #42

Closed by StableFluffy 6 months ago

StableFluffy commented 6 months ago

https://github.com/runpod-workers/worker-vllm/commit/2b5b8dfb61e32d221bc8ce49f98ec74698154a6e

After this commit I can't build my image:


docker build -t instructkr/qwen:1.5_72b_chat --build-arg MODEL_NAME="Qwen/Qwen1.5-72B-Chat-AWQ" --build-arg QUANTIZATION="awq" --build-arg MAX_MODEL_LENGTH="2048" --build-arg MODEL_BASE_PATH="/models" .
[+] Building 784.2s (11/12)                                                                  docker:default
 => [internal] load .dockerignore                                                                      0.0s
 => => transferring context: 2B                                                                        0.0s
 => [internal] load build definition from Dockerfile                                                   0.0s
 => => transferring dockerfile: 1.48kB                                                                 0.0s
 => [internal] load metadata for docker.io/runpod/worker-vllm:base-0.2.2-cuda11.8.0                    1.6s
 => [auth] runpod/worker-vllm:pull token for registry-1.docker.io                                      0.0s
 => [vllm-base 1/7] FROM docker.io/runpod/worker-vllm:base-0.2.2-cuda11.8.0@sha256:645d42b84d914a8daa  0.0s
 => [internal] load build context                                                                      0.0s
 => => transferring context: 275B                                                                      0.0s
 => CACHED [vllm-base 2/7] RUN apt-get update -y     && apt-get install -y python3-pip                 0.0s
 => CACHED [vllm-base 3/7] COPY builder/requirements.txt /requirements.txt                             0.0s
 => CACHED [vllm-base 4/7] RUN --mount=type=cache,target=/root/.cache/pip     python3 -m pip install   0.0s
 => CACHED [vllm-base 5/7] COPY builder/download_model.py /download_model.py                           0.0s
 => ERROR [vllm-base 6/7] RUN --mount=type=secret,id=HF_TOKEN,required=false     if [ -f /run/secre  782.5s
------
 > [vllm-base 6/7] RUN --mount=type=secret,id=HF_TOKEN,required=false     if [ -f /run/secrets/HF_TOKEN ]; then         export HF_TOKEN=$(cat /run/secrets/HF_TOKEN);     fi &&     if [ -n "Qwen/Qwen1.5-72B-Chat-AWQ" ]; then         python3 /download_model.py;     fi:
2.724 INFO 02-07 07:13:43 weight_utils.py:164] Using model weights format ['*.safetensors']
model-00008-of-00011.safetensors: 100%|██████████| 3.98G/3.98G [09:14<00:00, 7.17MB/s]
model-00007-of-00011.safetensors: 100%|██████████| 3.94G/3.94G [09:30<00:00, 6.90MB/s]
model-00006-of-00011.safetensors: 100%|██████████| 3.98G/3.98G [09:45<00:00, 6.79MB/s]
model-00003-of-00011.safetensors: 100%|██████████| 3.94G/3.94G [09:55<00:00, 6.62MB/s]
model-00004-of-00011.safetensors: 100%|██████████| 3.98G/3.98G [09:57<00:00, 6.65MB/s]
model-00001-of-00011.safetensors: 100%|██████████| 3.99G/3.99G [10:02<00:00, 6.63MB/s]
model-00005-of-00011.safetensors: 100%|██████████| 3.98G/3.98G [10:18<00:00, 6.42MB/s]
model-00002-of-00011.safetensors: 100%|██████████| 3.98G/3.98G [10:35<00:00, 6.25MB/s]
model-00011-of-00011.safetensors: 100%|██████████| 2.49G/2.49G [02:24<00:00, 17.3MB/s]
model-00010-of-00011.safetensors: 100%|██████████| 3.03G/3.03G [02:57<00:00, 17.1MB/s]
model-00009-of-00011.safetensors: 100%|██████████| 3.98G/3.98G [03:38<00:00, 18.2MB/s]
config.json: 100%|██████████| 841/841 [00:00<00:00, 6.97MB/s]
generation_config.json: 100%|██████████| 217/217 [00:00<00:00, 1.53MB/s]
quant_config.json: 100%|██████████| 126/126 [00:00<00:00, 826kB/s]
model.safetensors.index.json: 100%|██████████| 179k/179k [00:00<00:00, 484kB/s]
tokenizer_config.json: 100%|██████████| 1.41k/1.41k [00:00<00:00, 8.28MB/s]
vocab.json: 100%|██████████| 2.78M/2.78M [00:00<00:00, 6.78MB/s]
tokenizer.json: 100%|██████████| 7.03M/7.03M [00:01<00:00, 5.28MB/s]
781.7 Traceback (most recent call last):
781.7   File "/download_model.py", line 48, in <module>
781.7     tokenizer_folder = download_extras_or_tokenizer(tokenizer, download_dir, revisions["tokenizer"])
781.7   File "/download_model.py", line 10, in download_extras_or_tokenizer
781.7     folder = snapshot_download(
781.7   File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 110, in _inner_fn
781.7     validate_repo_id(arg_value)
781.7   File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 164, in validate_repo_id
781.7     raise HFValidationError(
781.7 huggingface_hub.utils._validators.HFValidationError: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: ''.
------
Dockerfile:35
--------------------
  34 |     COPY builder/download_model.py /download_model.py
  35 | >>> RUN --mount=type=secret,id=HF_TOKEN,required=false \
  36 | >>>     if [ -f /run/secrets/HF_TOKEN ]; then \
  37 | >>>         export HF_TOKEN=$(cat /run/secrets/HF_TOKEN); \
  38 | >>>     fi && \
  39 | >>>     if [ -n "$MODEL_NAME" ]; then \
  40 | >>>         python3 /download_model.py; \
  41 | >>>     fi
  42 |
--------------------
ERROR: failed to solve: process "/bin/sh -c if [ -f /run/secrets/HF_TOKEN ]; then         export HF_TOKEN=$(cat /run/secrets/HF_TOKEN);     fi &&     if [ -n \"$MODEL_NAME\" ]; then         python3 /download_model.py;     fi" did not complete successfully: exit code: 1
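The traceback shows `snapshot_download()` being handed an empty repo id for the tokenizer, which `validate_repo_id()` rejects (alphanumerics, `-`, `_`, `.` only; non-empty; max length 96). A minimal sketch of one possible guard, assuming the script falls back to the model repo when no tokenizer name is set (the function and argument names here are illustrative, not the actual fix in the repo):

```python
from typing import Optional


def resolve_tokenizer_repo(tokenizer_name: Optional[str], model_name: str) -> str:
    """Return a non-empty repo id for the tokenizer download.

    An unset TOKENIZER_NAME build arg typically arrives as "" in the
    container, and huggingface_hub's validate_repo_id() raises
    HFValidationError on the empty string, which matches the build log.
    """
    if tokenizer_name:  # covers both None and ""
        return tokenizer_name
    return model_name  # assumed fallback: reuse the model repo


# With no tokenizer override, the model repo is used instead.
print(resolve_tokenizer_repo("", "Qwen/Qwen1.5-72B-Chat-AWQ"))
```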
alpayariyak commented 6 months ago

Should be fixed now
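For anyone rebuilding after the fix: the Dockerfile's `--mount=type=secret,id=HF_TOKEN` expects the token to be supplied at build time through BuildKit, so a build might look like this (the token file name is illustrative):

```shell
# Write the Hugging Face token to a file; BuildKit secrets never enter image layers.
echo "hf_xxx" > hf_token.txt   # placeholder token

# BuildKit mounts the secret at /run/secrets/HF_TOKEN during the RUN step.
DOCKER_BUILDKIT=1 docker build \
  --secret id=HF_TOKEN,src=hf_token.txt \
  --build-arg MODEL_NAME="Qwen/Qwen1.5-72B-Chat-AWQ" \
  --build-arg QUANTIZATION="awq" \
  --build-arg MAX_MODEL_LENGTH="2048" \
  --build-arg MODEL_BASE_PATH="/models" \
  -t instructkr/qwen:1.5_72b_chat .
```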