runpod-workers / worker-vllm

The RunPod worker template for serving our large language model endpoints. Powered by vLLM.

Meta-Llama-3.1-8B support #97

Open klipach opened 1 month ago

klipach commented 1 month ago

I'm trying to build Llama 3.1 and Llama 3.1 Instruct, but the build always fails (on latest main and on v1.2.0). Are these models not supported yet?

Llama 3 and Llama 3 Instruct don't work either.

I'm building the image on a MacBook M2:

sudo HF_TOKEN="****" docker build -t username/llama-3.1-8b-instruct --secret id=HF_TOKEN --build-arg MODEL_NAME="meta-llama/Meta-Llama-3.1-8B-Instruct" --build-arg BASE_PATH="/models" .

Output:

0.401   Downloading vllm-0.5.4.tar.gz (958 kB)
0.475      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 958.6/958.6 kB 16.7 MB/s eta 0:00:00
0.569   Installing build dependencies: started
12.17   Installing build dependencies: finished with status 'done'
12.17   Getting requirements to build wheel: started
12.95   Getting requirements to build wheel: finished with status 'error'
12.95   error: subprocess-exited-with-error
12.95
12.95   × Getting requirements to build wheel did not run successfully.
12.95   │ exit code: 1
12.95   ╰─> [20 lines of output]
12.95       /tmp/pip-build-env-87fyorwz/overlay/local/lib/python3.10/dist-packages/torch/_subclasses/functional_tensor.py:258: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:84.)
12.95         cpu = _conversion_method_template(device=torch.device("cpu"))
12.95       <string>:56: RuntimeWarning: Failed to embed commit hash:
12.95       [Errno 2] No such file or directory: 'git'
12.95       Traceback (most recent call last):
12.95         File "/usr/local/lib/python3.10/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
12.95           main()
12.95         File "/usr/local/lib/python3.10/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
12.95           json_out['return_val'] = hook(**hook_input['kwargs'])
12.95         File "/usr/local/lib/python3.10/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
12.95           return hook(config_settings)
12.95         File "/tmp/pip-build-env-87fyorwz/overlay/local/lib/python3.10/dist-packages/setuptools/build_meta.py", line 327, in get_requires_for_build_wheel
12.95           return self._get_build_requires(config_settings, requirements=[])
12.95         File "/tmp/pip-build-env-87fyorwz/overlay/local/lib/python3.10/dist-packages/setuptools/build_meta.py", line 297, in _get_build_requires
12.95           self.run_setup()
12.95         File "/tmp/pip-build-env-87fyorwz/overlay/local/lib/python3.10/dist-packages/setuptools/build_meta.py", line 313, in run_setup
12.95           exec(code, locals())
12.95         File "<string>", line 458, in <module>
12.95         File "<string>", line 379, in get_vllm_version
12.95       RuntimeError: Unknown runtime environment
12.95       [end of output]
12.95
12.95   note: This error originates from a subprocess, and is likely not a problem with pip.
13.06 error: subprocess-exited-with-error
13.06
13.06 × Getting requirements to build wheel did not run successfully.
13.06 │ exit code: 1
13.06 ╰─> See above for output.
13.06
13.06 note: This error originates from a subprocess, and is likely not a problem with pip.
------
Dockerfile:15
--------------------
  14 |     # Install vLLM (switching back to pip installs since issues that required building fork are fixed and space optimization is not as important since caching) and FlashInfer
  15 | >>> RUN python3 -m pip install vllm==0.5.4 && \
  16 | >>>     python3 -m pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.3
  17 |
--------------------
ERROR: failed to solve: process "/bin/sh -c python3 -m pip install vllm==0.5.4 &&     python3 -m pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.3" did not complete successfully: exit code: 1
raihanfadhilah commented 2 weeks ago

I'm currently getting this too. Any luck on your end?

raihanfadhilah commented 2 weeks ago

I found the solution: add --platform linux/amd64 to your build command. The problem is that we're building on Apple Silicon, so Docker defaults to an arm64 build environment, and vLLM's setup fails there (the RuntimeError: Unknown runtime environment in the traceback above).
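
For reference, here is the build command from the original report with the platform flag added (image tag and HF_TOKEN placeholder kept exactly as in the original):

sudo HF_TOKEN="****" docker build --platform linux/amd64 -t username/llama-3.1-8b-instruct --secret id=HF_TOKEN --build-arg MODEL_NAME="meta-llama/Meta-Llama-3.1-8B-Instruct" --build-arg BASE_PATH="/models" .

You can confirm the resulting image targets the right architecture with docker image inspect --format '{{.Architecture}}' username/llama-3.1-8b-instruct, which should print amd64. Alternatively, exporting DOCKER_DEFAULT_PLATFORM=linux/amd64 in your shell applies the same default to every build without editing each command.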