runpod-workers / worker-vllm

The RunPod worker template for serving our large language model endpoints. Powered by vLLM.

Errors when building the image [Building on macOS] #25

Closed: antonioglass closed this issue 9 months ago

antonioglass commented 10 months ago

I'm building the image with WORKER_CUDA_VERSION=12.1 on an M1 Mac using the command docker buildx build -t antonioglass/worker-vllm-new:1.0.0 . --platform linux/amd64, and I'm getting errors. See the output below.
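For reference, here is the full invocation with the CUDA version passed through as a build argument (this assumes the Dockerfile exposes WORKER_CUDA_VERSION as a build ARG):

```bash
# Build invocation on an M1 Mac (Apple Silicon), assuming the Dockerfile
# accepts WORKER_CUDA_VERSION as a build ARG. --platform linux/amd64 forces
# an x86_64 image, so the whole build runs under emulation, which makes the
# vLLM compilation step both slow and memory-hungry.
docker buildx build \
  --platform linux/amd64 \
  --build-arg WORKER_CUDA_VERSION=12.1 \
  -t antonioglass/worker-vllm-new:1.0.0 \
  .
```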

961.7 Building wheels for collected packages: vllm, quantile-python
961.7   Building editable for vllm (pyproject.toml): started
1311.7   Building editable for vllm (pyproject.toml): still running...
1639.3   Building editable for vllm (pyproject.toml): still running...
1807.1   Building editable for vllm (pyproject.toml): still running...
1876.1   Building editable for vllm (pyproject.toml): still running...
2254.9   Building editable for vllm (pyproject.toml): still running...
2589.2   Building editable for vllm (pyproject.toml): still running...
2626.7   Building editable for vllm (pyproject.toml): finished with status 'error'
2626.9   error: subprocess-exited-with-error
2626.9   
2626.9   × Building editable for vllm (pyproject.toml) did not run successfully.
2626.9   │ exit code: -9
2626.9   ╰─> [87 lines of output]
2626.9       /tmp/pip-build-env-0m__bivn/overlay/local/lib/python3.11/dist-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
2626.9         device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
2626.9       No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
2626.9       running editable_wheel
2626.9       creating /tmp/pip-wheel-jz3moxtu/.tmp-h6a0jx2b/vllm.egg-info
2626.9       writing /tmp/pip-wheel-jz3moxtu/.tmp-h6a0jx2b/vllm.egg-info/PKG-INFO
2626.9       writing dependency_links to /tmp/pip-wheel-jz3moxtu/.tmp-h6a0jx2b/vllm.egg-info/dependency_links.txt
2626.9       writing requirements to /tmp/pip-wheel-jz3moxtu/.tmp-h6a0jx2b/vllm.egg-info/requires.txt
2626.9       writing top-level names to /tmp/pip-wheel-jz3moxtu/.tmp-h6a0jx2b/vllm.egg-info/top_level.txt
2626.9       writing manifest file '/tmp/pip-wheel-jz3moxtu/.tmp-h6a0jx2b/vllm.egg-info/SOURCES.txt'
2626.9       reading manifest file '/tmp/pip-wheel-jz3moxtu/.tmp-h6a0jx2b/vllm.egg-info/SOURCES.txt'
2626.9       reading manifest template 'MANIFEST.in'
2626.9       adding license file 'LICENSE'
2626.9       writing manifest file '/tmp/pip-wheel-jz3moxtu/.tmp-h6a0jx2b/vllm.egg-info/SOURCES.txt'
2626.9       creating '/tmp/pip-wheel-jz3moxtu/.tmp-h6a0jx2b/vllm-0.2.6.dist-info'
2626.9       creating /tmp/pip-wheel-jz3moxtu/.tmp-h6a0jx2b/vllm-0.2.6.dist-info/WHEEL
2626.9       running build_py
2626.9       running build_ext
2626.9       /tmp/pip-build-env-0m__bivn/overlay/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py:424: UserWarning: There are no x86_64-linux-gnu-g++ version bounds defined for CUDA version 12.1
2626.9         warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')

alpayariyak commented 10 months ago

This is due to building on a machine with no GPUs. We're looking for workarounds now.
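Until then, a rough sketch of an interim workaround (not an official fix) is to run the build on a Linux host that has an NVIDIA GPU, for example a cloud GPU instance:

```bash
# Interim workaround sketch: build on a Linux machine with an NVIDIA GPU,
# where a CUDA device is available and the vLLM compilation step completes
# normally instead of being killed. Assumes the same build-arg as above.
git clone https://github.com/runpod-workers/worker-vllm.git
cd worker-vllm
docker build \
  --build-arg WORKER_CUDA_VERSION=12.1 \
  -t antonioglass/worker-vllm-new:1.0.0 \
  .
```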

alpayariyak commented 9 months ago

In the latest version, we have changed the base image to one that already has vLLM compiled, which solves this problem.
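To illustrate the shape of the fix: the build now starts from an image that ships a prebuilt vLLM instead of compiling it during docker build. A minimal sketch (the base image tag and file paths below are illustrative, not necessarily the exact ones we ship):

```dockerfile
# Sketch only: the base image tag is illustrative, not the actual one used
# by worker-vllm. Starting from an image with vLLM already compiled means
# docker build never invokes the CUDA compiler, so the build succeeds on
# hosts without a GPU, including an M1 Mac building linux/amd64 images
# under emulation.
FROM vllm/vllm-openai:latest

# Layer the RunPod worker code on top of the prebuilt runtime
# (handler path is hypothetical).
COPY src /src
CMD ["python3", "/src/handler.py"]
```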