vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Bug]: Dockerfile Build breaks in local #7497

Open palash-fin opened 3 months ago

palash-fin commented 3 months ago

Your current environment

I am using Docker Desktop to build vLLM.

🐛 Describe the bug

Building the image from the Dockerfile in the repo fails at line 114: && python3 setup.py bdist_wheel --dist-dir=dist --py-limited-api=cp38 \

Error:

.98 /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
13.98 /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
13.98 CMakeLists.txt:67 (find_package)
13.98
13.98
13.98 -- Added CUDA NVCC flags for: -gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_89,code=sm_89;-gencode;arch=compute_90,code=sm_90;-gencode;arch=compute_90,code=compute_90
14.00 CMake Warning at /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
14.00 static library kineto_LIBRARY-NOTFOUND not found.
14.00 Call Stack (most recent call first):
14.00 /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:120 (append_torchlib_if_found)
14.00 CMakeLists.txt:67 (find_package)
14.00
14.00
14.01 -- Found Torch: /usr/local/lib/python3.10/dist-packages/torch/lib/libtorch.so
14.01 -- Enabling core extension.
14.01 -- CUDA supported arches: 7.0;7.5;8.0;8.6;8.9;9.0
14.01 -- CUDA target arches: 70-real;75-real;80-real;86-real;89-real;90-real;90-virtual
31.95 -- CMake Version: 3.30.2
31.95 -- CUTLASS 3.5.1
31.95 -- CUDART: /usr/local/cuda/lib64/libcudart.so
31.95 -- CUDA Driver: /usr/local/cuda/lib64/stubs/libcuda.so
31.95 -- NVRTC: /usr/local/cuda/lib64/libnvrtc.so
31.95 -- Default Install Location: install
32.09 -- Found Python3: /usr/bin/python3.10 (found suitable version "3.10.14", minimum required is "3.5") found components: Interpreter
32.10 -- Make cute::tuple be the new standard-layout tuple type
32.10 -- CUDA Compilation Architectures: 70;72;75;80;86;87;89;90;90a
32.10 -- Enable caching of reference results in conv unit tests
32.10 -- Enable rigorous conv problem sizes in conv unit tests
32.10 -- Using NVCC flags: --expt-relaxed-constexpr;-DCUTE_USE_PACKED_TUPLE=1;-DCUTLASS_TEST_LEVEL=0;-DCUTLASS_TEST_ENABLE_CACHED_RESULTS=1;-DCUTLASS_CONV_UNIT_TEST_RIGOROUS_SIZE_ENABLED=1;-DCUTLASS_DEBUG_TRACE_LEVEL=0;-Xcompiler=-Wconversion;-Xcompiler=-fno-strict-aliasing;-lineinfo
32.12 fatal: not a git repository (or any of the parent directories): .git
32.12 -- CUTLASS Revision: Unable to detect, Git returned code 128.
32.13 -- Configuring cublas ...
32.13 -- cuBLAS Disabled.
32.13 -- Configuring cuBLAS ... done.
32.16 -- Enabling C extension.
32.16 -- Enabling moe extension.
32.16 -- Configuring done (28.1s)
32.26 -- Generating done (0.1s)
32.26 -- Build files have been written to: /workspace/build/temp.linux-x86_64-cpython-310
32.29 Using MAX_JOBS=2 as the number of jobs.
32.29 Using NVCC_THREADS=8 as the number of nvcc threads.
47.99 [1/33] Building CXX object CMakeFiles/_core_C.dir/csrc/core/torch_bindings.cpp.o
48.64 [2/33] Linking CXX shared module /workspace/build/lib.linux-x86_64-cpython-310/vllm/_core_C.abi3.so
527.1 [3/33] Building CUDA object CMakeFiles/_C.dir/csrc/cache_kernels.cu.o
527.1 FAILED: CMakeFiles/_C.dir/csrc/cache_kernels.cu.o
527.1 ccache /usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler -DPy_LIMITED_API=3 -DTORCH_EXTENSION_NAME=_C -DUSE_C10D_GLOO -DUSE_C10D_NCCL -DUSE_DISTRIBUTED -DUSE_RPC -DUSE_TENSORPIPE -D_C_EXPORTS -I/workspace/csrc -I/workspace/build/temp.linux-x86_64-cpython-310/_deps/cutlass-src/include -isystem /usr/include/python3.10 -isystem /usr/local/lib/python3.10/dist-packages/torch/include -isystem /usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -isystem /usr/local/cuda/include -DONNX_NAMESPACE=onnx_c2 -Xcudafe --diag_suppress=cc_clobber_ignored,--diag_suppress=field_without_dll_interface,--diag_suppress=base_class_has_different_dll_interface,--diag_suppress=dll_interface_conflict_none_assumed,--diag_suppress=dll_interface_conflict_dllexport_assumed,--diag_suppress=bad_friend_decl --expt-relaxed-constexpr --expt-extended-lambda -O2 -g -DNDEBUG -std=c++17 "--generate-code=arch=compute_70,code=[sm_70]" "--generate-code=arch=compute_75,code=[sm_75]" "--generate-code=arch=compute_80,code=[sm_80]" "--generate-code=arch=compute_86,code=[sm_86]" "--generate-code=arch=compute_89,code=[sm_89]" "--generate-code=arch=compute_90,code=[sm_90]" "--generate-code=arch=compute_90,code=[compute_90]" -Xcompiler=-fPIC --expt-relaxed-constexpr -DENABLE_FP8 --threads=8 -D_GLIBCXX_USE_CXX11_ABI=0 -MD -MT CMakeFiles/_C.dir/csrc/cache_kernels.cu.o -MF CMakeFiles/_C.dir/csrc/cache_kernels.cu.o.d -x cu -c /workspace/csrc/cache_kernels.cu -o CMakeFiles/_C.dir/csrc/cache_kernels.cu.o
527.1 Killed
527.1 Killed
527.1 Killed
527.1 ninja: build stopped: subcommand failed.
527.9 Traceback (most recent call last):
527.9   File "/workspace/setup.py", line 456, in <module>
527.9     setup(
527.9   File "/usr/local/lib/python3.10/dist-packages/setuptools/__init__.py", line 108, in setup
527.9     return distutils.core.setup(**attrs)
527.9   File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/core.py", line 184, in setup
527.9     return run_commands(dist)
527.9   File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/core.py", line 200, in run_commands
527.9     dist.run_commands()
527.9   File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/dist.py", line 964, in run_commands
527.9     self.run_command(cmd)
527.9   File "/usr/local/lib/python3.10/dist-packages/setuptools/dist.py", line 945, in run_command
527.9     super().run_command(command)
527.9   File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/dist.py", line 983, in run_command
527.9     cmd_obj.run()
527.9   File "/usr/local/lib/python3.10/dist-packages/setuptools/command/bdist_wheel.py", line 373, in run
527.9     self.run_command("build")
527.9   File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/cmd.py", line 316, in run_command
527.9     self.distribution.run_command(command)
527.9   File "/usr/local/lib/python3.10/dist-packages/setuptools/dist.py", line 945, in run_command
527.9     super().run_command(command)
527.9   File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/dist.py", line 983, in run_command
527.9     cmd_obj.run()
527.9   File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/command/build.py", line 135, in run
527.9     self.run_command(cmd_name)
527.9   File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/cmd.py", line 316, in run_command
527.9     self.distribution.run_command(command)
527.9   File "/usr/local/lib/python3.10/dist-packages/setuptools/dist.py", line 945, in run_command
527.9     super().run_command(command)
527.9   File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/dist.py", line 983, in run_command
527.9     cmd_obj.run()
527.9   File "/usr/local/lib/python3.10/dist-packages/setuptools/command/build_ext.py", line 93, in run
527.9     _build_ext.run(self)
527.9   File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/command/build_ext.py", line 359, in run
527.9     self.build_extensions()
98 | # if USE_SCCACHE is set, use sccache to speed up compilation
99 | >>> RUN --mount=type=cache,target=/root/.cache/pip \
100 | >>> python3 setup.py bdist_wheel --dist-dir=dist --py-limited-api=cp38
101 | # if [ "$USE_SCCACHE" = "1" ]; then \

ERROR: failed to solve: process "/bin/sh -c python3 setup.py bdist_wheel --dist-dir=dist --py-limited-api=cp38" did not complete successfully: exit code: 1

View build details: docker-desktop://dashboard/build/desktop-linux/desktop-linux/jchr40clxjkbe0gz7fm9gfhv7
PS D:\vllm>
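
A note on reading the log above: the three Killed lines printed right after the failing nvcc command suggest, though they do not prove, that the compiler processes were terminated by the operating system rather than stopped by a compile error. That often happens when the build container runs out of memory, which is an assumption here and not something the log confirms. The sketch below pulls the relevant hints (MAX_JOBS, NVCC_THREADS, failed targets, Killed count) out of a saved build log. The helper name summarize_build_log and the sample text are hypothetical and are not part of vLLM or Docker.

```python
import re


def summarize_build_log(log_text: str) -> dict:
    """Collect hints from a build log that point to killed compiler processes.

    Hypothetical helper for illustration only; it matches the message
    formats that appear in the log pasted in this issue.
    """
    max_jobs = re.search(r"Using MAX_JOBS=(\d+)", log_text)
    nvcc_threads = re.search(r"Using NVCC_THREADS=(\d+)", log_text)
    # ninja prints "FAILED: <target>" for each target that did not build.
    failed_targets = re.findall(r"FAILED: (\S+)", log_text)
    # A bare "Killed" is what the shell reports when a child process
    # receives SIGKILL; count how many times it appears.
    killed_count = len(re.findall(r"\bKilled\b", log_text))
    return {
        "max_jobs": int(max_jobs.group(1)) if max_jobs else None,
        "nvcc_threads": int(nvcc_threads.group(1)) if nvcc_threads else None,
        "failed_targets": failed_targets,
        "killed_count": killed_count,
    }


# Sample lines copied from the log in this issue.
sample = """\
32.29 Using MAX_JOBS=2 as the number of jobs.
32.29 Using NVCC_THREADS=8 as the number of nvcc threads.
527.1 FAILED: CMakeFiles/_C.dir/csrc/cache_kernels.cu.o
527.1 Killed
527.1 Killed
527.1 Killed
527.1 ninja: build stopped: subcommand failed.
"""

summary = summarize_build_log(sample)
print(summary)
```

If killed_count is nonzero while no compiler error message appears before the FAILED line, it may be worth checking how much memory the Docker Desktop VM is allowed to use and whether a lower MAX_JOBS or NVCC_THREADS value changes the outcome.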

github-actions[bot] commented 2 days ago

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!