vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Installation]: Couldn't find CUDA library root. #6134

Open · CodexDive opened this issue 1 month ago

CodexDive commented 1 month ago

Your current environment

The output of `python collect_env.py`

How you are installing vllm

I install vLLM from source:

pip install -e .

but I encounter the error "Couldn't find CUDA library root.", which is likely caused by my CUDA environment.
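As a first check (a minimal sketch, assuming the toolkit really lives at /usr/local/cuda-12.0, the path the CMake log below reports), it can help to confirm that the build environment sees a complete toolkit and not just nvcc:

which nvcc
nvcc --version
ls /usr/local/cuda-12.0/bin/nvcc
ls /usr/local/cuda-12.0/nvvm/libdevice   # CMake's toolkit detection typically also wants nvvm/libdevice and the version file next to nvcc
ls /usr/local/cuda-12.0/version.json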

CodexDive commented 1 month ago

Building wheels for collected packages: vllm
  Building editable for vllm (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building editable for vllm (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [139 lines of output]
  running editable_wheel
  creating /tmp/pip-wheel-dgmr7pu3/.tmp-i2ltqo49/vllm.egg-info
  writing /tmp/pip-wheel-dgmr7pu3/.tmp-i2ltqo49/vllm.egg-info/PKG-INFO
  writing dependency_links to /tmp/pip-wheel-dgmr7pu3/.tmp-i2ltqo49/vllm.egg-info/dependency_links.txt
  writing requirements to /tmp/pip-wheel-dgmr7pu3/.tmp-i2ltqo49/vllm.egg-info/requires.txt
  writing top-level names to /tmp/pip-wheel-dgmr7pu3/.tmp-i2ltqo49/vllm.egg-info/top_level.txt
  writing manifest file '/tmp/pip-wheel-dgmr7pu3/.tmp-i2ltqo49/vllm.egg-info/SOURCES.txt'
  reading manifest file '/tmp/pip-wheel-dgmr7pu3/.tmp-i2ltqo49/vllm.egg-info/SOURCES.txt'
  reading manifest template 'MANIFEST.in'
  adding license file 'LICENSE'
  writing manifest file '/tmp/pip-wheel-dgmr7pu3/.tmp-i2ltqo49/vllm.egg-info/SOURCES.txt'
  creating '/tmp/pip-wheel-dgmr7pu3/.tmp-i2ltqo49/vllm-0.4.2+cu120.dist-info'
  creating /tmp/pip-wheel-dgmr7pu3/.tmp-i2ltqo49/vllm-0.4.2+cu120.dist-info/WHEEL
  running build_py
  running build_ext
  -- The CXX compiler identification is GNU 9.4.0
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: /usr/bin/c++ - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Build type: RelWithDebInfo
  -- Target device: cuda
  -- Found Python: /home/yuzailiang/anaconda3/envs/llms_inference/bin/python3.9 (found version "3.9.19") found components: Interpreter Development.Module
  -- Found python matching: /home/yuzailiang/anaconda3/envs/llms_inference/bin/python3.9.
  -- Found CUDA: /usr/local/cuda-12.0 (found version "12.0")
  CMake Error at /tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/cmake/data/share/cmake-3.30/Modules/Internal/CMakeCUDAFindToolkit.cmake:148 (message):
    Couldn't find CUDA library root.
  Call Stack (most recent call first):
    /tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/cmake/data/share/cmake-3.30/Modules/CMakeDetermineCUDACompiler.cmake:85 (cmake_cuda_find_toolkit)
    /tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:47 (enable_language)
    /tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:87 (include)
    /tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
    CMakeLists.txt:67 (find_package)

  -- Configuring incomplete, errors occurred!
  Traceback (most recent call last):
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/command/editable_wheel.py", line 153, in run
      self._create_wheel_file(bdist_wheel)
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/command/editable_wheel.py", line 355, in _create_wheel_file
      files, mapping = self._run_build_commands(dist_name, unpacked, lib, tmp)
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/command/editable_wheel.py", line 278, in _run_build_commands
      self._run_build_subcommands()
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/command/editable_wheel.py", line 305, in _run_build_subcommands
      self.run_command(name)
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
      self.distribution.run_command(command)
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/dist.py", line 974, in run_command
      super().run_command(command)
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 989, in run_command
      cmd_obj.run()
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/command/build_ext.py", line 93, in run
      _build_ext.run(self)
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/_distutils/command/build_ext.py", line 359, in run
      self.build_extensions()
    File "<string>", line 192, in build_extensions
    File "<string>", line 175, in configure
    File "/home/yuzailiang/anaconda3/envs/llms_inference/lib/python3.9/subprocess.py", line 373, in check_call
      raise CalledProcessError(retcode, cmd)
  subprocess.CalledProcessError: Command '['cmake', '/mnt/self-define/sunning/lmdeploy/LLMs_Inference', '-G', 'Ninja', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/tmp/tmp3jv40t2d.build-lib/vllm', '-DCMAKE_ARCHIVE_OUTPUT_DIRECTORY=/tmp/tmpdbbjly8m.build-temp', '-DVLLM_TARGET_DEVICE=cuda', '-DCMAKE_CXX_COMPILER_LAUNCHER=ccache', '-DCMAKE_CUDA_COMPILER_LAUNCHER=ccache', '-DVLLM_PYTHON_EXECUTABLE=/home/yuzailiang/anaconda3/envs/llms_inference/bin/python3.9', '-DNVCC_THREADS=1', '-DCMAKE_JOB_POOL_COMPILE:STRING=compile', '-DCMAKE_JOB_POOLS:STRING=compile=256']' returned non-zero exit status 1.
  /tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/_distutils/dist.py:989: _DebuggingTips: Problem in editable installation.
  !!

          ********************************************************************************
          An error happened while installing `vllm` in editable mode.

          The following steps are recommended to help debug this problem:

          - Try to install the project normally, without using the editable mode.
            Does the error still persist?
            (If it does, try fixing the problem before attempting the editable mode).
          - If you are using binary extensions, make sure you have all OS-level
            dependencies installed (e.g. compilers, toolchains, binary libraries, ...).
          - Try the latest version of setuptools (maybe the error was already fixed).
          - If you (or your project dependencies) are using any setuptools extension
            or customization, make sure they support the editable mode.

          After following the steps above, if the problem still persists and
          you think this is related to how setuptools handles editable installations,
          please submit a reproducible example
          (see https://stackoverflow.com/help/minimal-reproducible-example) to:

              https://github.com/pypa/setuptools/issues

          See https://setuptools.pypa.io/en/latest/userguide/development_mode.html for details.
          ********************************************************************************

  !!
    cmd_obj.run()
  Traceback (most recent call last):
    File "/home/yuzailiang/anaconda3/envs/llms_inference/lib/python3.9/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
      main()
    File "/home/yuzailiang/anaconda3/envs/llms_inference/lib/python3.9/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
      json_out['return_val'] = hook(**hook_input['kwargs'])
    File "/home/yuzailiang/anaconda3/envs/llms_inference/lib/python3.9/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 273, in build_editable
      return hook(wheel_directory, config_settings, metadata_directory)
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 453, in build_editable
      return self._build_with_temp_dir(
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 397, in _build_with_temp_dir
      self.run_setup()
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 313, in run_setup
      exec(code, locals())
    File "<string>", line 401, in <module>
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/__init__.py", line 103, in setup
      return distutils.core.setup(**attrs)
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/_distutils/core.py", line 184, in setup
      return run_commands(dist)
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/_distutils/core.py", line 200, in run_commands
      dist.run_commands()
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 970, in run_commands
      self.run_command(cmd)
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/dist.py", line 974, in run_command
      super().run_command(command)
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 989, in run_command
      cmd_obj.run()
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/command/editable_wheel.py", line 153, in run
      self._create_wheel_file(bdist_wheel)
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/command/editable_wheel.py", line 355, in _create_wheel_file
      files, mapping = self._run_build_commands(dist_name, unpacked, lib, tmp)
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/command/editable_wheel.py", line 278, in _run_build_commands
      self._run_build_subcommands()
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/command/editable_wheel.py", line 305, in _run_build_subcommands
      self.run_command(name)
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
      self.distribution.run_command(command)
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/dist.py", line 974, in run_command
      super().run_command(command)
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 989, in run_command
      cmd_obj.run()
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/command/build_ext.py", line 93, in run
      _build_ext.run(self)
    File "/tmp/pip-build-env-dj1l8ntc/overlay/lib/python3.9/site-packages/setuptools/_distutils/command/build_ext.py", line 359, in run
      self.build_extensions()
    File "<string>", line 192, in build_extensions
    File "<string>", line 175, in configure
    File "/home/yuzailiang/anaconda3/envs/llms_inference/lib/python3.9/subprocess.py", line 373, in check_call
      raise CalledProcessError(retcode, cmd)
  subprocess.CalledProcessError: Command '['cmake', '/mnt/self-define/sunning/lmdeploy/LLMs_Inference', '-G', 'Ninja', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/tmp/tmp3jv40t2d.build-lib/vllm', '-DCMAKE_ARCHIVE_OUTPUT_DIRECTORY=/tmp/tmpdbbjly8m.build-temp', '-DVLLM_TARGET_DEVICE=cuda', '-DCMAKE_CXX_COMPILER_LAUNCHER=ccache', '-DCMAKE_CUDA_COMPILER_LAUNCHER=ccache', '-DVLLM_PYTHON_EXECUTABLE=/home/yuzailiang/anaconda3/envs/llms_inference/bin/python3.9', '-DNVCC_THREADS=1', '-DCMAKE_JOB_POOL_COMPILE:STRING=compile', '-DCMAKE_JOB_POOLS:STRING=compile=256']' returned non-zero exit status 1.
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building editable for vllm
Failed to build vllm
ERROR: Could not build wheels for vllm, which is required to install pyproject.toml-based projects
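A common workaround for this class of CMake failure (not a confirmed fix; the cuda-12.0 path below is assumed from the log above) is to point the build explicitly at the toolkit before retrying the editable install:

export CUDA_HOME=/usr/local/cuda-12.0
export PATH="$CUDA_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64:$LD_LIBRARY_PATH"
export CUDACXX="$CUDA_HOME/bin/nvcc"   # CMake falls back to the CUDACXX environment variable when no CUDA compiler is set explicitly
pip install -e .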

CodexDive commented 1 month ago

(llms_inference) yuzailiang@ubuntu:/mnt/self-define/sunning/lmdeploy/LLMs_Inference$ python collect_env.py
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.6 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
Clang version: Could not collect
CMake version: version 3.30.0
Libc version: glibc-2.31

Python version: 3.9.19 | packaged by conda-forge | (main, Mar 20 2024, 12:50:21) [GCC 12.3.0] (64-bit runtime)
Python platform: Linux-5.4.0-187-generic-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: 12.0.140
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration:
GPU 0: NVIDIA A100-SXM4-40GB
GPU 1: NVIDIA A100-SXM4-40GB
GPU 2: NVIDIA A100-SXM4-40GB
GPU 3: NVIDIA A100-SXM4-40GB
GPU 4: NVIDIA A100-SXM4-40GB
GPU 5: NVIDIA A100-SXM4-40GB
GPU 6: NVIDIA A100-SXM4-40GB
GPU 7: NVIDIA A100-SXM4-40GB

Nvidia driver version: 550.54.14
cuDNN version: Probably one of the following:
/etc/alternatives/libcudnn_cnn_infer_so
/etc/alternatives/libcudnn_cnn_train_so
/etc/alternatives/libcudnn_ops_infer_so
/etc/alternatives/libcudnn_ops_train_so
/usr/lib/x86_64-linux-gnu/libcudnn.so.9.0.0
/usr/lib/x86_64-linux-gnu/libcudnn_adv.so.9.0.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn.so.9.0.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8
/usr/lib/x86_64-linux-gnu/libcudnn_engines_precompiled.so.9.0.0
/usr/lib/x86_64-linux-gnu/libcudnn_engines_runtime_compiled.so.9.0.0
/usr/lib/x86_64-linux-gnu/libcudnn_graph.so.9.0.0
/usr/lib/x86_64-linux-gnu/libcudnn_heuristic.so.9.0.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops.so.9.0.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8
/usr/local/cuda-12.0/targets/x86_64-linux/lib/libcudnn.so.8.9.7
/usr/local/cuda-12.0/targets/x86_64-linux/lib/libcudnn_adv_infer.so.8.9.7
/usr/local/cuda-12.0/targets/x86_64-linux/lib/libcudnn_adv_train.so.8.9.7
/usr/local/cuda-12.0/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8.9.7
/usr/local/cuda-12.0/targets/x86_64-linux/lib/libcudnn_cnn_train.so.8.9.7
/usr/local/cuda-12.0/targets/x86_64-linux/lib/libcudnn_ops_infer.so.8.9.7
/usr/local/cuda-12.0/targets/x86_64-linux/lib/libcudnn_ops_train.so.8.9.7
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 43 bits physical, 48 bits virtual
CPU(s): 256
On-line CPU(s) list: 0-255
Thread(s) per core: 2
Core(s) per socket: 64
Socket(s): 2
NUMA node(s): 2
Vendor ID: AuthenticAMD
CPU family: 23
Model: 49
Model name: AMD EPYC 7742 64-Core Processor
Stepping: 0
Frequency boost: enabled
CPU MHz: 1493.806
CPU max MHz: 2250.0000
CPU min MHz: 1500.0000
BogoMIPS: 4491.81
Virtualization: AMD-V
L1d cache: 4 MiB
L1i cache: 4 MiB
L2 cache: 64 MiB
L3 cache: 512 MiB
NUMA node0 CPU(s): 0-63,128-191
NUMA node1 CPU(s): 64-127,192-255
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Retbleed: Vulnerable
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Retpolines; IBPB conditional; IBRS_FW; STIBP conditional; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif umip rdpid overflow_recov succor smca sme sev sev_es

Versions of relevant libraries:
[pip3] numpy==1.26.3
[pip3] nvidia-nccl-cu12==2.20.5
[pip3] torch==2.3.0+cu121
[pip3] torchaudio==2.3.0+cu121
[pip3] torchvision==0.18.0+cu121
[pip3] triton==2.3.0
[pip3] vllm_nccl_cu12==2.18.1.0.4.0
[conda] numpy 1.26.3 pypi_0 pypi
[conda] nvidia-nccl-cu12 2.20.5 pypi_0 pypi
[conda] torch 2.3.0+cu121 pypi_0 pypi
[conda] torchaudio 2.3.0+cu121 pypi_0 pypi
[conda] torchvision 0.18.0+cu121 pypi_0 pypi
[conda] triton 2.3.0 pypi_0 pypi
[conda] vllm-nccl-cu12 2.18.1.0.4.0 pypi_0 pypi
ROCM Version: Could not collect
Neuron SDK Version: N/A
vLLM Version: 0.4.2
vLLM Build Flags:
CUDA Archs: Not Set; ROCm: Disabled; Neuron: Disabled
GPU Topology:
      GPU0  GPU1  GPU2  GPU3  GPU4  GPU5  GPU6  GPU7  NIC0  NIC1  CPU Affinity    NUMA Affinity  GPU NUMA ID
GPU0  X     NV12  NV12  NV12  NV12  NV12  NV12  NV12  PXB   SYS   0-63,128-191    0              N/A
GPU1  NV12  X     NV12  NV12  NV12  NV12  NV12  NV12  PXB   SYS   0-63,128-191    0              N/A
GPU2  NV12  NV12  X     NV12  NV12  NV12  NV12  NV12  SYS   SYS   0-63,128-191    0              N/A
GPU3  NV12  NV12  NV12  X     NV12  NV12  NV12  NV12  SYS   SYS   0-63,128-191    0              N/A
GPU4  NV12  NV12  NV12  NV12  X     NV12  NV12  NV12  SYS   SYS   64-127,192-255  1              N/A
GPU5  NV12  NV12  NV12  NV12  NV12  X     NV12  NV12  SYS   SYS   64-127,192-255  1              N/A
GPU6  NV12  NV12  NV12  NV12  NV12  NV12  X     NV12  SYS   SYS   64-127,192-255  1              N/A
GPU7  NV12  NV12  NV12  NV12  NV12  NV12  NV12  X     SYS   SYS   64-127,192-255  1              N/A
NIC0  PXB   PXB   SYS   SYS   SYS   SYS   SYS   SYS   X     SYS
NIC1  SYS   SYS   SYS   SYS   SYS   SYS   SYS   SYS   SYS   X

Legend:

X    = Self
SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
PXB  = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
PIX  = Connection traversing at most a single PCIe bridge
NV#  = Connection traversing a bonded set of # NVLinks

NIC Legend:

NIC0: mlx4_0
NIC1: mlx5_0

conwayz commented 1 month ago

Interesting, it can find CUDA but not the CUDA library root. Can you share the values of these environment variables: PATH, CPATH, LD_LIBRARY_PATH, and LIBRARY_PATH?
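For reference, one quick way to dump exactly those variables (unset ones simply won't appear in the output):

env | grep -E '^(PATH|CPATH|LD_LIBRARY_PATH|LIBRARY_PATH)='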