vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Installation]: vllm CPU mode build failed #8710

Closed · abcfy2 closed this issue 1 month ago

abcfy2 commented 1 month ago

Your current environment

Collecting environment information...
WARNING 09-22 20:25:14 _custom_ops.py:18] Failed to import from vllm._C with ModuleNotFoundError("No module named 'vllm._C'")
INFO 09-22 20:25:14 importing.py:10] Triton not installed; certain GPU-related functions will not be available.
PyTorch version: 2.4.0+cpu
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: EndeavourOS Linux (x86_64)
GCC version: (GCC) 14.2.1 20240910
Clang version: 18.1.8
CMake version: version 3.30.3
Libc version: glibc-2.40

Python version: 3.12.6 (main, Sep  8 2024, 13:18:56) [GCC 14.2.1 20240805] (64-bit runtime)
Python platform: Linux-6.10.10-zen1-1-zen-x86_64-with-glibc2.40
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture:                         x86_64
CPU op-mode(s):                       32-bit, 64-bit
Address sizes:                        43 bits physical, 48 bits virtual
Byte Order:                           Little Endian
CPU(s):                               16
On-line CPU(s) list:                  0-15
Vendor ID:                            AuthenticAMD
Model name:                           AMD Ryzen 7 3700X 8-Core Processor
CPU family:                           23
Model:                                113
Thread(s) per core:                   2
Core(s) per socket:                   8
Socket(s):                            1
Stepping:                             0
Frequency boost:                      disabled
CPU(s) scaling MHz:                   72%
CPU max MHz:                          4979.4429
CPU min MHz:                          2200.0000
BogoMIPS:                             8099.81
Flags:                                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip rdpid overflow_recov succor smca sev sev_es
Virtualization:                       AMD-V
L1d cache:                            256 KiB (8 instances)
L1i cache:                            256 KiB (8 instances)
L2 cache:                             4 MiB (8 instances)
L3 cache:                             32 MiB (2 instances)
NUMA node(s):                         1
NUMA node0 CPU(s):                    0-15
Vulnerability Gather data sampling:   Not affected
Vulnerability Itlb multihit:          Not affected
Vulnerability L1tf:                   Not affected
Vulnerability Mds:                    Not affected
Vulnerability Meltdown:               Not affected
Vulnerability Mmio stale data:        Not affected
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed:               Mitigation; untrained return thunk; SMT enabled with STIBP protection
Vulnerability Spec rstack overflow:   Mitigation; Safe RET
Vulnerability Spec store bypass:      Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:             Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:             Mitigation; Retpolines; IBPB conditional; STIBP always-on; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
Vulnerability Srbds:                  Not affected
Vulnerability Tsx async abort:        Not affected

Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] pyzmq==26.2.0
[pip3] torch==2.4.0+cpu
[pip3] torchvision==0.19.0+cpu
[pip3] transformers==4.44.2
[conda] Could not collect
ROCM Version: Could not collect
Neuron SDK Version: N/A
vLLM Version: 0.6.1.post2@0e40ac9b7b5d953dfe38933bc7d2fb0a6c8da53c
vLLM Build Flags:
CUDA Archs: Not Set; ROCm: Disabled; Neuron: Disabled
GPU Topology:
============================ ROCm System Management Interface ============================
================================ Weight between two GPUs =================================
       GPU0         
GPU0   0            

================================= Hops between two GPUs ==================================
       GPU0         
GPU0   0            

=============================== Link Type between two GPUs ===============================
       GPU0         
GPU0   0            

======================================= Numa Nodes =======================================
GPU[0]      : (Topology) Numa Node: 0
GPU[0]      : (Topology) Numa Affinity: -1
================================== End of ROCm SMI Log ===================================

How you are installing vllm

Follow the CPU installation instructions: https://docs.vllm.ai/en/latest/getting_started/cpu-installation.html

VLLM_TARGET_DEVICE=cpu python setup.py install

Error:

...
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=8192,device_name=NVIDIA_H100_80GB_HBM3,dtype=fp8_w8a8.json -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe/configs
running build_ext
-- The CXX compiler identification is GNU 14.2.1
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Build type: RelWithDebInfo
-- Target device: cpu
-- Found Python: /home/fengyu/projects/vllm/venv/bin/python (found version "3.12.6") found components: Interpreter Development.Module Development.SABIModule
-- Found python matching: /home/fengyu/projects/vllm/venv/bin/python.
CMake Warning at venv/lib/python3.12/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
  static library kineto_LIBRARY-NOTFOUND not found.
Call Stack (most recent call first):
  venv/lib/python3.12/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:120 (append_torchlib_if_found)
  CMakeLists.txt:84 (find_package)

-- Found Torch: /home/fengyu/projects/vllm/venv/lib/python3.12/site-packages/torch/lib/libtorch.so
-- Enabling core extension.
CMake Warning at cmake/cpu_extension.cmake:73 (message):
  vLLM CPU backend using AVX2 ISA
Call Stack (most recent call first):
  CMakeLists.txt:110 (include)

-- CPU extension compile flags: -fopenmp;-DVLLM_CPU_EXTENSION;-mavx2
-- Enabling C extension.
CMake Error at cmake/cpu_extension.cmake:123 (add_dependencies):
  Cannot add target-level dependencies to non-existent target "default".

  The add_dependencies works for top-level logical targets created by the
  add_executable, add_library, or add_custom_target commands.  If you want to
  add file-level dependencies see the DEPENDS option of the add_custom_target
  and add_custom_command commands.
Call Stack (most recent call first):
  CMakeLists.txt:110 (include)

-- Configuring incomplete, errors occurred!
Traceback (most recent call last):
  File "/home/fengyu/projects/vllm/setup.py", line 520, in <module>
    setup(
  File "/home/fengyu/projects/vllm/venv/lib/python3.12/site-packages/setuptools/__init__.py", line 117, in setup
    return distutils.core.setup(**attrs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fengyu/projects/vllm/venv/lib/python3.12/site-packages/setuptools/_distutils/core.py", line 183, in setup
    return run_commands(dist)
           ^^^^^^^^^^^^^^^^^^
  File "/home/fengyu/projects/vllm/venv/lib/python3.12/site-packages/setuptools/_distutils/core.py", line 199, in run_commands
    dist.run_commands()
  File "/home/fengyu/projects/vllm/venv/lib/python3.12/site-packages/setuptools/_distutils/dist.py", line 954, in run_commands
    self.run_command(cmd)
  File "/home/fengyu/projects/vllm/venv/lib/python3.12/site-packages/setuptools/dist.py", line 950, in run_command
    super().run_command(command)
  File "/home/fengyu/projects/vllm/venv/lib/python3.12/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
    cmd_obj.run()
  File "/home/fengyu/projects/vllm/venv/lib/python3.12/site-packages/setuptools/_distutils/command/build.py", line 135, in run
    self.run_command(cmd_name)
  File "/home/fengyu/projects/vllm/venv/lib/python3.12/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
    self.distribution.run_command(command)
  File "/home/fengyu/projects/vllm/venv/lib/python3.12/site-packages/setuptools/dist.py", line 950, in run_command
    super().run_command(command)
  File "/home/fengyu/projects/vllm/venv/lib/python3.12/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
    cmd_obj.run()
  File "/home/fengyu/projects/vllm/setup.py", line 263, in run
    super().run()
  File "/home/fengyu/projects/vllm/venv/lib/python3.12/site-packages/setuptools/command/build_ext.py", line 98, in run
    _build_ext.run(self)
  File "/home/fengyu/projects/vllm/venv/lib/python3.12/site-packages/setuptools/_distutils/command/build_ext.py", line 359, in run
    self.build_extensions()
  File "/home/fengyu/projects/vllm/setup.py", line 225, in build_extensions
    self.configure(ext)
  File "/home/fengyu/projects/vllm/setup.py", line 205, in configure
    subprocess.check_call(
  File "/usr/lib/python3.12/subprocess.py", line 413, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '/home/fengyu/projects/vllm', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', '-DVLLM_TARGET_DEVICE=cpu', '-DCMAKE_C_COMPILER_LAUNCHER=ccache', '-DCMAKE_CXX_COMPILER_LAUNCHER=ccache', '-DCMAKE_CUDA_COMPILER_LAUNCHER=ccache', '-DCMAKE_HIP_COMPILER_LAUNCHER=ccache', '-DVLLM_PYTHON_EXECUTABLE=/home/fengyu/projects/vllm/venv/bin/python', '-DVLLM_PYTHON_PATH=/home/fengyu/projects/vllm:/usr/lib/python312.zip:/usr/lib/python3.12:/usr/lib/python3.12/lib-dynload:/home/fengyu/projects/vllm/venv/lib/python3.12/site-packages:/home/fengyu/projects/vllm/venv/lib/python3.12/site-packages/setuptools/_vendor']' returned non-zero exit status 1.

The Docker build fails with the same error.


charlesxsh commented 1 month ago

In cmake/cpu_extension.cmake, define the missing target before it is referenced:

add_custom_target(default)              # <--- add this line
message(STATUS "Enabling C extension.")
add_dependencies(default _C)
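For context, the error occurs because add_dependencies() can only attach dependencies to a target that already exists at that point in the configure run. A minimal standalone sketch (a hypothetical project, not vLLM's actual build files) that reproduces the failure mode and the fix:

```cmake
# Hypothetical minimal CMakeLists.txt illustrating the failure mode.
cmake_minimum_required(VERSION 3.20)
project(repro LANGUAGES NONE)

# Without this line, CMake stops at configure time with:
#   "Cannot add target-level dependencies to non-existent target 'default'."
add_custom_target(default)

# Stand-in for vLLM's _C extension target.
add_custom_target(_C COMMAND ${CMAKE_COMMAND} -E echo "building _C")

# Succeeds only because the "default" target was created above.
add_dependencies(default _C)
```

Configuring this file (e.g. `cmake -S . -B build`) should complete cleanly with the `add_custom_target(default)` line present and fail at the `add_dependencies` call without it.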
ProExpertProg commented 1 month ago

Thanks for reporting this issue!