Open: inst32i opened this issue 1 month ago
same issue here ...
same issue here too
Same here
Same here as well.
Same here too
Current Behavior
I ran the following: CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --verbose
It fails with the error: ERROR: Failed building wheel for llama-cpp-python
Environment and Context
- Physical hardware (lscpu output):

  Architecture:        x86_64
  CPU op-mode(s):      32-bit, 64-bit
  Address sizes:       46 bits physical, 48 bits virtual
  Byte Order:          Little Endian
  CPU(s):              16
  On-line CPU(s) list: 0-15
  Vendor ID:           GenuineIntel
  Model name:          Intel Xeon Processor (Skylake, IBRS)
  CPU family:          6
  Model:               85
  Thread(s) per core:  1
  Core(s) per socket:  1
  Socket(s):           16
  Stepping:            4
  BogoMIPS:            4389.68
  Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 arat pku ospke avx512_vnni md_clear
  Virtualization features:
    Hypervisor vendor:   KVM
    Virtualization type: full
  Caches (sum of all):
    L1d: 512 KiB (16 instances)
    L1i: 512 KiB (16 instances)
    L2:  64 MiB (16 instances)
    L3:  256 MiB (16 instances)
  NUMA:
    NUMA node(s):      1
    NUMA node0 CPU(s): 0-15
  Vulnerabilities:
    Itlb multihit:     KVM: Mitigation: VMX unsupported
    L1tf:              Mitigation; PTE Inversion
    Mds:               Mitigation; Clear CPU buffers; SMT Host state unknown
    Meltdown:          Mitigation; PTI
    Mmio stale data:   Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
    Retbleed:          Mitigation; IBRS
    Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
    Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
    Spectre v2:        Mitigation; IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS Not affected
    Srbds:             Not affected
    Tsx async abort:   Mitigation; Clear CPU buffers; SMT Host state unknown
- Operating System: Ubuntu 22.04
- SDK version:
$ python3 --version   # Python 3.11
$ make --version      # GNU Make 4.3
$ g++ --version       # g++ 11.4.0
Failure Information (for bugs)
...
FAILED: vendor/llama.cpp/examples/llava/llama-llava-cli
: && /usr/bin/g++ -pthread -B /mnt/x_env/compiler_compat -O3 -DNDEBUG vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/llava.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/clip.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llama-llava-cli.dir/llava-cli.cpp.o -o vendor/llama.cpp/examples/llava/llama-llava-cli -Wl,-rpath,/tmp/tmp6bws6ysg/build/vendor/llama.cpp/src:/tmp/tmp6bws6ysg/build/vendor/llama.cpp/ggml/src: vendor/llama.cpp/common/libcommon.a vendor/llama.cpp/src/libllama.so vendor/llama.cpp/ggml/src/libggml.so && :
/mnt/x_env/compiler_compat/ld: warning: libcuda.so.1, needed by vendor/llama.cpp/ggml/src/libggml.so, not found (try using -rpath or -rpath-link)
/mnt/x_env/compiler_compat/ld: warning: libgomp.so.1, needed by vendor/llama.cpp/ggml/src/libggml.so, not found (try using -rpath or -rpath-link)
/mnt/x_env/compiler_compat/ld: warning: libdl.so.2, needed by /usr/local/cuda-12.4/lib64/libcudart.so.12, not found (try using -rpath or -rpath-link)
/mnt/x_env/compiler_compat/ld: warning: libpthread.so.0, needed by /usr/local/cuda-12.4/lib64/libcudart.so.12, not found (try using -rpath or -rpath-link)
/mnt/x_env/compiler_compat/ld: warning: librt.so.1, needed by /usr/local/cuda-12.4/lib64/libcudart.so.12, not found (try using -rpath or -rpath-link)
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemCreate'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `GOMP_barrier@GOMP_1.0'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemAddressReserve'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemUnmap'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `GOMP_parallel@GOMP_4.0'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemSetAccess'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuDeviceGet'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `omp_get_thread_num@OMP_1.0'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemAddressFree'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuGetErrorString'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `GOMP_single_start@GOMP_1.0'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuDeviceGetAttribute'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemMap'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemRelease'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `omp_get_num_threads@OMP_1.0'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemGetAllocationGranularity'
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.
*** CMake build failed
error: subprocess-exited-with-error

× Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> See above for output.
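The warnings show that the linker in compiler_compat cannot find the CUDA driver and OpenMP runtime libraries at all. Before trying any fix, it can help to confirm where those libraries actually live; a generic check, not specific to this project:

```bash
# Are the libraries registered with the dynamic linker?
ldconfig -p | grep -E 'libcuda\.so|libgomp\.so'

# If nothing shows up, search the filesystem directly
find /usr -name 'libcuda.so*' -o -name 'libgomp.so*' 2>/dev/null
```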
Steps to Reproduce
- conda activate
- CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --verbose
NVIDIA Driver Version: 550.54.14; CUDA Toolkit Version: V12.4.99
I solved this problem. It happens when the CUDA driver version differs from the CUDA toolkit version. Check the CUDA version reported by nvidia-smi and the toolkit version with conda list | grep cuda-toolkit. In my case the versions were 12.2 and 11.8.
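A quick way to compare the two (assuming the toolkit's bin directory is on your PATH for the nvcc check):

```bash
# Driver-side CUDA version (shown in the header of the output)
nvidia-smi

# Toolkit version the build will actually use
nvcc --version

# Toolkit installed in the active conda environment, if any
conda list | grep cuda-toolkit
```

The two should match, or at least the toolkit must not be newer than the version the driver supports.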
Same here.
Installation worked fine with CMAKE_ARGS="-DLLAMA_CUBLAS=on" for llama-cpp-python <= 2.79.0. I now get the same error as the OP for llama-cpp-python >= 2.80.0, whether I use CMAKE_ARGS="-DLLAMA_CUBLAS=on" or CMAKE_ARGS="-DGGML_CUDA=on".
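For context: as far as I can tell, upstream llama.cpp renamed its CUDA CMake option from LLAMA_CUBLAS to GGML_CUDA, and llama-cpp-python picked up the rename around 2.80.0, so the old flag no longer enables CUDA there:

```bash
# Old option name (llama-cpp-python <= 2.79.0)
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python --verbose

# New option name (llama-cpp-python >= 2.80.0)
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --verbose
```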
same issue here too, in WSL2
same issue here too, WSL2 on Windows 10.
same issue here
I found a workaround for this issue: in a source checkout of llama-cpp-python, replace pyproject.toml with the following content (the original scikit-build-core backend is commented out and replaced with setuptools):

# [build-system]
# requires = ["scikit-build-core[pyproject]>=0.9.2"]
# build-backend = "scikit_build_core.build"
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"
[project]
name = "llama_cpp_python"
dynamic = ["version"]
description = "Python bindings for the llama.cpp library"
readme = "README.md"
license = { text = "MIT" }
authors = [
{ name = "Andrei Betlen", email = "abetlen@gmail.com" },
]
dependencies = [
"typing-extensions>=4.5.0",
"numpy>=1.20.0",
"diskcache>=5.6.1",
"jinja2>=2.11.3",
]
requires-python = ">=3.8"
classifiers = [
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.8",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
]
[project.optional-dependencies]
server = [
"uvicorn>=0.22.0",
"fastapi>=0.100.0",
"pydantic-settings>=2.0.1",
"sse-starlette>=1.6.1",
"starlette-context>=0.3.6,<0.4",
"PyYAML>=5.1",
]
test = [
"pytest>=7.4.0",
"httpx>=0.24.1",
"scipy>=1.10",
]
dev = [
"black>=23.3.0",
"twine>=4.0.2",
"mkdocs>=1.4.3",
"mkdocstrings[python]>=0.22.0",
"mkdocs-material>=9.1.18",
"pytest>=7.4.0",
"httpx>=0.24.1",
]
all = [
"llama_cpp_python[server,test,dev]",
]
# [tool.scikit-build]
# wheel.packages = ["llama_cpp"]
# cmake.verbose = true
# cmake.minimum-version = "3.21"
# minimum-version = "0.5.1"
# sdist.include = [".git", "vendor/llama.cpp/*"]
[tool.setuptools.packages.find]
include = ["llama_cpp"]
[tool.setuptools.package-data]
"llama_cpp" = ["lib/*"]
# With the setuptools backend, scikit-build-core's regex version provider no
# longer runs; let setuptools read the version from llama_cpp/__init__.py instead.
[tool.setuptools.dynamic]
version = { attr = "llama_cpp.__version__" }
[project.urls]
Homepage = "https://github.com/abetlen/llama-cpp-python"
Issues = "https://github.com/abetlen/llama-cpp-python/issues"
Documentation = "https://llama-cpp-python.readthedocs.io/en/latest/"
Changelog = "https://llama-cpp-python.readthedocs.io/en/latest/changelog/"
[tool.pytest.ini_options]
testpaths = "tests"
Then install from the checkout:

pip install . --verbose
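For completeness, the whole sequence looks roughly like this. Note that the setuptools backend above does not run CMake, so the vendored llama.cpp libraries must be built beforehand and copied into llama_cpp/lib/ where the package-data glob picks them up (the CMake invocation is a sketch; the library paths match the link line in the log above):

```bash
git clone --recurse-submodules https://github.com/abetlen/llama-cpp-python.git
cd llama-cpp-python
# ... apply the pyproject.toml changes shown above ...

# Build the vendored llama.cpp with CUDA as shared libraries
cmake -B build vendor/llama.cpp -DGGML_CUDA=on -DBUILD_SHARED_LIBS=ON
cmake --build build --config Release

# Stage the shared libraries for packaging
mkdir -p llama_cpp/lib
cp build/src/libllama.so build/ggml/src/libggml*.so llama_cpp/lib/

pip install . --verbose
```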
Adding the directory that contains libcuda.so to the LD_LIBRARY_PATH environment variable allows the examples to link, so the build can succeed.
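A minimal sketch (the directory below is an example; use whatever the find turns up, e.g. /usr/lib/wsl/lib on WSL2 or the toolkit's stubs/ directory on a build host without a driver):

```bash
# Locate the driver library
find / -name 'libcuda.so*' 2>/dev/null

# Prepend its directory and retry the build
export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --verbose
```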