abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

Wheel build fails building version 0.2.86 #1662

Closed: SleepyYui closed this issue 3 months ago

SleepyYui commented 3 months ago

Expected Behavior

The wheel for version 0.2.86 builds properly, just like 0.2.85 and all earlier versions.

Current Behavior

The wheel fails to build.

Environment and Context

Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         39 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  8
  On-line CPU(s) list:   0-7
Vendor ID:               GenuineIntel
  Model name:            Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
    CPU family:          6
    Model:               142
    Thread(s) per core:  2
    Core(s) per socket:  4
    Socket(s):           1
    Stepping:            10
    BogoMIPS:            3984.00
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid pni pclmulqdq 
                         vmx ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi ept vpid ept_ad fsgsbase bmi
                         1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves md_clear flush_l1d arch_capabilities
Virtualization features:
  Virtualization:        VT-x
  Hypervisor vendor:     Microsoft
  Virtualization type:   full
Caches (sum of all):
  L1d:                   128 KiB (4 instances)
  L1i:                   128 KiB (4 instances)
  L2:                    1 MiB (4 instances)
  L3:                    8 MiB (1 instance)
Vulnerabilities:
  Gather data sampling:  Unknown: Dependent on hypervisor status
  Itlb multihit:         KVM: Mitigation: VMX disabled
  L1tf:                  Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
  Mds:                   Mitigation; Clear CPU buffers; SMT Host state unknown
  Meltdown:              Mitigation; PTI
  Mmio stale data:       Mitigation; Clear CPU buffers; SMT Host state unknown
  Retbleed:              Mitigation; IBRS
  Spec rstack overflow:  Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl and seccomp
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; IBRS, IBPB conditional, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
  Srbds:                 Unknown: Dependent on hypervisor status
  Tsx async abort:       Not affected

5.15.153.1-microsoft-standard-WSL2 #1 SMP Fri Mar 29 23:14:13 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

$ python3 --version
Python 3.10.12
$ make --version
GNU Make 4.3
Built for x86_64-pc-linux-gnu
Copyright (C) 1988-2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
$ g++ --version
g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Failure Information (for bugs)


Steps to Reproduce

Run:

GGML_CCACHE=OFF CMAKE_ARGS="-DLLAMA_CUBLAS=off" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir
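
(For reference, the same reproduction with the failing version pinned and pip's full build log kept; the extra flags are standard pip options rather than something taken from the report above:)

GGML_CCACHE=OFF CMAKE_ARGS="-DLLAMA_CUBLAS=off" FORCE_CMAKE=1 pip install llama-cpp-python==0.2.86 --no-cache-dir --verbose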

Failure Logs

Collecting llama-cpp-python
  Downloading llama_cpp_python-0.2.86.tar.gz (49.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 49.3/49.3 MB 23.6 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting typing-extensions>=4.5.0
  Downloading typing_extensions-4.12.2-py3-none-any.whl (37 kB)
Collecting diskcache>=5.6.1
  Downloading diskcache-5.6.3-py3-none-any.whl (45 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.5/45.5 KB 258.3 MB/s eta 0:00:00
Collecting numpy>=1.20.0
  Downloading numpy-2.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19.5/19.5 MB 23.6 MB/s eta 0:00:00
Collecting jinja2>=2.11.3
  Downloading jinja2-3.1.4-py3-none-any.whl (133 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 133.3/133.3 KB 48.4 MB/s eta 0:00:00
Collecting MarkupSafe>=2.0
  Downloading MarkupSafe-2.1.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB)
Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [56 lines of output]
      *** scikit-build-core 0.10.1 using CMake 3.22.1 (wheel)
      *** Configuring CMake...
      loading initial cache file /tmp/tmphykwn8um/build/CMakeInit.txt
      -- The C compiler identification is GNU 11.4.0
      -- The CXX compiler identification is GNU 11.4.0
      -- Detecting C compiler ABI info
      -- Detecting C compiler ABI info - done
      -- Check for working C compiler: /usr/bin/x86_64-linux-gnu-gcc - skipped
      -- Detecting C compile features
      -- Detecting C compile features - done
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: /usr/bin/x86_64-linux-gnu-g++ - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      -- Found Git: /usr/bin/git (found version "2.34.1")
      -- Looking for pthread.h
      -- Looking for pthread.h - found
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
      -- Found Threads: TRUE
      -- Found OpenMP_C: -fopenmp (found version "4.5")
      -- Found OpenMP_CXX: -fopenmp (found version "4.5")
      -- Found OpenMP: TRUE (found version "4.5")
      -- OpenMP found
      -- Using llamafile
      -- ccache found, compilation results will be cached. Disable with GGML_CCACHE=OFF.
      -- CMAKE_SYSTEM_PROCESSOR: x86_64
      -- x86 detected
      INSTALL TARGETS - target llama has PUBLIC_HEADER files but no PUBLIC_HEADER DESTINATION.
      INSTALL TARGETS - target llama has PUBLIC_HEADER files but no PUBLIC_HEADER DESTINATION.
      INSTALL TARGETS - target ggml has PUBLIC_HEADER files but no PUBLIC_HEADER DESTINATION.
      INSTALL TARGETS - target ggml has PUBLIC_HEADER files but no PUBLIC_HEADER DESTINATION.
      -- Configuring done
      -- Generating done
      -- Build files have been written to: /tmp/tmphykwn8um/build
      *** Building project with Ninja...
      [1/30] ccache /usr/bin/x86_64-linux-gnu-gcc -DGGML_BUILD -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_USE_LLAMAFILE -DGGML_USE_OPENMP -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_EXPORTS -I/tmp/pip-install-sviqz_ec/llama-cpp
-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/../include -I/tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/. -O3 -DNDEBUG -fPIC -Wshadow -Wstrict-prototy
pes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Werror=implicit-function-declaration -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wdouble-promotion -march=native -fopenmp -std=gnu11 -MD -MT vendor/ll
ama.cpp/ggml/src/CMakeFiles/ggml.dir/ggml-alloc.c.o -MF vendor/llama.cpp/ggml/src/CMakeFiles/ggml.dir/ggml-alloc.c.o.d -o vendor/llama.cpp/ggml/src/CMakeFiles/ggml.dir/ggml-alloc.c.o -c /tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/ggml-alloc.c
      [2/30] ccache /usr/bin/x86_64-linux-gnu-gcc -DGGML_BUILD -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_USE_LLAMAFILE -DGGML_USE_OPENMP -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_EXPORTS -I/tmp/pip-install-sviqz_ec/llama-cpp
-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/../include -I/tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/. -O3 -DNDEBUG -fPIC -Wshadow -Wstrict-prototy
pes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Werror=implicit-function-declaration -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wdouble-promotion -march=native -fopenmp -std=gnu11 -MD -MT vendor/ll
ama.cpp/ggml/src/CMakeFiles/ggml.dir/ggml-aarch64.c.o -MF vendor/llama.cpp/ggml/src/CMakeFiles/ggml.dir/ggml-aarch64.c.o.d -o vendor/llama.cpp/ggml/src/CMakeFiles/ggml.dir/ggml-aarch64.c.o -c /tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/ggml-aarch64.c
      [3/30] ccache /usr/bin/x86_64-linux-gnu-gcc -DGGML_BUILD -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_USE_LLAMAFILE -DGGML_USE_OPENMP -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_EXPORTS -I/tmp/pip-install-sviqz_ec/llama-cpp
-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/../include -I/tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/. -O3 -DNDEBUG -fPIC -Wshadow -Wstrict-prototy
pes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Werror=implicit-function-declaration -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wdouble-promotion -march=native -fopenmp -std=gnu11 -MD -MT vendor/ll
ama.cpp/ggml/src/CMakeFiles/ggml.dir/ggml-backend.c.o -MF vendor/llama.cpp/ggml/src/CMakeFiles/ggml.dir/ggml-backend.c.o.d -o vendor/llama.cpp/ggml/src/CMakeFiles/ggml.dir/ggml-backend.c.o -c /tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/ggml-backend.c
      [4/30] cd /tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp && /usr/bin/cmake -DMSVC= -DCMAKE_C_COMPILER_VERSION=11.4.0 -DCMAKE_C_COMPILER_ID=GNU -DCMAKE_VS_PLATFORM_NAME= -DCMAKE_C_COMPILER=/usr/bin/x86_64-linux-gnu-gcc -P /tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/common/cmake/build-info-gen-cpp.cmake
      FAILED: /tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/common/build-info.cpp
      cd /tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp && /usr/bin/cmake -DMSVC= -DCMAKE_C_COMPILER_VERSION=11.4.0 -DCMAKE_C_COMPILER_ID=GNU -DCMAKE_VS_PLATFORM_NAME= -DCMAKE_C_COMPILER=/usr/bin/x86_64-linux-gnu-gcc -P /tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/common/cmake/build-info-gen-cpp.cmake
      CMake Error: Error processing file: /tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/common/cmake/build-info-gen-cpp.cmake
      [5/30] ccache /usr/bin/x86_64-linux-gnu-g++ -DLLAMA_BUILD -DLLAMA_SHARED -Dllama_EXPORTS -I/tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/src/. -I/tmp/pip-install-sviqz_ec/llam
a-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/src/../include -I/tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/../include -O3 -DNDEBUG -fPIC -MD -MT vendor/l
lama.cpp/src/CMakeFiles/llama.dir/llama-grammar.cpp.o -MF vendor/llama.cpp/src/CMakeFiles/llama.dir/llama-grammar.cpp.o.d -o vendor/llama.cpp/src/CMakeFiles/llama.dir/llama-grammar.cpp.o -c /tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/src/llama-grammar.cpp
      [6/30] ccache /usr/bin/x86_64-linux-gnu-g++ -DLLAMA_BUILD -DLLAMA_SHARED -Dllama_EXPORTS -I/tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/src/. -I/tmp/pip-install-sviqz_ec/llam
a-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/src/../include -I/tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/../include -O3 -DNDEBUG -fPIC -MD -MT vendor/l
lama.cpp/src/CMakeFiles/llama.dir/llama-sampling.cpp.o -MF vendor/llama.cpp/src/CMakeFiles/llama.dir/llama-sampling.cpp.o.d -o vendor/llama.cpp/src/CMakeFiles/llama.dir/llama-sampling.cpp.o -c /tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/src/llama-sampling.cpp
      [7/30] ccache /usr/bin/x86_64-linux-gnu-g++ -DGGML_BUILD -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_USE_LLAMAFILE -DGGML_USE_OPENMP -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_EXPORTS -I/tmp/pip-install-sviqz_ec/llama-cpp
-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/../include -I/tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/. -O3 -DNDEBUG -fPIC -Wmissing-declarations -W
missing-noreturn -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-array-bounds -Wno-format-truncation -Wextra-semi -march=native -fopenmp -std=gnu++11 -MD -MT vendor/llama.cpp/ggml/src/CMakeFiles/ggml.dir/llamafile
/sgemm.cpp.o -MF vendor/llama.cpp/ggml/src/CMakeFiles/ggml.dir/llamafile/sgemm.cpp.o.d -o vendor/llama.cpp/ggml/src/CMakeFiles/ggml.dir/llamafile/sgemm.cpp.o -c /tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/llamafile/sgemm.cpp
      [8/30] ccache /usr/bin/x86_64-linux-gnu-g++ -DLLAMA_BUILD -DLLAMA_SHARED -Dllama_EXPORTS -I/tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/src/. -I/tmp/pip-install-sviqz_ec/llam
a-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/src/../include -I/tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/../include -O3 -DNDEBUG -fPIC -MD -MT vendor/l
lama.cpp/src/CMakeFiles/llama.dir/llama-vocab.cpp.o -MF vendor/llama.cpp/src/CMakeFiles/llama.dir/llama-vocab.cpp.o.d -o vendor/llama.cpp/src/CMakeFiles/llama.dir/llama-vocab.cpp.o -c /tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/src/llama-vocab.cpp
      [9/30] ccache /usr/bin/x86_64-linux-gnu-gcc -DGGML_BUILD -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_USE_LLAMAFILE -DGGML_USE_OPENMP -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_EXPORTS -I/tmp/pip-install-sviqz_ec/llama-cpp
-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/../include -I/tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/. -O3 -DNDEBUG -fPIC -Wshadow -Wstrict-prototy
pes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Werror=implicit-function-declaration -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wdouble-promotion -march=native -fopenmp -std=gnu11 -MD -MT vendor/ll
ama.cpp/ggml/src/CMakeFiles/ggml.dir/ggml-quants.c.o -MF vendor/llama.cpp/ggml/src/CMakeFiles/ggml.dir/ggml-quants.c.o.d -o vendor/llama.cpp/ggml/src/CMakeFiles/ggml.dir/ggml-quants.c.o -c /tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/ggml-quants.c
      [10/30] ccache /usr/bin/x86_64-linux-gnu-gcc -DGGML_BUILD -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_USE_LLAMAFILE -DGGML_USE_OPENMP -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_EXPORTS -I/tmp/pip-install-sviqz_ec/llama-cp
p-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/../include -I/tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/. -O3 -DNDEBUG -fPIC -Wshadow -Wstrict-protot
ypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Werror=implicit-function-declaration -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wdouble-promotion -march=native -fopenmp -std=gnu11 -MD -MT vendor/l
lama.cpp/ggml/src/CMakeFiles/ggml.dir/ggml.c.o -MF vendor/llama.cpp/ggml/src/CMakeFiles/ggml.dir/ggml.c.o.d -o vendor/llama.cpp/ggml/src/CMakeFiles/ggml.dir/ggml.c.o -c /tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/ggml.c
      [11/30] ccache /usr/bin/x86_64-linux-gnu-g++ -DLLAMA_BUILD -DLLAMA_SHARED -Dllama_EXPORTS -I/tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/src/. -I/tmp/pip-install-sviqz_ec/lla
ma-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/src/../include -I/tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/../include -O3 -DNDEBUG -fPIC -MD -MT vendor/
llama.cpp/src/CMakeFiles/llama.dir/unicode.cpp.o -MF vendor/llama.cpp/src/CMakeFiles/llama.dir/unicode.cpp.o.d -o vendor/llama.cpp/src/CMakeFiles/llama.dir/unicode.cpp.o -c /tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/src/unicode.cpp
      [12/30] ccache /usr/bin/x86_64-linux-gnu-g++ -DLLAMA_BUILD -DLLAMA_SHARED -Dllama_EXPORTS -I/tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/src/. -I/tmp/pip-install-sviqz_ec/lla
ma-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/src/../include -I/tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/../include -O3 -DNDEBUG -fPIC -MD -MT vendor/
llama.cpp/src/CMakeFiles/llama.dir/unicode-data.cpp.o -MF vendor/llama.cpp/src/CMakeFiles/llama.dir/unicode-data.cpp.o.d -o vendor/llama.cpp/src/CMakeFiles/llama.dir/unicode-data.cpp.o -c /tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/src/unicode-data.cpp
      [13/30] ccache /usr/bin/x86_64-linux-gnu-g++ -DLLAMA_BUILD -DLLAMA_SHARED -Dllama_EXPORTS -I/tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/src/. -I/tmp/pip-install-sviqz_ec/lla
ma-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/src/../include -I/tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/ggml/src/../include -O3 -DNDEBUG -fPIC -MD -MT vendor/
llama.cpp/src/CMakeFiles/llama.dir/llama.cpp.o -MF vendor/llama.cpp/src/CMakeFiles/llama.dir/llama.cpp.o.d -o vendor/llama.cpp/src/CMakeFiles/llama.dir/llama.cpp.o -c /tmp/pip-install-sviqz_ec/llama-cpp-python_53056a2aa3cc45fabd3f645b0c2c9593/vendor/llama.cpp/src/llama.cpp
      ninja: build stopped: subcommand failed.

      *** CMake build failed
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
peytoncai commented 3 months ago

same

HansvanHespen commented 3 months ago

I also have the problem that the build fails:

519.5   [93/93] : && /usr/bin/x86_64-linux-gnu-g++ -march=znver2 -std=gnu++20 -O3 -DNDEBUG  vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/llava.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/clip.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llama-llava-cli.dir/llava-cli.cpp.o -o vendor/llama.cpp/examples/llava/llama-llava-cli  -Wl,-rpath,/tmp/tmp56qdxi6v/build/vendor/llama.cpp/src:/tmp/tmp56qdxi6v/build/vendor/llama.cpp/ggml/src:  vendor/llama.cpp/common/libcommon.a  vendor/llama.cpp/src/libllama.so  vendor/llama.cpp/ggml/src/libggml.so && :
519.5   FAILED: vendor/llama.cpp/examples/llava/llama-llava-cli
519.5   : && /usr/bin/x86_64-linux-gnu-g++ -march=znver2 -std=gnu++20 -O3 -DNDEBUG  vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/llava.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/clip.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llama-llava-cli.dir/llava-cli.cpp.o -o vendor/llama.cpp/examples/llava/llama-llava-cli  -Wl,-rpath,/tmp/tmp56qdxi6v/build/vendor/llama.cpp/src:/tmp/tmp56qdxi6v/build/vendor/llama.cpp/ggml/src:  vendor/llama.cpp/common/libcommon.a  vendor/llama.cpp/src/libllama.so  vendor/llama.cpp/ggml/src/libggml.so && :
519.5   /usr/bin/ld: warning: libcuda.so.1, needed by vendor/llama.cpp/ggml/src/libggml.so, not found (try using -rpath or -rpath-link)
519.5   /usr/bin/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemCreate'
519.5   /usr/bin/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemAddressReserve'
519.5   /usr/bin/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemUnmap'
519.5   /usr/bin/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemSetAccess'
519.5   /usr/bin/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuDeviceGet'
519.5   /usr/bin/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemAddressFree'
519.5   /usr/bin/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuGetErrorString'
519.5   /usr/bin/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuDeviceGetAttribute'
519.5   /usr/bin/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemMap'
519.5   /usr/bin/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemRelease'
519.5   /usr/bin/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemGetAllocationGranularity'
519.5   collect2: error: ld returned 1 exit status
519.5   ninja: build stopped: subcommand failed.
519.6
519.6   *** CMake build failed
519.6   error: subprocess-exited-with-error

Working with the pre-built wheels is a bit unclear to me. I noticed there are three .so files in a wheel, i.e. three pre-compiled libraries, and I would like to know more about how they are compiled. I have to cross-compile: I build on my machine (i7-8550U) but target another machine in the cloud (AMD Epyc 7 series), which is why I compile it myself with fairly detailed options. It was difficult to configure CMake from the outside to make that work.

This is how I compile, from my Dockerfile:

ENV FORCE_CMAKE=1

ENV CMAKE_ARGS="-DCMAKE_CUDA_COMPILER='/usr/local/cuda-12.2/bin/nvcc' -DCMAKE_CUDA_ARCHITECTURES='80' -DGGML_CUDA=ON -DGGML_NATIVE=OFF -DGGML_ALL_WARNINGS=OFF -DGGML_ACCELERATE=OFF"

ENV CXXFLAGS="-march=znver2 -std=gnu++20"

RUN pip install llama-cpp-python==0.2.86 --no-cache-dir --force-reinstall --upgrade --verbose
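
(For anyone reproducing this outside Docker, the ENV lines above correspond roughly to a single shell invocation like the following; the paths and flags are copied from the Dockerfile, not independently verified:)

FORCE_CMAKE=1 \
CMAKE_ARGS="-DCMAKE_CUDA_COMPILER=/usr/local/cuda-12.2/bin/nvcc -DCMAKE_CUDA_ARCHITECTURES=80 -DGGML_CUDA=ON -DGGML_NATIVE=OFF -DGGML_ALL_WARNINGS=OFF -DGGML_ACCELERATE=OFF" \
CXXFLAGS="-march=znver2 -std=gnu++20" \
pip install llama-cpp-python==0.2.86 --no-cache-dir --force-reinstall --upgrade --verbose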

Version 0.2.33 compiles perfectly, but not version 0.2.86, which I need because 0.2.33 is not thread-safe and gives errors on my A100 40GB GPU.

If I compile without options, it builds natively for my i7 processor, and the resulting binary dies with a signal (SIGINT) on the Linux AMD Epyc machine in the cloud (devel-ubuntu 22.04).

Are the pre-compiled wheels built without any processor-specific features at all?

Anyway, compiling locally should produce the most optimal binary, and it is a pity that it fails at the very last step (93/93). The linker is looking for a libcuda.so.1 that it cannot find to satisfy libggml.so. Does anyone have an idea?

Kind regards,

Hans

abetlen commented 3 months ago

@SleepyYui thank you for reporting. It seems the issue is that the vendor/llama.cpp/common/cmake folder is not being included in the PyPI source distribution.

Not sure at all why that would be; I'll look into it.
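
(A quick way to confirm that diagnosis, assuming the sdist follows the usual llama_cpp_python-<version>.tar.gz naming; this check is illustrative and not from the original thread:)

pip download llama-cpp-python==0.2.86 --no-binary :all: --no-deps -d /tmp/sdist
tar -tzf /tmp/sdist/llama_cpp_python-0.2.86.tar.gz | grep 'common/cmake' || echo "common/cmake is missing from the sdist"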

abetlen commented 3 months ago

@SleepyYui should be fixed now in 0.2.87

I've included the entire vendor/llama.cpp subdirectory in the source distribution as a workaround.
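
(For anyone following along on 0.2.86, the corresponding upgrade would simply be:)

pip install --upgrade --no-cache-dir llama-cpp-python==0.2.87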

HansvanHespen commented 3 months ago

Dear @abetlen, as soon as the regular 0.2.87 release is out, I will check as well.

gabrielmbmb commented 3 months ago

For some reason uv pip install llama-cpp-python==0.2.87 is failing too (it works for uv pip install llama-cpp-python==0.2.85):

$ uv pip install -U llama-cpp-python==0.2.87
error: Failed to download and build `llama-cpp-python==0.2.87`
  Caused by: Failed to extract archive
  Caused by: failed to unpack `/Users/gabrielmbmb/Library/Caches/uv/built-wheels-v3/.tmpcTjUBT/llama_cpp_python-0.2.87/vendor/llama.cpp/spm-headers/ggml-alloc.h`
  Caused by: File exists (os error 17) when symlinking ../ggml/include/ggml-alloc.h to /Users/gabrielmbmb/Library/Caches/uv/built-wheels-v3/.tmpcTjUBT/llama_cpp_python-0.2.87/vendor/llama.cpp/spm-headers/ggml-alloc.h

I tried cleaning the cache (uv clean) but that doesn't help. It works with pip install llama-cpp-python==0.2.87, so I think this could be a bug on uv's side.
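
(A small aside on the commands involved, hedged because uv's CLI has been changing quickly: in recent uv releases the cache command is spelled "uv cache clean", and the fallback that works here is plain pip, as noted above:)

uv cache clean
pip install llama-cpp-python==0.2.87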

abetlen commented 3 months ago

@gabrielmbmb thank you for reporting this, but I think that's a separate issue specific to uv (I've opened a new issue for it, #1670); a workaround is just to pip install, which seems to handle the symlinks correctly.

I'm going to close this for now as it seems resolved.

HansvanHespen commented 3 months ago

Dear @abetlen and @gabrielmbmb,

Unfortunately, I still get the same error on 0.2.87:

505.2   [93/93] : && /usr/bin/x86_64-linux-gnu-g++ -march=znver4 -std=gnu++23 -O3 -DNDEBUG  vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/llava.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/clip.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llama-llava-cli.dir/llava-cli.cpp.o -o vendor/llama.cpp/examples/llava/llama-llava-cli  -Wl,-rpath,/tmp/tmpaeyk5c4_/build/vendor/llama.cpp/src:/tmp/tmpaeyk5c4_/build/vendor/llama.cpp/ggml/src:  vendor/llama.cpp/common/libcommon.a  vendor/llama.cpp/src/libllama.so  vendor/llama.cpp/ggml/src/libggml.so && :
505.2   FAILED: vendor/llama.cpp/examples/llava/llama-llava-cli
505.2   : && /usr/bin/x86_64-linux-gnu-g++ -march=znver4 -std=gnu++23 -O3 -DNDEBUG  vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/llava.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/clip.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llama-llava-cli.dir/llava-cli.cpp.o -o vendor/llama.cpp/examples/llava/llama-llava-cli  -Wl,-rpath,/tmp/tmpaeyk5c4_/build/vendor/llama.cpp/src:/tmp/tmpaeyk5c4_/build/vendor/llama.cpp/ggml/src:  vendor/llama.cpp/common/libcommon.a  vendor/llama.cpp/src/libllama.so  vendor/llama.cpp/ggml/src/libggml.so && :
505.2   /usr/bin/ld: warning: libcuda.so.1, needed by vendor/llama.cpp/ggml/src/libggml.so, not found (try using -rpath or -rpath-link)
505.2   /usr/bin/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemCreate'
505.2   /usr/bin/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemAddressReserve'
505.2   /usr/bin/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemUnmap'
505.2   /usr/bin/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemSetAccess'
505.2   /usr/bin/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuDeviceGet'
505.2   /usr/bin/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemAddressFree'
505.2   /usr/bin/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuGetErrorString'
505.2   /usr/bin/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuDeviceGetAttribute'
505.2   /usr/bin/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemMap'
505.2   /usr/bin/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemRelease'
505.2   /usr/bin/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemGetAllocationGranularity'
505.2   collect2: error: ld returned 1 exit status
505.2   ninja: build stopped: subcommand failed. 

It is still looking for the libcuda.so.1 file. And why does it need to build anything in the examples directory at all?

Not only did I wait for version 0.2.87; I also installed the newest devel-ubuntu 24.04 image, upgraded Python 3.10 to 3.12, upgraded cuDNN 8 to 9.3, and moved from CUDA 12.2 to 12.5.1. Nothing helps.
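
(One generic workaround for link errors against libcuda.so.1, when building inside a container that has the CUDA toolkit but no GPU driver, is to point the linker at the toolkit's stub library. The following Dockerfile sketch assumes the toolkit lives under /usr/local/cuda and ships the usual lib64/stubs directory; it is not something confirmed in this thread:)

ENV CUDA_STUBS=/usr/local/cuda/lib64/stubs
# the toolkit ships only libcuda.so in the stubs directory; give ld the soname it asks for
RUN ln -sf ${CUDA_STUBS}/libcuda.so ${CUDA_STUBS}/libcuda.so.1
# LIBRARY_PATH covers gcc's library search, LD_LIBRARY_PATH lets ld resolve libggml.so's
# libcuda.so.1 dependency at link time; at runtime the real driver mounted by the NVIDIA
# container runtime must be used instead, so drop the stub directory from the final image
ENV LIBRARY_PATH=${CUDA_STUBS}:${LIBRARY_PATH}
ENV LD_LIBRARY_PATH=${CUDA_STUBS}:${LD_LIBRARY_PATH}

The undefined cuMem*/cuDevice* symbols in the log are CUDA driver-API entry points (libcuda), not runtime-API ones (libcudart), which is why having only the toolkit's runtime libraries in the build container is not enough at link time.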