abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

Can't install llama-cpp-python - libpython3.11.a file not found during building wheel (pyproject.toml) #714

Closed. primemp closed this issue 1 year ago.

primemp commented 1 year ago

Prerequisites

Please answer the following questions for yourself before submitting an issue.

Expected Behavior

Trying to install llama-cpp-python as described in this document: https://github.com/KillianLucas/open-interpreter/blob/main/docs/MACOS.md.

Current Behavior

Getting the following error while running:

    CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install -U llama-cpp-python --no-cache-dir

 Building wheel for llama-cpp-python (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [44 lines of output]
      *** scikit-build-core 0.5.0 using CMake 3.27.4 (wheel)
      *** Configuring CMake...
      2023-09-14 11:43:30,426 - scikit_build_core - WARNING - libdir/ldlibrary: /Users/arjen/miniforge3/envs/oi/lib/libpython3.11.a is not a real file!
      2023-09-14 11:43:30,426 - scikit_build_core - WARNING - Can't find a Python library, got libdir=/Users/arjen/miniforge3/envs/oi/lib, ldlibrary=libpython3.11.a, multiarch=darwin, masd=None

Environment and Context

Python 3.11.4 in a conda environment from miniforge3 for arm64 support, as stated in the readme file:

"Note: If you are using Apple Silicon (M1) Mac, make sure you have installed a version of Python that supports arm64 architecture. For example:

wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
bash Miniforge3-MacOSX-arm64.sh"
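
A quick way to confirm that the interpreter in the environment is actually arm64 (a one-line check using only the standard library):

python3 -c "import platform; print(platform.machine())"   # 'arm64' on a native Apple Silicon build, 'x86_64' under Rosetta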

I searched online for solutions and installed packages like gcc, the Xcode C++ tools, etc., but nothing resolved the issue. The warning states it expects libpython3.11.a in that location, but no such file exists there; the libpython3.11 library does, however. If I search online for that file, I only find .deb packages for Linux containing it, and I can't unpack those: even with the ar command I can't get that file out of the package.
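
For reference, a sketch of how to inspect what the conda environment actually ships (paths assume the miniforge env described above):

ls -l "$CONDA_PREFIX/lib" | grep libpython                                   # the static .a may genuinely be absent
python3 -c "import sysconfig; print(sysconfig.get_config_var('LDLIBRARY'))"  # the library name CPython itself reports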

Darwin Laptop-van-Arjen.local 22.1.0 Darwin Kernel Version 22.1.0: Sun Oct 9 20:15:09 PDT 2022; root:xnu-8792.41.9~2/RELEASE_ARM64_T6000 arm64

$ python3 --version = 3.11.4
$ make --version = GNU Make 3.81
This program built for i386-apple-darwin11.3.0
$ g++ --version = Apple clang version 14.0.0 (clang-1400.0.29.202)
Target: arm64-apple-darwin22.1.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

Error output during install:

Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [44 lines of output]
      *** scikit-build-core 0.5.0 using CMake 3.27.4 (wheel)
      *** Configuring CMake...
      2023-09-14 11:43:30,426 - scikit_build_core - WARNING - libdir/ldlibrary: /Users/arjen/miniforge3/envs/oi/lib/libpython3.11.a is not a real file!
      2023-09-14 11:43:30,426 - scikit_build_core - WARNING - Can't find a Python library, got libdir=/Users/arjen/miniforge3/envs/oi/lib, ldlibrary=libpython3.11.a, multiarch=darwin, masd=None
      loading initial cache file /var/folders/r4/9mrbh04j1_gc4h5z0m3f52d80000gn/T/tmpnhw2qb6p/build/CMakeInit.txt
      -- The C compiler identification is AppleClang 14.0.0.14000029
      -- The CXX compiler identification is AppleClang 14.0.0.14000029
      -- Detecting C compiler ABI info
      -- Detecting C compiler ABI info - done
      -- Check for working C compiler: /Library/Developer/CommandLineTools/usr/bin/cc - skipped
      -- Detecting C compile features
      -- Detecting C compile features - done
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: /Library/Developer/CommandLineTools/usr/bin/c++ - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      -- Found Git: /usr/bin/git (found version "2.37.1 (Apple Git-137.1)")
      fatal: not a git repository (or any of the parent directories): .git
      fatal: not a git repository (or any of the parent directories): .git
      CMake Warning at vendor/llama.cpp/CMakeLists.txt:125 (message):
        Git repository not found; to enable automatic generation of build info,
        make sure Git is installed and the project is a Git repository.

      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
      -- Check if compiler accepts -pthread
      -- Check if compiler accepts -pthread - no
      -- Looking for pthread_create in pthreads
      -- Looking for pthread_create in pthreads - not found
      -- Looking for pthread_create in pthread
      -- Looking for pthread_create in pthread - not found
      CMake Error at /opt/homebrew/Cellar/cmake/3.27.4/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
        Could NOT find Threads (missing: Threads_FOUND)
      Call Stack (most recent call first):
        /opt/homebrew/Cellar/cmake/3.27.4/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:600 (_FPHSA_FAILURE_MESSAGE)
        /opt/homebrew/Cellar/cmake/3.27.4/share/cmake/Modules/FindThreads.cmake:226 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
        vendor/llama.cpp/CMakeLists.txt:137 (find_package)

      -- Configuring incomplete, errors occurred!

      *** CMake configuration failed
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
primemp commented 1 year ago

I noticed that the make command was still the old version. I finally managed to update make to version 4.4.1, then tried installing llama-cpp-python again using CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install -U llama-cpp-python --no-cache-dir, but no luck. Headache.
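
(Note: the wheel build is configured by scikit-build-core through CMake and, as the logs further down show, compiled with Ninja, so the GNU Make version is unlikely to be the problem. A quick sanity check of the tools that are actually used:)

cmake --version   # scikit-build-core configures the build with CMake
ninja --version   # and compiles with Ninja, if installed; pip can also vendor its own copy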

remixer-dec commented 1 year ago

I'm having the same issue on macOS: with versions 0.2+ the installation fails.

icecoldt369 commented 1 year ago

Me too on macOS. Following this thread.

primemp commented 1 year ago

I noticed the following: when I input "python --version" it outputs "Python 3.11.4", but when I input "python3 --version" it outputs "Python 3.10.10". Perhaps useful information for whoever this may concern.
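
To see exactly which interpreters those two commands resolve to (a sketch):

which -a python python3                         # every match on PATH, in order
python -c "import sys; print(sys.executable)"   # the binary actually being run
python3 -c "import sys; print(sys.executable)"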

abetlen commented 1 year ago

Hey @primemp, looking at the logs, the libpython3.11.a issue is just a warning; the actual error that is causing the build to fail is from cmake here:

      -- Looking for pthread_create in pthread - not found
      CMake Error at /opt/homebrew/Cellar/cmake/3.27.4/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
        Could NOT find Threads (missing: Threads_FOUND)
      Call Stack (most recent call first):
        /opt/homebrew/Cellar/cmake/3.27.4/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:600 (_FPHSA_FAILURE_MESSAGE)
        /opt/homebrew/Cellar/cmake/3.27.4/share/cmake/Modules/FindThreads.cmake:226 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
        vendor/llama.cpp/CMakeLists.txt:137 (find_package)

Can you try installing llama.cpp with cmake and posting that log? Should help me narrow down the issue.
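
In the meantime, a quick way to reproduce CMake's failing pthread check outside of pip is to compile a tiny pthread program with the same compiler (a sketch, nothing llama.cpp-specific; the compiler path is taken from your log):

cat <<'EOF' > /tmp/pthread_test.c
#include <pthread.h>
#include <stdio.h>

/* trivial thread function: proves pthread_create/pthread_join compile, link and run */
static void *hello(void *arg) { puts("pthread ok"); return arg; }

int main(void) {
    pthread_t t;
    if (pthread_create(&t, NULL, hello, NULL) != 0) return 1;
    pthread_join(t, NULL);
    return 0;
}
EOF
/Library/Developer/CommandLineTools/usr/bin/cc /tmp/pthread_test.c -o /tmp/pthread_test && /tmp/pthread_test

If that fails to compile, the Command Line Tools installation itself is probably broken; reinstalling it (xcode-select --install) is a common fix.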

@icecoldt369 @remixer-dec are you getting the same Could not find Threads error?

icecoldt369 commented 1 year ago

@abetlen mine is a bit more extensive, sorry if it's a rookie mistake :( :

Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [109 lines of output]
  *** scikit-build-core 0.5.0 using CMake 3.27.4 (wheel)
  *** Configuring CMake...
  2023-09-14 14:42:22,443 - scikit_build_core - WARNING - libdir/ldlibrary: /Library/Frameworks/Python.framework/Versions/3.11/lib/Python.framework/Versions/3.11/Python is not a real file!
  loading initial cache file /var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/tmp1hvh10jk/build/CMakeInit.txt
  -- The C compiler identification is AppleClang 11.0.3.11030032
  -- The CXX compiler identification is AppleClang 11.0.3.11030032
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: /Library/Developer/CommandLineTools/usr/bin/cc - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: /Library/Developer/CommandLineTools/usr/bin/c++ - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Found Git: /usr/local/bin/git (found version "2.42.0")
  fatal: not a git repository (or any of the parent directories): .git
  fatal: not a git repository (or any of the parent directories): .git
  CMake Warning at vendor/llama.cpp/CMakeLists.txt:125 (message):
    Git repository not found; to enable automatic generation of build info,
    make sure Git is installed and the project is a Git repository.

  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
  -- Found Threads: TRUE
  -- Accelerate framework found
  -- Metal framework found
  -- CMAKE_SYSTEM_PROCESSOR: x86_64
  -- x86 detected
  CMake Warning (dev) at vendor/llama.cpp/CMakeLists.txt:676 (install):
    Target llama has RESOURCE files but no RESOURCE DESTINATION.
  This warning is for project developers.  Use -Wno-dev to suppress it.

  -- Configuring done (2.7s)
  -- Generating done (0.0s)
  CMake Warning:
    Manually-specified variables were not used by the project:

      LLAMA_OPENBLAS

  -- Build files have been written to: /var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/tmp1hvh10jk/build
  *** Building project with Ninja...
  Change Dir: '/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/tmp1hvh10jk/build'

  Run Build Command(s): /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-build-env-oxxl37ct/normal/lib/python3.11/site-packages/ninja/data/bin/ninja -v
  [1/11] /Library/Developer/CommandLineTools/usr/bin/cc -DGGML_USE_ACCELERATE -DGGML_USE_K_QUANTS -DGGML_USE_METAL -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE=600 -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/. -F/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/System/Library/Frameworks -march=native -mtune=native -O3 -DNDEBUG -std=gnu11 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Wno-unused-function -MD -MT vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o -MF vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o.d -o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o -c /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/ggml-alloc.c
  [2/11] /Library/Developer/CommandLineTools/usr/bin/c++ -DGGML_USE_ACCELERATE -DGGML_USE_K_QUANTS -DGGML_USE_METAL -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE=600 -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/common/. -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/. -march=native -mtune=native -O3 -DNDEBUG -std=gnu++11 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -MD -MT vendor/llama.cpp/common/CMakeFiles/common.dir/console.cpp.o -MF vendor/llama.cpp/common/CMakeFiles/common.dir/console.cpp.o.d -o vendor/llama.cpp/common/CMakeFiles/common.dir/console.cpp.o -c /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/common/console.cpp
  [3/11] /Library/Developer/CommandLineTools/usr/bin/cc -DGGML_USE_ACCELERATE -DGGML_USE_K_QUANTS -DGGML_USE_METAL -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE=600 -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/. -F/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/System/Library/Frameworks -march=native -mtune=native -O3 -DNDEBUG -std=gnu11 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Wno-unused-function -MD -MT vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-metal.m.o -MF vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-metal.m.o.d -o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-metal.m.o -c /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/ggml-metal.m
  FAILED: vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-metal.m.o
  /Library/Developer/CommandLineTools/usr/bin/cc -DGGML_USE_ACCELERATE -DGGML_USE_K_QUANTS -DGGML_USE_METAL -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE=600 -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/. -F/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/System/Library/Frameworks -march=native -mtune=native -O3 -DNDEBUG -std=gnu11 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Wno-unused-function -MD -MT vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-metal.m.o -MF vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-metal.m.o.d -o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-metal.m.o -c /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/ggml-metal.m
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/ggml-metal.m:613:5: error: use of undeclared identifier 'MTLComputePassDescriptor'
      MTLComputePassDescriptor * edesc = MTLComputePassDescriptor.computePassDescriptor;
      ^
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/ggml-metal.m:613:32: error: use of undeclared identifier 'edesc'
      MTLComputePassDescriptor * edesc = MTLComputePassDescriptor.computePassDescriptor;
                                 ^
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/ggml-metal.m:613:40: error: use of undeclared identifier 'MTLComputePassDescriptor'
      MTLComputePassDescriptor * edesc = MTLComputePassDescriptor.computePassDescriptor;
                                         ^
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/ggml-metal.m:618:5: error: use of undeclared identifier 'edesc'
      edesc.dispatchType = has_concur ? MTLDispatchTypeConcurrent : MTLDispatchTypeSerial;
      ^
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/ggml-metal.m:631:61: warning: instance method '-computeCommandEncoderWithDescriptor:' not found (return type defaults to 'id') [-Wobjc-method-access]
          ctx->command_encoders[i] = [ctx->command_buffers[i] computeCommandEncoderWithDescriptor: edesc];
                                                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/ggml-metal.m:631:98: error: use of undeclared identifier 'edesc'
          ctx->command_encoders[i] = [ctx->command_buffers[i] computeCommandEncoderWithDescriptor: edesc];
                                                                                                   ^
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/ggml-metal.m:873:61: error: use of undeclared identifier 'MTLGPUFamilyApple7'
                                  [ctx->device supportsFamily:MTLGPUFamilyApple7] &&
                                                              ^
  1 warning and 6 errors generated.
  [4/11] /Library/Developer/CommandLineTools/usr/bin/cc -DGGML_USE_ACCELERATE -DGGML_USE_K_QUANTS -DGGML_USE_METAL -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE=600 -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/. -F/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/System/Library/Frameworks -march=native -mtune=native -O3 -DNDEBUG -std=gnu11 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Wno-unused-function -MD -MT vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o -MF vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o.d -o vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o -c /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/k_quants.c
  [5/11] /Library/Developer/CommandLineTools/usr/bin/c++ -DGGML_USE_ACCELERATE -DGGML_USE_K_QUANTS -DGGML_USE_METAL -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE=600 -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/common/. -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/. -march=native -mtune=native -O3 -DNDEBUG -std=gnu++11 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -MD -MT vendor/llama.cpp/common/CMakeFiles/common.dir/grammar-parser.cpp.o -MF vendor/llama.cpp/common/CMakeFiles/common.dir/grammar-parser.cpp.o.d -o vendor/llama.cpp/common/CMakeFiles/common.dir/grammar-parser.cpp.o -c /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/common/grammar-parser.cpp
  [6/11] /Library/Developer/CommandLineTools/usr/bin/c++ -DGGML_USE_ACCELERATE -DGGML_USE_K_QUANTS -DGGML_USE_METAL -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE=600 -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/common/. -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/. -march=native -mtune=native -O3 -DNDEBUG -std=gnu++11 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -MD -MT vendor/llama.cpp/common/CMakeFiles/common.dir/common.cpp.o -MF vendor/llama.cpp/common/CMakeFiles/common.dir/common.cpp.o.d -o vendor/llama.cpp/common/CMakeFiles/common.dir/common.cpp.o -c /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/common/common.cpp
  [7/11] /Library/Developer/CommandLineTools/usr/bin/c++ -DGGML_USE_ACCELERATE -DGGML_USE_K_QUANTS -DGGML_USE_METAL -DLLAMA_BUILD -DLLAMA_SHARED -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE=600 -Dllama_EXPORTS -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/. -F/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/System/Library/Frameworks -march=native -mtune=native -O3 -DNDEBUG -std=gnu++11 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -MD -MT vendor/llama.cpp/CMakeFiles/llama.dir/llama.cpp.o -MF vendor/llama.cpp/CMakeFiles/llama.dir/llama.cpp.o.d -o vendor/llama.cpp/CMakeFiles/llama.dir/llama.cpp.o -c /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/llama.cpp
  [8/11] /Library/Developer/CommandLineTools/usr/bin/cc -DGGML_USE_ACCELERATE -DGGML_USE_K_QUANTS -DGGML_USE_METAL -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE=600 -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/. -F/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/System/Library/Frameworks -march=native -mtune=native -O3 -DNDEBUG -std=gnu11 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Wno-unused-function -MD -MT vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o -MF vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o.d -o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o -c /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/ggml.c
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/ggml.c:2391:5: warning: implicit conversion increases floating-point precision: 'float' to 'ggml_float' (aka 'double') [-Wdouble-promotion]
      GGML_F16_VEC_REDUCE(sumf, sum);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/ggml.c:2023:37: note: expanded from macro 'GGML_F16_VEC_REDUCE'
  #define GGML_F16_VEC_REDUCE         GGML_F32Cx8_REDUCE
                                      ^
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/ggml.c:2013:33: note: expanded from macro 'GGML_F32Cx8_REDUCE'
  #define GGML_F32Cx8_REDUCE      GGML_F32x8_REDUCE
                                  ^
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/ggml.c:1959:11: note: expanded from macro 'GGML_F32x8_REDUCE'
      res = _mm_cvtss_f32(_mm_hadd_ps(t1, t1));                     \
          ~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/ggml.c:3657:9: warning: implicit conversion increases floating-point precision: 'float' to 'ggml_float' (aka 'double') [-Wdouble-promotion]
          GGML_F16_VEC_REDUCE(sumf[k], sum[k]);
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/ggml.c:2023:37: note: expanded from macro 'GGML_F16_VEC_REDUCE'
  #define GGML_F16_VEC_REDUCE         GGML_F32Cx8_REDUCE
                                      ^
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/ggml.c:2013:33: note: expanded from macro 'GGML_F32Cx8_REDUCE'
  #define GGML_F32Cx8_REDUCE      GGML_F32x8_REDUCE
                                  ^
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-n8jcpet6/llama-cpp-python_897b918c773e4c8085255b5bd8e1dbd2/vendor/llama.cpp/ggml.c:1959:11: note: expanded from macro 'GGML_F32x8_REDUCE'
      res = _mm_cvtss_f32(_mm_hadd_ps(t1, t1));                     \
          ~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  2 warnings generated.
  ninja: build stopped: subcommand failed.

  *** CMake build failed
  [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects

abetlen commented 1 year ago

@icecoldt369 no worries, can you try installing llama.cpp with cmake and metal enabled?
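
For Metal that would be something along these lines (the same flag the pip command passes via CMAKE_ARGS; a sketch):

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build
cd build
cmake -DLLAMA_METAL=on ..
cmake --build . --config Release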

primemp commented 1 year ago

@abetlen thank you for your reply. Can you give me the exact commands I have to input? I'm a complete noob when it comes to coding and the terminal. Cheers!

icecoldt369 commented 1 year ago

@abetlen that doesn't seem to resolve anything. I made sure the prerequisites and the dev environment were installed beforehand:

× Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [103 lines of output]
  *** scikit-build-core 0.5.0 using CMake 3.27.4 (wheel)
  *** Configuring CMake...
  2023-09-14 16:56:07,388 - scikit_build_core - WARNING - libdir/ldlibrary: /Library/Frameworks/Python.framework/Versions/3.11/lib/Python.framework/Versions/3.11/Python is not a real file!
  loading initial cache file /var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/tmpspa24b39/build/CMakeInit.txt
  -- The C compiler identification is AppleClang 11.0.3.11030032
  -- The CXX compiler identification is AppleClang 11.0.3.11030032
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: /Library/Developer/CommandLineTools/usr/bin/cc - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: /Library/Developer/CommandLineTools/usr/bin/c++ - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Found Git: /usr/local/bin/git (found version "2.42.0")
  fatal: not a git repository (or any of the parent directories): .git
  fatal: not a git repository (or any of the parent directories): .git
  CMake Warning at vendor/llama.cpp/CMakeLists.txt:125 (message):
    Git repository not found; to enable automatic generation of build info,
    make sure Git is installed and the project is a Git repository.

  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
  -- Found Threads: TRUE
  -- Accelerate framework found
  -- Metal framework found
  -- CMAKE_SYSTEM_PROCESSOR: x86_64
  -- x86 detected
  CMake Warning (dev) at vendor/llama.cpp/CMakeLists.txt:676 (install):
    Target llama has RESOURCE files but no RESOURCE DESTINATION.
  This warning is for project developers.  Use -Wno-dev to suppress it.

  -- Configuring done (0.8s)
  -- Generating done (0.0s)
  -- Build files have been written to: /var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/tmpspa24b39/build
  *** Building project with Ninja...
  Change Dir: '/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/tmpspa24b39/build'

  Run Build Command(s): /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-build-env-74q_89qy/normal/lib/python3.11/site-packages/ninja/data/bin/ninja -v
  [1/11] /Library/Developer/CommandLineTools/usr/bin/cc -DGGML_USE_ACCELERATE -DGGML_USE_K_QUANTS -DGGML_USE_METAL -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE=600 -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/. -F/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/System/Library/Frameworks -march=native -mtune=native -O3 -DNDEBUG -std=gnu11 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Wno-unused-function -MD -MT vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o -MF vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o.d -o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o -c /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/ggml-alloc.c
  [2/11] /Library/Developer/CommandLineTools/usr/bin/c++ -DGGML_USE_ACCELERATE -DGGML_USE_K_QUANTS -DGGML_USE_METAL -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE=600 -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/common/. -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/. -march=native -mtune=native -O3 -DNDEBUG -std=gnu++11 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -MD -MT vendor/llama.cpp/common/CMakeFiles/common.dir/console.cpp.o -MF vendor/llama.cpp/common/CMakeFiles/common.dir/console.cpp.o.d -o vendor/llama.cpp/common/CMakeFiles/common.dir/console.cpp.o -c /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/common/console.cpp
  [3/11] /Library/Developer/CommandLineTools/usr/bin/cc -DGGML_USE_ACCELERATE -DGGML_USE_K_QUANTS -DGGML_USE_METAL -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE=600 -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/. -F/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/System/Library/Frameworks -march=native -mtune=native -O3 -DNDEBUG -std=gnu11 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Wno-unused-function -MD -MT vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-metal.m.o -MF vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-metal.m.o.d -o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-metal.m.o -c /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/ggml-metal.m
  FAILED: vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-metal.m.o
  /Library/Developer/CommandLineTools/usr/bin/cc -DGGML_USE_ACCELERATE -DGGML_USE_K_QUANTS -DGGML_USE_METAL -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE=600 -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/. -F/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/System/Library/Frameworks -march=native -mtune=native -O3 -DNDEBUG -std=gnu11 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Wno-unused-function -MD -MT vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-metal.m.o -MF vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-metal.m.o.d -o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-metal.m.o -c /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/ggml-metal.m
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/ggml-metal.m:613:5: error: use of undeclared identifier 'MTLComputePassDescriptor'
      MTLComputePassDescriptor * edesc = MTLComputePassDescriptor.computePassDescriptor;
      ^
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/ggml-metal.m:613:32: error: use of undeclared identifier 'edesc'
      MTLComputePassDescriptor * edesc = MTLComputePassDescriptor.computePassDescriptor;
                                 ^
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/ggml-metal.m:613:40: error: use of undeclared identifier 'MTLComputePassDescriptor'
      MTLComputePassDescriptor * edesc = MTLComputePassDescriptor.computePassDescriptor;
                                         ^
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/ggml-metal.m:618:5: error: use of undeclared identifier 'edesc'
      edesc.dispatchType = has_concur ? MTLDispatchTypeConcurrent : MTLDispatchTypeSerial;
      ^
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/ggml-metal.m:631:61: warning: instance method '-computeCommandEncoderWithDescriptor:' not found (return type defaults to 'id') [-Wobjc-method-access]
          ctx->command_encoders[i] = [ctx->command_buffers[i] computeCommandEncoderWithDescriptor: edesc];
                                                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/ggml-metal.m:631:98: error: use of undeclared identifier 'edesc'
          ctx->command_encoders[i] = [ctx->command_buffers[i] computeCommandEncoderWithDescriptor: edesc];
                                                                                                   ^
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/ggml-metal.m:873:61: error: use of undeclared identifier 'MTLGPUFamilyApple7'
                                  [ctx->device supportsFamily:MTLGPUFamilyApple7] &&
                                                              ^
  1 warning and 6 errors generated.
  [4/11] /Library/Developer/CommandLineTools/usr/bin/cc -DGGML_USE_ACCELERATE -DGGML_USE_K_QUANTS -DGGML_USE_METAL -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE=600 -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/. -F/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/System/Library/Frameworks -march=native -mtune=native -O3 -DNDEBUG -std=gnu11 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Wno-unused-function -MD -MT vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o -MF vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o.d -o vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o -c /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/k_quants.c
  [5/11] /Library/Developer/CommandLineTools/usr/bin/c++ -DGGML_USE_ACCELERATE -DGGML_USE_K_QUANTS -DGGML_USE_METAL -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE=600 -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/common/. -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/. -march=native -mtune=native -O3 -DNDEBUG -std=gnu++11 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -MD -MT vendor/llama.cpp/common/CMakeFiles/common.dir/grammar-parser.cpp.o -MF vendor/llama.cpp/common/CMakeFiles/common.dir/grammar-parser.cpp.o.d -o vendor/llama.cpp/common/CMakeFiles/common.dir/grammar-parser.cpp.o -c /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/common/grammar-parser.cpp
  [6/11] /Library/Developer/CommandLineTools/usr/bin/c++ -DGGML_USE_ACCELERATE -DGGML_USE_K_QUANTS -DGGML_USE_METAL -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE=600 -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/common/. -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/. -march=native -mtune=native -O3 -DNDEBUG -std=gnu++11 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -MD -MT vendor/llama.cpp/common/CMakeFiles/common.dir/common.cpp.o -MF vendor/llama.cpp/common/CMakeFiles/common.dir/common.cpp.o.d -o vendor/llama.cpp/common/CMakeFiles/common.dir/common.cpp.o -c /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/common/common.cpp
  [7/11] /Library/Developer/CommandLineTools/usr/bin/c++ -DGGML_USE_ACCELERATE -DGGML_USE_K_QUANTS -DGGML_USE_METAL -DLLAMA_BUILD -DLLAMA_SHARED -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE=600 -Dllama_EXPORTS -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/. -F/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/System/Library/Frameworks -march=native -mtune=native -O3 -DNDEBUG -std=gnu++11 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -MD -MT vendor/llama.cpp/CMakeFiles/llama.dir/llama.cpp.o -MF vendor/llama.cpp/CMakeFiles/llama.dir/llama.cpp.o.d -o vendor/llama.cpp/CMakeFiles/llama.dir/llama.cpp.o -c /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/llama.cpp
  [8/11] /Library/Developer/CommandLineTools/usr/bin/cc -DGGML_USE_ACCELERATE -DGGML_USE_K_QUANTS -DGGML_USE_METAL -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE=600 -I/private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/. -F/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/System/Library/Frameworks -march=native -mtune=native -O3 -DNDEBUG -std=gnu11 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Wno-unused-function -MD -MT vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o -MF vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o.d -o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o -c /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/ggml.c
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/ggml.c:2391:5: warning: implicit conversion increases floating-point precision: 'float' to 'ggml_float' (aka 'double') [-Wdouble-promotion]
      GGML_F16_VEC_REDUCE(sumf, sum);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/ggml.c:2023:37: note: expanded from macro 'GGML_F16_VEC_REDUCE'
  #define GGML_F16_VEC_REDUCE         GGML_F32Cx8_REDUCE
                                      ^
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/ggml.c:2013:33: note: expanded from macro 'GGML_F32Cx8_REDUCE'
  #define GGML_F32Cx8_REDUCE      GGML_F32x8_REDUCE
                                  ^
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/ggml.c:1959:11: note: expanded from macro 'GGML_F32x8_REDUCE'
      res = _mm_cvtss_f32(_mm_hadd_ps(t1, t1));                     \
          ~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/ggml.c:3657:9: warning: implicit conversion increases floating-point precision: 'float' to 'ggml_float' (aka 'double') [-Wdouble-promotion]
          GGML_F16_VEC_REDUCE(sumf[k], sum[k]);
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/ggml.c:2023:37: note: expanded from macro 'GGML_F16_VEC_REDUCE'
  #define GGML_F16_VEC_REDUCE         GGML_F32Cx8_REDUCE
                                      ^
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/ggml.c:2013:33: note: expanded from macro 'GGML_F32Cx8_REDUCE'
  #define GGML_F32Cx8_REDUCE      GGML_F32x8_REDUCE
                                  ^
  /private/var/folders/4l/zwvr5hz51gvbhqkcpm0lhljc0000gn/T/pip-install-m27s3ma3/llama-cpp-python_f6f76d2d0d8746a8af06996220ab80a1/vendor/llama.cpp/ggml.c:1959:11: note: expanded from macro 'GGML_F32x8_REDUCE'
      res = _mm_cvtss_f32(_mm_hadd_ps(t1, t1));                     \
          ~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  2 warnings generated.
  ninja: build stopped: subcommand failed.

  *** CMake build failed
  [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects

abetlen commented 1 year ago

@primemp sure, can you run the following and let me know if it builds correctly:

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build
cd build
cmake ..
cmake --build . --config Release

@icecoldt369 so llama.cpp itself built without any of the "use of undeclared identifier" errors?
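
One thing that stands out in your log: ggml-metal.m is compiled against the MacOSX10.15 SDK (the -isysroot flag), and MTLComputePassDescriptor appears to have been introduced with the macOS 11 SDK, which would explain those errors. A sketch to check which toolchain and SDK are active:

xcode-select -p            # active developer directory
xcrun --show-sdk-version   # SDK version the compiler will use
cc --version               # AppleClang version (11.0.3 in your log)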

primemp commented 1 year ago

@abetlen Ok here we go, the complete output just to be sure I don't miss anything:

(oi) arjen@Laptop-van-Arjen build % cmake --build . --config Release
[  1%] Built target BUILD_INFO
[  2%] Building C object CMakeFiles/ggml.dir/ggml.c.o
[  4%] Building C object CMakeFiles/ggml.dir/ggml-alloc.c.o
[  5%] Building C object CMakeFiles/ggml.dir/ggml-metal.m.o
[  7%] Building C object CMakeFiles/ggml.dir/k_quants.c.o
[  7%] Built target ggml
[  8%] Linking C static library libggml_static.a
[  8%] Built target ggml_static
[  9%] Building CXX object CMakeFiles/llama.dir/llama.cpp.o
[ 11%] Linking CXX static library libllama.a
[ 11%] Built target llama
[ 12%] Building CXX object common/CMakeFiles/common.dir/common.cpp.o
[ 14%] Building CXX object common/CMakeFiles/common.dir/console.cpp.o
[ 15%] Building CXX object common/CMakeFiles/common.dir/grammar-parser.cpp.o
[ 15%] Built target common
[ 16%] Building CXX object tests/CMakeFiles/test-quantize-fns.dir/test-quantize-fns.cpp.o
[ 18%] Linking CXX executable ../bin/test-quantize-fns
[ 18%] Built target test-quantize-fns
[ 19%] Building CXX object tests/CMakeFiles/test-quantize-perf.dir/test-quantize-perf.cpp.o
[ 21%] Linking CXX executable ../bin/test-quantize-perf
[ 21%] Built target test-quantize-perf
[ 22%] Building CXX object tests/CMakeFiles/test-sampling.dir/test-sampling.cpp.o
[ 23%] Linking CXX executable ../bin/test-sampling
[ 23%] Built target test-sampling
[ 25%] Building CXX object tests/CMakeFiles/test-tokenizer-0-llama.dir/test-tokenizer-0-llama.cpp.o
[ 26%] Linking CXX executable ../bin/test-tokenizer-0-llama
[ 26%] Built target test-tokenizer-0-llama
[ 28%] Building CXX object tests/CMakeFiles/test-tokenizer-0-falcon.dir/test-tokenizer-0-falcon.cpp.o
[ 29%] Linking CXX executable ../bin/test-tokenizer-0-falcon
[ 29%] Built target test-tokenizer-0-falcon
[ 30%] Building CXX object tests/CMakeFiles/test-tokenizer-1-llama.dir/test-tokenizer-1-llama.cpp.o
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/tests/test-tokenizer-1-llama.cpp:91:43: warning: format specifies type 'unsigned long long' but the argument has type 'std::basic_string<char>::size_type' (aka 'unsigned long') [-Wformat]
                __func__, i, str.c_str(), str.length(), check.c_str(), check.length());
                                          ^~~~~~~~~~~~
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/tests/test-tokenizer-1-llama.cpp:91:72: warning: format specifies type 'unsigned long long' but the argument has type 'std::basic_string<char>::size_type' (aka 'unsigned long') [-Wformat]
                __func__, i, str.c_str(), str.length(), check.c_str(), check.length());
                                                                       ^~~~~~~~~~~~~~
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/tests/test-tokenizer-1-llama.cpp:104:50: warning: format specifies type 'unsigned long long' but the argument has type 'std::basic_string<char>::size_type' (aka 'unsigned long') [-Wformat]
                    __func__, cp, check.c_str(), check.length(), str.c_str(), str.length());
                                                 ^~~~~~~~~~~~~~
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/tests/test-tokenizer-1-llama.cpp:104:79: warning: format specifies type 'unsigned long long' but the argument has type 'std::basic_string<char>::size_type' (aka 'unsigned long') [-Wformat]
                    __func__, cp, check.c_str(), check.length(), str.c_str(), str.length());
                                                                              ^~~~~~~~~~~~
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/tests/test-tokenizer-1-llama.cpp:116:46: warning: format specifies type 'unsigned long long' but the argument has type 'std::basic_string<char>::size_type' (aka 'unsigned long') [-Wformat]
                __func__, cp, check.c_str(), check.length(), str.c_str(), str.length());
                                             ^~~~~~~~~~~~~~
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/tests/test-tokenizer-1-llama.cpp:116:75: warning: format specifies type 'unsigned long long' but the argument has type 'std::basic_string<char>::size_type' (aka 'unsigned long') [-Wformat]
                __func__, cp, check.c_str(), check.length(), str.c_str(), str.length());
                                                                          ^~~~~~~~~~~~
6 warnings generated.
[ 32%] Linking CXX executable ../bin/test-tokenizer-1-llama
[ 32%] Built target test-tokenizer-1-llama
[ 33%] Building CXX object tests/CMakeFiles/test-grammar-parser.dir/test-grammar-parser.cpp.o
[ 35%] Linking CXX executable ../bin/test-grammar-parser
[ 35%] Built target test-grammar-parser
[ 36%] Building CXX object tests/CMakeFiles/test-llama-grammar.dir/test-llama-grammar.cpp.o
[ 38%] Linking CXX executable ../bin/test-llama-grammar
[ 38%] Built target test-llama-grammar
[ 39%] Building CXX object tests/CMakeFiles/test-grad0.dir/test-grad0.cpp.o
[ 40%] Linking CXX executable ../bin/test-grad0
[ 40%] Built target test-grad0
[ 42%] Building C object tests/CMakeFiles/test-c.dir/test-c.c.o
[ 43%] Linking CXX executable ../bin/test-c
[ 43%] Built target test-c
[ 45%] Building CXX object examples/main/CMakeFiles/main.dir/main.cpp.o
[ 46%] Linking CXX executable ../../bin/main
[ 46%] Built target main
[ 47%] Building CXX object examples/quantize/CMakeFiles/quantize.dir/quantize.cpp.o
[ 49%] Linking CXX executable ../../bin/quantize
[ 49%] Built target quantize
[ 50%] Building CXX object examples/quantize-stats/CMakeFiles/quantize-stats.dir/quantize-stats.cpp.o
[ 52%] Linking CXX executable ../../bin/quantize-stats
[ 52%] Built target quantize-stats
[ 53%] Building CXX object examples/perplexity/CMakeFiles/perplexity.dir/perplexity.cpp.o
[ 54%] Linking CXX executable ../../bin/perplexity
[ 54%] Built target perplexity
[ 56%] Building CXX object examples/embedding/CMakeFiles/embedding.dir/embedding.cpp.o
[ 57%] Linking CXX executable ../../bin/embedding
[ 57%] Built target embedding
[ 59%] Building CXX object examples/save-load-state/CMakeFiles/save-load-state.dir/save-load-state.cpp.o
[ 60%] Linking CXX executable ../../bin/save-load-state
[ 60%] Built target save-load-state
[ 61%] Building CXX object examples/benchmark/CMakeFiles/benchmark.dir/benchmark-matmult.cpp.o
[ 63%] Linking CXX executable ../../bin/benchmark
[ 63%] Built target benchmark
[ 64%] Building CXX object examples/baby-llama/CMakeFiles/baby-llama.dir/baby-llama.cpp.o
[ 66%] Linking CXX executable ../../bin/baby-llama
[ 66%] Built target baby-llama
[ 67%] Building CXX object examples/train-text-from-scratch/CMakeFiles/train-text-from-scratch.dir/train-text-from-scratch.cpp.o
In file included from /Users/arjen/miniforge3/envs/oi/lib/llama.cpp/examples/train-text-from-scratch/train-text-from-scratch.cpp:3:
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
#define die_fmt(fmt, ...) do { fprintf(stderr, "error: " fmt "\n", ##__VA_ARGS__); exit(1); } while (0)
                                                                   ^
(the previous warning is repeated 59 more times)
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
75 warnings generated.
[ 69%] Linking CXX executable ../../bin/train-text-from-scratch
[ 69%] Built target train-text-from-scratch
[ 70%] Building CXX object examples/convert-llama2c-to-ggml/CMakeFiles/convert-llama2c-to-ggml.dir/convert-llama2c-to-ggml.cpp.o
In file included from /Users/arjen/miniforge3/envs/oi/lib/llama.cpp/examples/convert-llama2c-to-ggml/convert-llama2c-to-ggml.cpp:3:
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
#define die_fmt(fmt, ...) do { fprintf(stderr, "error: " fmt "\n", ##__VA_ARGS__); exit(1); } while (0)
                                                                   ^
/Users/arjen/miniforge3/envs/oi/lib/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
2 warnings generated.
[ 71%] Linking CXX executable ../../bin/convert-llama2c-to-ggml
[ 71%] Built target convert-llama2c-to-ggml
[ 73%] Building CXX object examples/simple/CMakeFiles/simple.dir/simple.cpp.o
[ 74%] Linking CXX executable ../../bin/simple
[ 74%] Built target simple
[ 76%] Building CXX object examples/speculative/CMakeFiles/speculative.dir/speculative.cpp.o
[ 77%] Linking CXX executable ../../bin/speculative
[ 77%] Built target speculative
[ 78%] Building CXX object examples/embd-input/CMakeFiles/embdinput.dir/embd-input-lib.cpp.o
[ 80%] Linking CXX static library libembdinput.a
[ 80%] Built target embdinput
[ 81%] Building CXX object examples/embd-input/CMakeFiles/embd-input-test.dir/embd-input-test.cpp.o
[ 83%] Linking CXX executable ../../bin/embd-input-test
[ 83%] Built target embd-input-test
[ 84%] Building CXX object examples/llama-bench/CMakeFiles/llama-bench.dir/llama-bench.cpp.o
[ 85%] Linking CXX executable ../../bin/llama-bench
[ 85%] Built target llama-bench
[ 87%] Building CXX object examples/beam-search/CMakeFiles/beam-search.dir/beam-search.cpp.o
[ 88%] Linking CXX executable ../../bin/beam-search
[ 88%] Built target beam-search
[ 90%] Building CXX object examples/metal/CMakeFiles/metal.dir/metal.cpp.o
[ 91%] Linking CXX executable ../../bin/metal
[ 91%] Built target metal
[ 92%] Building CXX object examples/server/CMakeFiles/server.dir/server.cpp.o
[ 94%] Linking CXX executable ../../bin/server
[ 94%] Built target server
[ 95%] Building CXX object pocs/vdot/CMakeFiles/vdot.dir/vdot.cpp.o
[ 97%] Linking CXX executable ../../bin/vdot
[ 97%] Built target vdot
[ 98%] Building CXX object pocs/vdot/CMakeFiles/q8dot.dir/q8dot.cpp.o
[100%] Linking CXX executable ../../bin/q8dot
[100%] Built target q8dot
primemp commented 1 year ago

@abetlen If I try the same in the llama-cpp-python folder (creating that build dir, entering it, and executing cmake ..) I'm getting this response:

CMake Error at /opt/homebrew/Cellar/cmake/3.27.4/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
  Could NOT find Threads (missing: Threads_FOUND)
Call Stack (most recent call first):
  /opt/homebrew/Cellar/cmake/3.27.4/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:600 (_FPHSA_FAILURE_MESSAGE)
  /opt/homebrew/Cellar/cmake/3.27.4/share/cmake/Modules/FindThreads.cmake:226 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
  vendor/llama.cpp/CMakeLists.txt:137 (find_package)

-- Configuring incomplete, errors occurred!
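
A minimal way to tell whether this Threads failure comes from the toolchain or from the package is to run CMake's stock FindThreads check against an empty project (a sketch; the /tmp path is arbitrary):

mkdir -p /tmp/threads-check && cd /tmp/threads-check
cat > CMakeLists.txt <<'EOF'
cmake_minimum_required(VERSION 3.12)
project(threads_check C)
# the same module llama.cpp's CMakeLists invokes via find_package(Threads)
find_package(Threads REQUIRED)
EOF
cmake -S . -B build

If this also fails from the same shell, the compiler/SDK setup itself is broken; if it passes, something in the pip build environment is interfering.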
abetlen commented 1 year ago

@primemp can you also post the output of cmake .. from llama.cpp?

icecoldt369 commented 1 year ago

@abetlen sorry, I'm not sure I'm following. I'm trying to launch llama-2 from the oobabooga_macos repo, but it raises the errors I posted previously. I made sure to download it manually per the documentation and installed the dependencies, but I can't seem to get past it.

primemp commented 1 year ago

@abetlen Thank you for your patience with me. Here's the output of cmake ..

-- The C compiler identification is AppleClang 14.0.0.14000029
-- The CXX compiler identification is AppleClang 14.0.0.14000029
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /Library/Developer/CommandLineTools/usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /Library/Developer/CommandLineTools/usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.37.1 (Apple Git-137.1)") 
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Accelerate framework found
-- Metal framework found
-- CMAKE_SYSTEM_PROCESSOR: arm64
-- ARM detected
-- Performing Test COMPILER_SUPPORTS_FP16_FORMAT_I3E
-- Performing Test COMPILER_SUPPORTS_FP16_FORMAT_I3E - Failed
-- Configuring done (0.6s)
-- Generating done (0.1s)
-- Build files have been written to: /Users/arjen/miniforge3/envs/oi/lib/llama.cpp/build
abetlen commented 1 year ago

@icecoldt369 the reason I asked you to build llama.cpp standalone is to compare the build logs from llama.cpp and llama-cpp-python and narrow down the error. If llama.cpp itself fails to build, then you're running into a bug upstream in llama.cpp rather than in these bindings.

abetlen commented 1 year ago

@primemp we're getting somewhere: it looks like the Threads package is found when building with CMake from the CLI, but not when the Python package is built. Two questions
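
One generic way to see whether the pip build resolves a different compiler or SDK than the interactive shell is to dump the toolchain state from inside the same conda env; nothing below is specific to llama-cpp-python, and conda envs sometimes export SDKROOT or CFLAGS values that make every CMake try-compile fail exactly like this:

xcode-select -p        # active developer directory
xcrun --show-sdk-path  # SDK the command-line compiler uses
cc --version           # compiler CMake resolves by default
env | grep -iE 'sdkroot|deployment_target|cflags|cxxflags|ldflags'  # conda-exported build vars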

icecoldt369 commented 1 year ago

@abetlen ah yes, I suspect this may be the case. Running pip install llama-cpp or CMAKE_ARGS="-DLLAMA_METAL=on" pip install llama-cpp gives:

ERROR: Could not find a version that satisfies the requirement llama-cpp (from versions: none)
ERROR: No matching distribution found for llama-cpp

Would you be able to point me to tips or further assistance? Thanks

abetlen commented 1 year ago

@icecoldt369 I think that's a typo, it should be llama-cpp-python

What I meant about building llama.cpp is following these steps and sharing the log

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build
cd build
cmake ..
cmake --build . --config Release
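
If it helps, the output of the last two steps can be captured straight to files for sharing (plain tee; the log file names are arbitrary):

cmake .. 2>&1 | tee configure.log
cmake --build . --config Release 2>&1 | tee build.log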
icecoldt369 commented 1 year ago

@abetlen oh, my bad haha. This is the output log:

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build
cd build
cmake ..
cmake --build . --config Release

Cloning into 'llama.cpp'...
remote: Enumerating objects: 8862, done.
remote: Counting objects: 100% (8862/8862), done.
remote: Compressing objects: 100% (2696/2696), done.
remote: Total 8862 (delta 6147), reused 8782 (delta 6106), pack-reused 0
Receiving objects: 100% (8862/8862), 8.29 MiB | 7.51 MiB/s, done.
Resolving deltas: 100% (6147/6147), done.
-- The C compiler identification is AppleClang 11.0.3.11030032
-- The CXX compiler identification is AppleClang 11.0.3.11030032
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /Library/Developer/CommandLineTools/usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /Library/Developer/CommandLineTools/usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/local/bin/git (found version "2.42.0")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Accelerate framework found
-- Metal framework found
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring done (1.5s)
-- Generating done (0.9s)
-- Build files have been written to: /Users/tevykuch/V2-Langchain/oobabooga_macos/llama.cpp/build
[  1%] Built target BUILD_INFO
[  2%] Building C object CMakeFiles/ggml.dir/ggml.c.o
/Users/tevykuch/V2-Langchain/oobabooga_macos/llama.cpp/ggml.c:2391:5: warning: implicit conversion increases floating-point precision: 'float' to 'ggml_float' (aka 'double') [-Wdouble-promotion]
    GGML_F16_VEC_REDUCE(sumf, sum);
    ^~~~~~~~~~
/Users/tevykuch/V2-Langchain/oobabooga_macos/llama.cpp/ggml.c:2023:37: note: expanded from macro 'GGML_F16_VEC_REDUCE'
#define GGML_F16_VEC_REDUCE GGML_F32Cx8_REDUCE
                                    ^
/Users/tevykuch/V2-Langchain/oobabooga_macos/llama.cpp/ggml.c:2013:33: note: expanded from macro 'GGML_F32Cx8_REDUCE'
#define GGML_F32Cx8_REDUCE GGML_F32x8_REDUCE
                                ^
/Users/tevykuch/V2-Langchain/oobabooga_macos/llama.cpp/ggml.c:1959:11: note: expanded from macro 'GGML_F32x8_REDUCE'
    res = _mm_cvtss_f32(_mm_hadd_ps(t1, t1)); \
        ~ ^~~~~~~~~~
/Users/tevykuch/V2-Langchain/oobabooga_macos/llama.cpp/ggml.c:3657:9: warning: implicit conversion increases floating-point precision: 'float' to 'ggml_float' (aka 'double') [-Wdouble-promotion]
        GGML_F16_VEC_REDUCE(sumf[k], sum[k]);
        ^~~~~~~~
/Users/tevykuch/V2-Langchain/oobabooga_macos/llama.cpp/ggml.c:2023:37: note: expanded from macro 'GGML_F16_VEC_REDUCE'
#define GGML_F16_VEC_REDUCE GGML_F32Cx8_REDUCE
                                    ^
/Users/tevykuch/V2-Langchain/oobabooga_macos/llama.cpp/ggml.c:2013:33: note: expanded from macro 'GGML_F32Cx8_REDUCE'
#define GGML_F32Cx8_REDUCE GGML_F32x8_REDUCE
                                ^
/Users/tevykuch/V2-Langchain/oobabooga_macos/llama.cpp/ggml.c:1959:11: note: expanded from macro 'GGML_F32x8_REDUCE'
    res = _mm_cvtss_f32(_mm_hadd_ps(t1, t1)); \
        ~ ^~~~~~~~~~
2 warnings generated.
[  4%] Building C object CMakeFiles/ggml.dir/ggml-alloc.c.o
[  5%] Building C object CMakeFiles/ggml.dir/ggml-metal.m.o
/Users/tevykuch/V2-Langchain/oobabooga_macos/llama.cpp/ggml-metal.m:613:5: error: use of undeclared identifier 'MTLComputePassDescriptor'
    MTLComputePassDescriptor * edesc = MTLComputePassDescriptor.computePassDescriptor;
    ^
/Users/tevykuch/V2-Langchain/oobabooga_macos/llama.cpp/ggml-metal.m:613:32: error: use of undeclared identifier 'edesc'
    MTLComputePassDescriptor * edesc = MTLComputePassDescriptor.computePassDescriptor;
                               ^
/Users/tevykuch/V2-Langchain/oobabooga_macos/llama.cpp/ggml-metal.m:613:40: error: use of undeclared identifier 'MTLComputePassDescriptor'
    MTLComputePassDescriptor * edesc = MTLComputePassDescriptor.computePassDescriptor;
                                       ^
/Users/tevykuch/V2-Langchain/oobabooga_macos/llama.cpp/ggml-metal.m:618:5: error: use of undeclared identifier 'edesc'
    edesc.dispatchType = has_concur ? MTLDispatchTypeConcurrent : MTLDispatchTypeSerial;
    ^
/Users/tevykuch/V2-Langchain/oobabooga_macos/llama.cpp/ggml-metal.m:631:61: warning: instance method '-computeCommandEncoderWithDescriptor:' not found (return type defaults to 'id') [-Wobjc-method-access]
        ctx->command_encoders[i] = [ctx->command_buffers[i] computeCommandEncoderWithDescriptor: edesc];
                                                            ^~~~~~~~~~~
/Users/tevykuch/V2-Langchain/oobabooga_macos/llama.cpp/ggml-metal.m:631:98: error: use of undeclared identifier 'edesc'
        ctx->command_encoders[i] = [ctx->command_buffers[i] computeCommandEncoderWithDescriptor: edesc];
                                                                                                 ^
/Users/tevykuch/V2-Langchain/oobabooga_macos/llama.cpp/ggml-metal.m:873:61: error: use of undeclared identifier 'MTLGPUFamilyApple7'
                   [ctx->device supportsFamily:MTLGPUFamilyApple7] &&
                                               ^
1 warning and 6 errors generated.
make[2]: *** [CMakeFiles/ggml.dir/ggml-metal.m.o] Error 1
make[1]: *** [CMakeFiles/ggml.dir/all] Error 2
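
The 'use of undeclared identifier MTLComputePassDescriptor / MTLGPUFamilyApple7' errors above usually mean the installed SDK (AppleClang 11.0.3 here) predates the Metal APIs this file uses. Two things worth trying, as a sketch rather than a guaranteed fix:

xcrun --show-sdk-version   # check how old the macOS SDK actually is
# update Xcode / Command Line Tools, or build without the Metal backend:
cmake .. -DLLAMA_METAL=OFF
cmake --build . --config Release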

primemp commented 1 year ago

@abetlen the cmake version is 3.27.4. The output of the pip install with --verbose:

(oi) arjen@Laptop-van-Arjen oi % pip install llama-cpp-python --verbose
Using pip 23.2.1 from /Users/arjen/miniforge3/envs/oi/lib/python3.11/site-packages/pip (python 3.11)
Collecting llama-cpp-python
  Using cached llama_cpp_python-0.2.4.tar.gz (1.5 MB)
  Running command pip subprocess to install build dependencies
  Collecting scikit-build-core[pyproject]>=0.5.0
    Obtaining dependency information for scikit-build-core[pyproject]>=0.5.0 from https://files.pythonhosted.org/packages/94/b8/fba31e512f4e1817e3adce4fa1e2dd73dd06b7013fca9671b6b5c19a0bae/scikit_build_core-0.5.0-py3-none-any.whl.metadata
    Using cached scikit_build_core-0.5.0-py3-none-any.whl.metadata (16 kB)
  Collecting packaging>=20.9 (from scikit-build-core[pyproject]>=0.5.0)
    Using cached packaging-23.1-py3-none-any.whl (48 kB)
  Collecting pathspec>=0.10.1 (from scikit-build-core[pyproject]>=0.5.0)
    Obtaining dependency information for pathspec>=0.10.1 from https://files.pythonhosted.org/packages/b4/2a/9b1be29146139ef459188f5e420a66e835dda921208db600b7037093891f/pathspec-0.11.2-py3-none-any.whl.metadata
    Using cached pathspec-0.11.2-py3-none-any.whl.metadata (19 kB)
  Collecting pyproject-metadata>=0.5 (from scikit-build-core[pyproject]>=0.5.0)
    Using cached pyproject_metadata-0.7.1-py3-none-any.whl (7.4 kB)
  Using cached pathspec-0.11.2-py3-none-any.whl (29 kB)
  Using cached scikit_build_core-0.5.0-py3-none-any.whl (129 kB)
  Installing collected packages: pathspec, packaging, scikit-build-core, pyproject-metadata
  Successfully installed packaging-23.1 pathspec-0.11.2 pyproject-metadata-0.7.1 scikit-build-core-0.5.0
  Installing build dependencies ... done
  Running command Getting requirements to build wheel
  Getting requirements to build wheel ... done
  Running command pip subprocess to install backend dependencies
  Collecting ninja>=1.5
    Using cached ninja-1.11.1-py2.py3-none-macosx_10_9_universal2.macosx_10_9_x86_64.macosx_11_0_arm64.macosx_11_0_universal2.whl (270 kB)
  Installing collected packages: ninja
  Successfully installed ninja-1.11.1
  Installing backend dependencies ... done
  Running command Preparing metadata (pyproject.toml)
  *** scikit-build-core 0.5.0 using CMake 3.27.4 (metadata_wheel)
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: typing-extensions>=4.5.0 in ./lib/python3.11/site-packages (from llama-cpp-python) (4.7.1)
  Link requires a different Python (3.11.4 not in: '>=3.7,<3.11'): https://files.pythonhosted.org/packages/3a/be/650f9c091ef71cb01d735775d554e068752d3ff63d7943b26316dc401749/numpy-1.21.2.zip (from https://pypi.org/simple/numpy/) (requires-python:>=3.7,<3.11)
  Link requires a different Python (3.11.4 not in: '>=3.7,<3.11'): https://files.pythonhosted.org/packages/5f/d6/ad58ded26556eaeaa8c971e08b6466f17c4ac4d786cd3d800e26ce59cc01/numpy-1.21.3.zip (from https://pypi.org/simple/numpy/) (requires-python:>=3.7,<3.11)
  Link requires a different Python (3.11.4 not in: '>=3.7,<3.11'): https://files.pythonhosted.org/packages/fb/48/b0708ebd7718a8933f0d3937513ef8ef2f4f04529f1f66ca86d873043921/numpy-1.21.4.zip (from https://pypi.org/simple/numpy/) (requires-python:>=3.7,<3.11)
  Link requires a different Python (3.11.4 not in: '>=3.7,<3.11'): https://files.pythonhosted.org/packages/c2/a8/a924a09492bdfee8c2ec3094d0a13f2799800b4fdc9c890738aeeb12c72e/numpy-1.21.5.zip (from https://pypi.org/simple/numpy/) (requires-python:>=3.7,<3.11)
  Link requires a different Python (3.11.4 not in: '>=3.7,<3.11'): https://files.pythonhosted.org/packages/45/b7/de7b8e67f2232c26af57c205aaad29fe17754f793404f59c8a730c7a191a/numpy-1.21.6.zip (from https://pypi.org/simple/numpy/) (requires-python:>=3.7,<3.11)
Collecting numpy>=1.20.0 (from llama-cpp-python)
  Obtaining dependency information for numpy>=1.20.0 from https://files.pythonhosted.org/packages/86/a1/b8ef999c32f26a97b5f714887e21f96c12ae99a38583a0a96e65283ac0a1/numpy-1.25.2-cp311-cp311-macosx_11_0_arm64.whl.metadata
  Using cached numpy-1.25.2-cp311-cp311-macosx_11_0_arm64.whl.metadata (5.6 kB)
Collecting diskcache>=5.6.1 (from llama-cpp-python)
  Obtaining dependency information for diskcache>=5.6.1 from https://files.pythonhosted.org/packages/3f/27/4570e78fc0bf5ea0ca45eb1de3818a23787af9b390c0b0a0033a1b8236f9/diskcache-5.6.3-py3-none-any.whl.metadata
  Using cached diskcache-5.6.3-py3-none-any.whl.metadata (20 kB)
Using cached diskcache-5.6.3-py3-none-any.whl (45 kB)
Using cached numpy-1.25.2-cp311-cp311-macosx_11_0_arm64.whl (14.0 MB)
Building wheels for collected packages: llama-cpp-python
  Running command Building wheel for llama-cpp-python (pyproject.toml)
  *** scikit-build-core 0.5.0 using CMake 3.27.4 (wheel)
  *** Configuring CMake...
  2023-09-14 18:47:00,454 - scikit_build_core - WARNING - libdir/ldlibrary: /Users/arjen/miniforge3/envs/oi/lib/libpython3.11.a is not a real file!
  2023-09-14 18:47:00,454 - scikit_build_core - WARNING - Can't find a Python library, got libdir=/Users/arjen/miniforge3/envs/oi/lib, ldlibrary=libpython3.11.a, multiarch=darwin, masd=None
  loading initial cache file /var/folders/r4/9mrbh04j1_gc4h5z0m3f52d80000gn/T/tmpunyvz3fz/build/CMakeInit.txt
  -- The C compiler identification is AppleClang 14.0.0.14000029
  -- The CXX compiler identification is AppleClang 14.0.0.14000029
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: /Library/Developer/CommandLineTools/usr/bin/cc - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: /Library/Developer/CommandLineTools/usr/bin/c++ - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Found Git: /usr/bin/git (found version "2.37.1 (Apple Git-137.1)")
  fatal: not a git repository (or any of the parent directories): .git
  fatal: not a git repository (or any of the parent directories): .git
  CMake Warning at vendor/llama.cpp/CMakeLists.txt:125 (message):
    Git repository not found; to enable automatic generation of build info,
    make sure Git is installed and the project is a Git repository.

  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
  -- Check if compiler accepts -pthread
  -- Check if compiler accepts -pthread - no
  -- Looking for pthread_create in pthreads
  -- Looking for pthread_create in pthreads - not found
  -- Looking for pthread_create in pthread
  -- Looking for pthread_create in pthread - not found
  CMake Error at /opt/homebrew/Cellar/cmake/3.27.4/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
    Could NOT find Threads (missing: Threads_FOUND)
  Call Stack (most recent call first):
    /opt/homebrew/Cellar/cmake/3.27.4/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:600 (_FPHSA_FAILURE_MESSAGE)
    /opt/homebrew/Cellar/cmake/3.27.4/share/cmake/Modules/FindThreads.cmake:226 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
    vendor/llama.cpp/CMakeLists.txt:137 (find_package)

  -- Configuring incomplete, errors occurred!

  *** CMake configuration failed
  error: subprocess-exited-with-error

  × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
  full command: /Users/arjen/miniforge3/envs/oi/bin/python3.11 /Users/arjen/miniforge3/envs/oi/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py build_wheel /var/folders/r4/9mrbh04j1_gc4h5z0m3f52d80000gn/T/tmpv4vp72sf
  cwd: /private/var/folders/r4/9mrbh04j1_gc4h5z0m3f52d80000gn/T/pip-install-96gos_p8/llama-cpp-python_fd21e0172d6d48b3a9cea243d92f724f
  Building wheel for llama-cpp-python (pyproject.toml) ... error
  ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
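
The failing CMAKE_HAVE_LIBC_PTHREAD test is just a tiny try-compile, so running the equivalent by hand from the same conda env exposes the compiler's real error message, which CMake hides (a sketch; the /tmp paths are arbitrary):

cat > /tmp/pthread_check.c <<'EOF'
#include <pthread.h>
static void *fn(void *p) { return p; }
int main(void) {
    pthread_t t;                      /* mirrors CMake's pthread probe */
    pthread_create(&t, NULL, fn, NULL);
    pthread_join(t, NULL);
    return 0;
}
EOF
cc /tmp/pthread_check.c -o /tmp/pthread_check && echo "pthread OK"

If this fails inside the env but works in a fresh shell, something the env exports (SDKROOT, CFLAGS, ...) is breaking every try-compile, which would also explain the Threads error.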
abetlen commented 1 year ago

@icecoldt369 okay, it looks like your issue is actually a llama.cpp build error. Do you mind opening an issue there? If it gets resolved, I can merge in the fix and llama-cpp-python should work for you.

abetlen commented 1 year ago

@primemp can you try pip install --verbose git+https://github.com/abetlen/llama-cpp-python.git. I made a small change to the build, but I'm not sure whether it will fix it for you.

primemp commented 1 year ago

@abetlen I tried that command. Output:

(oi) arjen@Laptop-van-Arjen oi % pip install --verbose git+https://github.com/abetlen/llama-cpp-python.git
Using pip 23.2.1 from /Users/arjen/miniforge3/envs/oi/lib/python3.11/site-packages/pip (python 3.11)
Collecting git+https://github.com/abetlen/llama-cpp-python.git
  Cloning https://github.com/abetlen/llama-cpp-python.git to /private/var/folders/r4/9mrbh04j1_gc4h5z0m3f52d80000gn/T/pip-req-build-gqpad37c
  Running command git version
  git version 2.37.1 (Apple Git-137.1)
  Running command git clone --filter=blob:none https://github.com/abetlen/llama-cpp-python.git /private/var/folders/r4/9mrbh04j1_gc4h5z0m3f52d80000gn/T/pip-req-build-gqpad37c
  Cloning into '/private/var/folders/r4/9mrbh04j1_gc4h5z0m3f52d80000gn/T/pip-req-build-gqpad37c'...
  Running command git rev-parse HEAD
  65a2a200506806200aaa2bc27f18e576eaed687c
  Resolved https://github.com/abetlen/llama-cpp-python.git to commit 65a2a200506806200aaa2bc27f18e576eaed687c
  Running command git submodule update --init --recursive -q
  Running command git rev-parse HEAD
  65a2a200506806200aaa2bc27f18e576eaed687c
  Running command pip subprocess to install build dependencies
  Collecting scikit-build-core[pyproject]>=0.5.0
    Obtaining dependency information for scikit-build-core[pyproject]>=0.5.0 from https://files.pythonhosted.org/packages/94/b8/fba31e512f4e1817e3adce4fa1e2dd73dd06b7013fca9671b6b5c19a0bae/scikit_build_core-0.5.0-py3-none-any.whl.metadata
    Using cached scikit_build_core-0.5.0-py3-none-any.whl.metadata (16 kB)
  Collecting packaging>=20.9 (from scikit-build-core[pyproject]>=0.5.0)
    Using cached packaging-23.1-py3-none-any.whl (48 kB)
  Collecting pathspec>=0.10.1 (from scikit-build-core[pyproject]>=0.5.0)
    Obtaining dependency information for pathspec>=0.10.1 from https://files.pythonhosted.org/packages/b4/2a/9b1be29146139ef459188f5e420a66e835dda921208db600b7037093891f/pathspec-0.11.2-py3-none-any.whl.metadata
    Using cached pathspec-0.11.2-py3-none-any.whl.metadata (19 kB)
  Collecting pyproject-metadata>=0.5 (from scikit-build-core[pyproject]>=0.5.0)
    Using cached pyproject_metadata-0.7.1-py3-none-any.whl (7.4 kB)
  Using cached pathspec-0.11.2-py3-none-any.whl (29 kB)
  Using cached scikit_build_core-0.5.0-py3-none-any.whl (129 kB)
  Installing collected packages: pathspec, packaging, scikit-build-core, pyproject-metadata
  Successfully installed packaging-23.1 pathspec-0.11.2 pyproject-metadata-0.7.1 scikit-build-core-0.5.0
  Installing build dependencies ... done
  Running command Getting requirements to build wheel
  Getting requirements to build wheel ... done
  Running command pip subprocess to install backend dependencies
  Collecting ninja>=1.5
    Using cached ninja-1.11.1-py2.py3-none-macosx_10_9_universal2.macosx_10_9_x86_64.macosx_11_0_arm64.macosx_11_0_universal2.whl (270 kB)
  Installing collected packages: ninja
  Successfully installed ninja-1.11.1
  Installing backend dependencies ... done
  Running command Preparing metadata (pyproject.toml)
  *** scikit-build-core 0.5.0 using CMake 3.27.4 (metadata_wheel)
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: typing-extensions>=4.5.0 in ./lib/python3.11/site-packages (from llama_cpp_python==0.2.4) (4.7.1)
  Link requires a different Python (3.11.4 not in: '>=3.7,<3.11'): https://files.pythonhosted.org/packages/3a/be/650f9c091ef71cb01d735775d554e068752d3ff63d7943b26316dc401749/numpy-1.21.2.zip (from https://pypi.org/simple/numpy/) (requires-python:>=3.7,<3.11)
  Link requires a different Python (3.11.4 not in: '>=3.7,<3.11'): https://files.pythonhosted.org/packages/5f/d6/ad58ded26556eaeaa8c971e08b6466f17c4ac4d786cd3d800e26ce59cc01/numpy-1.21.3.zip (from https://pypi.org/simple/numpy/) (requires-python:>=3.7,<3.11)
  Link requires a different Python (3.11.4 not in: '>=3.7,<3.11'): https://files.pythonhosted.org/packages/fb/48/b0708ebd7718a8933f0d3937513ef8ef2f4f04529f1f66ca86d873043921/numpy-1.21.4.zip (from https://pypi.org/simple/numpy/) (requires-python:>=3.7,<3.11)
  Link requires a different Python (3.11.4 not in: '>=3.7,<3.11'): https://files.pythonhosted.org/packages/c2/a8/a924a09492bdfee8c2ec3094d0a13f2799800b4fdc9c890738aeeb12c72e/numpy-1.21.5.zip (from https://pypi.org/simple/numpy/) (requires-python:>=3.7,<3.11)
  Link requires a different Python (3.11.4 not in: '>=3.7,<3.11'): https://files.pythonhosted.org/packages/45/b7/de7b8e67f2232c26af57c205aaad29fe17754f793404f59c8a730c7a191a/numpy-1.21.6.zip (from https://pypi.org/simple/numpy/) (requires-python:>=3.7,<3.11)
Collecting numpy>=1.20.0 (from llama_cpp_python==0.2.4)
  Obtaining dependency information for numpy>=1.20.0 from https://files.pythonhosted.org/packages/86/a1/b8ef999c32f26a97b5f714887e21f96c12ae99a38583a0a96e65283ac0a1/numpy-1.25.2-cp311-cp311-macosx_11_0_arm64.whl.metadata
  Using cached numpy-1.25.2-cp311-cp311-macosx_11_0_arm64.whl.metadata (5.6 kB)
Collecting diskcache>=5.6.1 (from llama_cpp_python==0.2.4)
  Obtaining dependency information for diskcache>=5.6.1 from https://files.pythonhosted.org/packages/3f/27/4570e78fc0bf5ea0ca45eb1de3818a23787af9b390c0b0a0033a1b8236f9/diskcache-5.6.3-py3-none-any.whl.metadata
  Using cached diskcache-5.6.3-py3-none-any.whl.metadata (20 kB)
Using cached diskcache-5.6.3-py3-none-any.whl (45 kB)
Using cached numpy-1.25.2-cp311-cp311-macosx_11_0_arm64.whl (14.0 MB)
Building wheels for collected packages: llama_cpp_python
  Running command Building wheel for llama_cpp_python (pyproject.toml)
  *** scikit-build-core 0.5.0 using CMake 3.27.4 (wheel)
  *** Configuring CMake...
  2023-09-14 19:23:35,972 - scikit_build_core - WARNING - libdir/ldlibrary: /Users/arjen/miniforge3/envs/oi/lib/libpython3.11.a is not a real file!
  2023-09-14 19:23:35,972 - scikit_build_core - WARNING - Can't find a Python library, got libdir=/Users/arjen/miniforge3/envs/oi/lib, ldlibrary=libpython3.11.a, multiarch=darwin, masd=None
  loading initial cache file /var/folders/r4/9mrbh04j1_gc4h5z0m3f52d80000gn/T/tmph64lixfb/build/CMakeInit.txt
  -- The C compiler identification is AppleClang 14.0.0.14000029
  -- The CXX compiler identification is AppleClang 14.0.0.14000029
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: /Library/Developer/CommandLineTools/usr/bin/cc - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: /Library/Developer/CommandLineTools/usr/bin/c++ - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Found Git: /usr/bin/git (found version "2.37.1 (Apple Git-137.1)")
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
  -- Check if compiler accepts -pthread
  -- Check if compiler accepts -pthread - no
  -- Looking for pthread_create in pthreads
  -- Looking for pthread_create in pthreads - not found
  -- Looking for pthread_create in pthread
  -- Looking for pthread_create in pthread - not found
  CMake Error at /opt/homebrew/Cellar/cmake/3.27.4/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
    Could NOT find Threads (missing: Threads_FOUND)
  Call Stack (most recent call first):
    /opt/homebrew/Cellar/cmake/3.27.4/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:600 (_FPHSA_FAILURE_MESSAGE)
    /opt/homebrew/Cellar/cmake/3.27.4/share/cmake/Modules/FindThreads.cmake:226 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
    vendor/llama.cpp/CMakeLists.txt:137 (find_package)

  -- Configuring incomplete, errors occurred!

  *** CMake configuration failed
  error: subprocess-exited-with-error

  × Building wheel for llama_cpp_python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
  full command: /Users/arjen/miniforge3/envs/oi/bin/python3.11 /Users/arjen/miniforge3/envs/oi/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py build_wheel /var/folders/r4/9mrbh04j1_gc4h5z0m3f52d80000gn/T/tmp_1yhldej
  cwd: /private/var/folders/r4/9mrbh04j1_gc4h5z0m3f52d80000gn/T/pip-req-build-gqpad37c
  Building wheel for llama_cpp_python (pyproject.toml) ... error
  ERROR: Failed building wheel for llama_cpp_python
Failed to build llama_cpp_python
ERROR: Could not build wheels for llama_cpp_python, which is required to install pyproject.toml-based projects
jzavala-gonzalez commented 1 year ago

Hey, thank you for the thread! I'm having this issue too, so I'm including the log from the verbose pip install, followed by the log from a successful compilation of llama.cpp.

Here's the failing log with that CMAKE_HAVE_LIBC_PTHREAD error:

❯ pip install --verbose git+https://github.com/abetlen/llama-cpp-python.git
Using pip 23.2.1 from /Users/gabriel/anaconda3/envs/cabildo-extract/lib/python3.11/site-packages/pip (python 3.11)
Collecting git+https://github.com/abetlen/llama-cpp-python.git
  Cloning https://github.com/abetlen/llama-cpp-python.git to /private/var/folders/ts/_0b3z_qn7sd72nfffjkydm8h0000gn/T/pip-req-build-ff7waofa
  Running command git version
  git version 2.32.1 (Apple Git-133)
  Running command git clone --filter=blob:none https://github.com/abetlen/llama-cpp-python.git /private/var/folders/ts/_0b3z_qn7sd72nfffjkydm8h0000gn/T/pip-req-build-ff7waofa
  Cloning into '/private/var/folders/ts/_0b3z_qn7sd72nfffjkydm8h0000gn/T/pip-req-build-ff7waofa'...
  Running command git rev-parse HEAD
  ca4eb952a61efe3cd9a67a7d88b112427895251b
  Resolved https://github.com/abetlen/llama-cpp-python.git to commit ca4eb952a61efe3cd9a67a7d88b112427895251b
  Running command git submodule update --init --recursive -q
  Running command git rev-parse HEAD
  ca4eb952a61efe3cd9a67a7d88b112427895251b
  Running command pip subprocess to install build dependencies
  Collecting scikit-build-core[pyproject]>=0.5.0
    Obtaining dependency information for scikit-build-core[pyproject]>=0.5.0 from https://files.pythonhosted.org/packages/94/b8/fba31e512f4e1817e3adce4fa1e2dd73dd06b7013fca9671b6b5c19a0bae/scikit_build_core-0.5.0-py3-none-any.whl.metadata
    Using cached scikit_build_core-0.5.0-py3-none-any.whl.metadata (16 kB)
  Collecting packaging>=20.9 (from scikit-build-core[pyproject]>=0.5.0)
    Using cached packaging-23.1-py3-none-any.whl (48 kB)
  Collecting pathspec>=0.10.1 (from scikit-build-core[pyproject]>=0.5.0)
    Obtaining dependency information for pathspec>=0.10.1 from https://files.pythonhosted.org/packages/b4/2a/9b1be29146139ef459188f5e420a66e835dda921208db600b7037093891f/pathspec-0.11.2-py3-none-any.whl.metadata
    Using cached pathspec-0.11.2-py3-none-any.whl.metadata (19 kB)
  Collecting pyproject-metadata>=0.5 (from scikit-build-core[pyproject]>=0.5.0)
    Using cached pyproject_metadata-0.7.1-py3-none-any.whl (7.4 kB)
  Using cached pathspec-0.11.2-py3-none-any.whl (29 kB)
  Using cached scikit_build_core-0.5.0-py3-none-any.whl (129 kB)
  Installing collected packages: pathspec, packaging, scikit-build-core, pyproject-metadata
  Successfully installed packaging-23.1 pathspec-0.11.2 pyproject-metadata-0.7.1 scikit-build-core-0.5.0
  Installing build dependencies ... done
  Running command Getting requirements to build wheel
  Getting requirements to build wheel ... done
  Running command pip subprocess to install backend dependencies
  Collecting ninja>=1.5
    Using cached ninja-1.11.1-py2.py3-none-macosx_10_9_universal2.macosx_10_9_x86_64.macosx_11_0_arm64.macosx_11_0_universal2.whl (270 kB)
  Installing collected packages: ninja
  Successfully installed ninja-1.11.1
  Installing backend dependencies ... done
  Running command Preparing metadata (pyproject.toml)
  *** scikit-build-core 0.5.0 using CMake 3.27.4 (metadata_wheel)
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: typing-extensions>=4.5.0 in /Users/gabriel/anaconda3/envs/cabildo-extract/lib/python3.11/site-packages (from llama_cpp_python==0.2.4) (4.7.1)
Requirement already satisfied: numpy>=1.20.0 in /Users/gabriel/anaconda3/envs/cabildo-extract/lib/python3.11/site-packages (from llama_cpp_python==0.2.4) (1.25.2)
Collecting diskcache>=5.6.1 (from llama_cpp_python==0.2.4)
  Obtaining dependency information for diskcache>=5.6.1 from https://files.pythonhosted.org/packages/3f/27/4570e78fc0bf5ea0ca45eb1de3818a23787af9b390c0b0a0033a1b8236f9/diskcache-5.6.3-py3-none-any.whl.metadata
  Downloading diskcache-5.6.3-py3-none-any.whl.metadata (20 kB)
Downloading diskcache-5.6.3-py3-none-any.whl (45 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.5/45.5 kB 716.4 kB/s eta 0:00:00
Building wheels for collected packages: llama_cpp_python
  Running command Building wheel for llama_cpp_python (pyproject.toml)
  *** scikit-build-core 0.5.0 using CMake 3.27.4 (wheel)
  *** Configuring CMake...
  2023-09-14 16:09:55,364 - scikit_build_core - WARNING - libdir/ldlibrary: /Users/gabriel/anaconda3/envs/cabildo-extract/lib/libpython3.11.a is not a real file!
  2023-09-14 16:09:55,364 - scikit_build_core - WARNING - Can't find a Python library, got libdir=/Users/gabriel/anaconda3/envs/cabildo-extract/lib, ldlibrary=libpython3.11.a, multiarch=darwin, masd=None
  loading initial cache file /var/folders/ts/_0b3z_qn7sd72nfffjkydm8h0000gn/T/tmpmipahvnx/build/CMakeInit.txt
  -- The C compiler identification is AppleClang 13.1.6.13160021
  -- The CXX compiler identification is AppleClang 13.1.6.13160021
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: /Library/Developer/CommandLineTools/usr/bin/cc - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: /Library/Developer/CommandLineTools/usr/bin/c++ - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Found Git: /usr/bin/git (found version "2.32.1 (Apple Git-133)")
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
  -- Check if compiler accepts -pthread
  -- Check if compiler accepts -pthread - no
  -- Looking for pthread_create in pthreads
  -- Looking for pthread_create in pthreads - not found
  -- Looking for pthread_create in pthread
  -- Looking for pthread_create in pthread - not found
  CMake Error at /opt/homebrew/Cellar/cmake/3.27.4/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
    Could NOT find Threads (missing: Threads_FOUND)
  Call Stack (most recent call first):
    /opt/homebrew/Cellar/cmake/3.27.4/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:600 (_FPHSA_FAILURE_MESSAGE)
    /opt/homebrew/Cellar/cmake/3.27.4/share/cmake/Modules/FindThreads.cmake:226 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
    vendor/llama.cpp/CMakeLists.txt:137 (find_package)

  -- Configuring incomplete, errors occurred!

  *** CMake configuration failed
  error: subprocess-exited-with-error

  × Building wheel for llama_cpp_python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
  full command: /Users/gabriel/anaconda3/envs/cabildo-extract/bin/python3.11 /Users/gabriel/anaconda3/envs/cabildo-extract/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py build_wheel /var/folders/ts/_0b3z_qn7sd72nfffjkydm8h0000gn/T/tmpsyf3bajc
  cwd: /private/var/folders/ts/_0b3z_qn7sd72nfffjkydm8h0000gn/T/pip-req-build-ff7waofa
  Building wheel for llama_cpp_python (pyproject.toml) ... error
  ERROR: Failed building wheel for llama_cpp_python
Failed to build llama_cpp_python
ERROR: Could not build wheels for llama_cpp_python, which is required to install pyproject.toml-based projects
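
Both runs use the same AppleClang 13.1.6, so the first divergence to hunt for is the pthread check itself. Assuming the failing output above and the successful one below are saved to files (the names here are made up), a grep pins it down:

grep -nE 'CMAKE_HAVE_LIBC_PTHREAD|Found Threads' pip-install.log standalone-build.log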

And here's the compilation for llama.cpp:

❯ cmake ..
-- The C compiler identification is AppleClang 13.1.6.13160021
-- The CXX compiler identification is AppleClang 13.1.6.13160021
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /Library/Developer/CommandLineTools/usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /Library/Developer/CommandLineTools/usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.32.1 (Apple Git-133)") 
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Accelerate framework found
-- Metal framework found
-- CMAKE_SYSTEM_PROCESSOR: arm64
-- ARM detected
-- Performing Test COMPILER_SUPPORTS_FP16_FORMAT_I3E
-- Performing Test COMPILER_SUPPORTS_FP16_FORMAT_I3E - Failed
-- Configuring done (0.7s)
-- Generating done (0.4s)
-- Build files have been written to: /Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/build
❯ cmake --build . --config Release
[  1%] Built target BUILD_INFO
[  2%] Building C object CMakeFiles/ggml.dir/ggml.c.o
[  4%] Building C object CMakeFiles/ggml.dir/ggml-alloc.c.o
[  5%] Building C object CMakeFiles/ggml.dir/ggml-metal.m.o
[  7%] Building C object CMakeFiles/ggml.dir/k_quants.c.o
[  7%] Built target ggml
[  8%] Linking C static library libggml_static.a
[  8%] Built target ggml_static
[  9%] Building CXX object CMakeFiles/llama.dir/llama.cpp.o
[ 11%] Linking CXX static library libllama.a
[ 11%] Built target llama
[ 12%] Building CXX object common/CMakeFiles/common.dir/common.cpp.o
[ 14%] Building CXX object common/CMakeFiles/common.dir/console.cpp.o
[ 15%] Building CXX object common/CMakeFiles/common.dir/grammar-parser.cpp.o
[ 15%] Built target common
[ 16%] Building CXX object tests/CMakeFiles/test-quantize-fns.dir/test-quantize-fns.cpp.o
[ 18%] Linking CXX executable ../bin/test-quantize-fns
[ 18%] Built target test-quantize-fns
[ 19%] Building CXX object tests/CMakeFiles/test-quantize-perf.dir/test-quantize-perf.cpp.o
[ 21%] Linking CXX executable ../bin/test-quantize-perf
[ 21%] Built target test-quantize-perf
[ 22%] Building CXX object tests/CMakeFiles/test-sampling.dir/test-sampling.cpp.o
[ 23%] Linking CXX executable ../bin/test-sampling
[ 23%] Built target test-sampling
[ 25%] Building CXX object tests/CMakeFiles/test-tokenizer-0-llama.dir/test-tokenizer-0-llama.cpp.o
[ 26%] Linking CXX executable ../bin/test-tokenizer-0-llama
[ 26%] Built target test-tokenizer-0-llama
[ 28%] Building CXX object tests/CMakeFiles/test-tokenizer-0-falcon.dir/test-tokenizer-0-falcon.cpp.o
[ 29%] Linking CXX executable ../bin/test-tokenizer-0-falcon
[ 29%] Built target test-tokenizer-0-falcon
[ 30%] Building CXX object tests/CMakeFiles/test-tokenizer-1-llama.dir/test-tokenizer-1-llama.cpp.o
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/tests/test-tokenizer-1-llama.cpp:91:43: warning: format specifies type 'unsigned long long' but the argument has type 'std::basic_string<char>::size_type' (aka 'unsigned long') [-Wformat]
                __func__, i, str.c_str(), str.length(), check.c_str(), check.length());
                                          ^~~~~~~~~~~~
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/tests/test-tokenizer-1-llama.cpp:91:72: warning: format specifies type 'unsigned long long' but the argument has type 'std::basic_string<char>::size_type' (aka 'unsigned long') [-Wformat]
                __func__, i, str.c_str(), str.length(), check.c_str(), check.length());
                                                                       ^~~~~~~~~~~~~~
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/tests/test-tokenizer-1-llama.cpp:104:50: warning: format specifies type 'unsigned long long' but the argument has type 'std::basic_string<char>::size_type' (aka 'unsigned long') [-Wformat]
                    __func__, cp, check.c_str(), check.length(), str.c_str(), str.length());
                                                 ^~~~~~~~~~~~~~
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/tests/test-tokenizer-1-llama.cpp:104:79: warning: format specifies type 'unsigned long long' but the argument has type 'std::basic_string<char>::size_type' (aka 'unsigned long') [-Wformat]
                    __func__, cp, check.c_str(), check.length(), str.c_str(), str.length());
                                                                              ^~~~~~~~~~~~
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/tests/test-tokenizer-1-llama.cpp:116:46: warning: format specifies type 'unsigned long long' but the argument has type 'std::basic_string<char>::size_type' (aka 'unsigned long') [-Wformat]
                __func__, cp, check.c_str(), check.length(), str.c_str(), str.length());
                                             ^~~~~~~~~~~~~~
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/tests/test-tokenizer-1-llama.cpp:116:75: warning: format specifies type 'unsigned long long' but the argument has type 'std::basic_string<char>::size_type' (aka 'unsigned long') [-Wformat]
                __func__, cp, check.c_str(), check.length(), str.c_str(), str.length());
                                                                          ^~~~~~~~~~~~
6 warnings generated.
[ 32%] Linking CXX executable ../bin/test-tokenizer-1-llama
[ 32%] Built target test-tokenizer-1-llama
[ 33%] Building CXX object tests/CMakeFiles/test-grammar-parser.dir/test-grammar-parser.cpp.o
[ 35%] Linking CXX executable ../bin/test-grammar-parser
[ 35%] Built target test-grammar-parser
[ 36%] Building CXX object tests/CMakeFiles/test-llama-grammar.dir/test-llama-grammar.cpp.o
[ 38%] Linking CXX executable ../bin/test-llama-grammar
[ 38%] Built target test-llama-grammar
[ 39%] Building CXX object tests/CMakeFiles/test-grad0.dir/test-grad0.cpp.o
[ 40%] Linking CXX executable ../bin/test-grad0
[ 40%] Built target test-grad0
[ 42%] Building C object tests/CMakeFiles/test-c.dir/test-c.c.o
[ 43%] Linking CXX executable ../bin/test-c
[ 43%] Built target test-c
[ 45%] Building CXX object examples/main/CMakeFiles/main.dir/main.cpp.o
[ 46%] Linking CXX executable ../../bin/main
[ 46%] Built target main
[ 47%] Building CXX object examples/quantize/CMakeFiles/quantize.dir/quantize.cpp.o
[ 49%] Linking CXX executable ../../bin/quantize
[ 49%] Built target quantize
[ 50%] Building CXX object examples/quantize-stats/CMakeFiles/quantize-stats.dir/quantize-stats.cpp.o
[ 52%] Linking CXX executable ../../bin/quantize-stats
[ 52%] Built target quantize-stats
[ 53%] Building CXX object examples/perplexity/CMakeFiles/perplexity.dir/perplexity.cpp.o
[ 54%] Linking CXX executable ../../bin/perplexity
[ 54%] Built target perplexity
[ 56%] Building CXX object examples/embedding/CMakeFiles/embedding.dir/embedding.cpp.o
[ 57%] Linking CXX executable ../../bin/embedding
[ 57%] Built target embedding
[ 59%] Building CXX object examples/save-load-state/CMakeFiles/save-load-state.dir/save-load-state.cpp.o
[ 60%] Linking CXX executable ../../bin/save-load-state
[ 60%] Built target save-load-state
[ 61%] Building CXX object examples/benchmark/CMakeFiles/benchmark.dir/benchmark-matmult.cpp.o
[ 63%] Linking CXX executable ../../bin/benchmark
[ 63%] Built target benchmark
[ 64%] Building CXX object examples/baby-llama/CMakeFiles/baby-llama.dir/baby-llama.cpp.o
[ 66%] Linking CXX executable ../../bin/baby-llama
[ 66%] Built target baby-llama
[ 67%] Building CXX object examples/train-text-from-scratch/CMakeFiles/train-text-from-scratch.dir/train-text-from-scratch.cpp.o
In file included from /Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/examples/train-text-from-scratch/train-text-from-scratch.cpp:3:
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
#define die_fmt(fmt, ...) do { fprintf(stderr, "error: " fmt "\n", ##__VA_ARGS__); exit(1); } while (0)
                                                                   ^
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
[the same warning repeated for the remainder of the translation unit]
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
75 warnings generated.
[ 69%] Linking CXX executable ../../bin/train-text-from-scratch
[ 69%] Built target train-text-from-scratch
[ 70%] Building CXX object examples/convert-llama2c-to-ggml/CMakeFiles/convert-llama2c-to-ggml.dir/convert-llama2c-to-ggml.cpp.o
In file included from /Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/examples/convert-llama2c-to-ggml/convert-llama2c-to-ggml.cpp:3:
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
#define die_fmt(fmt, ...) do { fprintf(stderr, "error: " fmt "\n", ##__VA_ARGS__); exit(1); } while (0)
                                                                   ^
/Users/gabriel/Documents/GitHub/Repos/ggerganov/llama.cpp/common/./common.h:24:68: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
2 warnings generated.
[ 71%] Linking CXX executable ../../bin/convert-llama2c-to-ggml
[ 71%] Built target convert-llama2c-to-ggml
[ 73%] Building CXX object examples/simple/CMakeFiles/simple.dir/simple.cpp.o
[ 74%] Linking CXX executable ../../bin/simple
[ 74%] Built target simple
[ 76%] Building CXX object examples/speculative/CMakeFiles/speculative.dir/speculative.cpp.o
[ 77%] Linking CXX executable ../../bin/speculative
[ 77%] Built target speculative
[ 78%] Building CXX object examples/embd-input/CMakeFiles/embdinput.dir/embd-input-lib.cpp.o
[ 80%] Linking CXX static library libembdinput.a
[ 80%] Built target embdinput
[ 81%] Building CXX object examples/embd-input/CMakeFiles/embd-input-test.dir/embd-input-test.cpp.o
[ 83%] Linking CXX executable ../../bin/embd-input-test
[ 83%] Built target embd-input-test
[ 84%] Building CXX object examples/llama-bench/CMakeFiles/llama-bench.dir/llama-bench.cpp.o
[ 85%] Linking CXX executable ../../bin/llama-bench
[ 85%] Built target llama-bench
[ 87%] Building CXX object examples/beam-search/CMakeFiles/beam-search.dir/beam-search.cpp.o
[ 88%] Linking CXX executable ../../bin/beam-search
[ 88%] Built target beam-search
[ 90%] Building CXX object examples/metal/CMakeFiles/metal.dir/metal.cpp.o
[ 91%] Linking CXX executable ../../bin/metal
[ 91%] Built target metal
[ 92%] Building CXX object examples/server/CMakeFiles/server.dir/server.cpp.o
[ 94%] Linking CXX executable ../../bin/server
[ 94%] Built target server
[ 95%] Building CXX object pocs/vdot/CMakeFiles/vdot.dir/vdot.cpp.o
[ 97%] Linking CXX executable ../../bin/vdot
[ 97%] Built target vdot
[ 98%] Building CXX object pocs/vdot/CMakeFiles/q8dot.dir/q8dot.cpp.o
[100%] Linking CXX executable ../../bin/q8dot
[100%] Built target q8dot
abetlen commented 1 year ago

@jzavala-gonzalez thanks for providing that, it looks identical to the issue @primemp is facing; I don't think it's caused by llama.cpp, because that compiles correctly. Can you try following the development instructions in the README to see if that works?

primemp commented 1 year ago

@abetlen Unfortunately I've tried that a few times, but without success. I created a new environment, even got miniforge3 and created a new environment with it. I tried with Python 3.10 and 3.11. What do you think is going wrong? There must be people out there with similar system setups who got this piece of art running!


abetlen commented 1 year ago

@primemp I'm honestly at a bit of a loss as to where the root of this issue is; obviously the CMake find_package(Threads) check is failing, but it's unclear why it succeeds inside of llama.cpp.

If you've cloned the llama-cpp-python repo, can you try removing these lines from the root CMakeLists.txt file and rebuilding

    if (APPLE)
        # Need to disable these llama.cpp flags on Apple
        # otherwise users may encounter invalid instruction errors
        set(LLAMA_AVX "Off" CACHE BOOL "llama: enable AVX" FORCE)
        set(LLAMA_AVX2 "Off" CACHE BOOL "llama: enable AVX2" FORCE)
        set(LLAMA_FMA "Off" CACHE BOOL "llama: enable FMA" FORCE)
        set(LLAMA_F16C "Off" CACHE BOOL "llama: enable F16C" FORCE)
        set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -march=native -mtune=native")
        set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=native -mtune=native")
    endif()

It shouldn't be related, as it just disables some hardware accelerations that were failing in CI, but you never know.

jzavala-gonzalez commented 1 year ago

I think this might be related! Following the dev instructions still gives the same error, but doing so after removing those lines appears to work successfully.


Here is the log from after commenting out that whole block:

❯ pip install -e .
Obtaining file:///Users/gabriel/Documents/GitHub/Repos/abetlen/llama-cpp-python
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Installing backend dependencies ... done
  Preparing editable metadata (pyproject.toml) ... done
Requirement already satisfied: typing-extensions>=4.5.0 in /Users/gabriel/anaconda3/envs/cabildo-extract/lib/python3.11/site-packages (from llama_cpp_python==0.2.4) (4.7.1)
Requirement already satisfied: numpy>=1.20.0 in /Users/gabriel/anaconda3/envs/cabildo-extract/lib/python3.11/site-packages (from llama_cpp_python==0.2.4) (1.25.2)
Collecting diskcache>=5.6.1 (from llama_cpp_python==0.2.4)
  Obtaining dependency information for diskcache>=5.6.1 from https://files.pythonhosted.org/packages/3f/27/4570e78fc0bf5ea0ca45eb1de3818a23787af9b390c0b0a0033a1b8236f9/diskcache-5.6.3-py3-none-any.whl.metadata
  Using cached diskcache-5.6.3-py3-none-any.whl.metadata (20 kB)
Using cached diskcache-5.6.3-py3-none-any.whl (45 kB)
Building wheels for collected packages: llama_cpp_python
  Building editable for llama_cpp_python (pyproject.toml) ... done
  Created wheel for llama_cpp_python: filename=llama_cpp_python-0.2.4-cp311-cp311-macosx_12_0_arm64.whl size=783349 sha256=959bdf597bb6c1927ebf098febc9d3c393069e701c5b9a5bacc998add251dd60
  Stored in directory: /private/var/folders/ts/_0b3z_qn7sd72nfffjkydm8h0000gn/T/pip-ephem-wheel-cache-_5uc4lxg/wheels/c3/7e/26/56f285bcabdedcf4b9a95a571bbfd43a52a306c01ca92fee82
Successfully built llama_cpp_python
Installing collected packages: diskcache, llama_cpp_python
Successfully installed diskcache-5.6.3 llama_cpp_python-0.2.4
jzavala-gonzalez commented 1 year ago

Just rebuilt it 6 times, and it's the last two lines (individually) that crash the build: the ones that set the C and CXX flags to -march=native -mtune=native.
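
For what it's worth, you can reproduce the failure outside of CMake. Here's a minimal sketch (assuming cc on your PATH is Apple clang, as in this build), which just checks whether the compiler accepts the flag those lines were injecting:

    import os
    import subprocess
    import tempfile

    # Write a trivial C file and try to compile it with -march=native,
    # the flag the removed CMake lines were injecting into every compile.
    with tempfile.NamedTemporaryFile(suffix=".c", delete=False) as src:
        src.write(b"int main(void) { return 0; }\n")

    result = subprocess.run(
        ["cc", "-march=native", src.name, "-o", src.name + ".out"],
        capture_output=True,
        text=True,
    )
    print("accepted" if result.returncode == 0 else result.stderr.strip())

    os.unlink(src.name)
    if os.path.exists(src.name + ".out"):
        os.unlink(src.name + ".out")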

abetlen commented 1 year ago

@jzavala-gonzalez wooooh haha okay, now we're getting somewhere. Thank you! For now I'll guard that block so it's not set for arm64 Apple computers.
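
For anyone following along, a quick way to confirm whether your Python itself is an arm64 build (the case the guard needs to cover) is:

    import platform

    # 'arm64' means a native Apple Silicon build of Python;
    # 'x86_64' means an Intel build (or one running under Rosetta).
    print(platform.machine())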

abetlen commented 1 year ago

@jzavala-gonzalez @primemp pushed a fix in v0.2.5, can you test that out?
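
A minimal smoke test after upgrading (this just confirms the wheel built and the compiled shared library loads; nothing project-specific is assumed):

    from importlib.metadata import version

    # Importing llama_cpp forces the compiled shared library to load,
    # so a clean import is a good sign the wheel built correctly.
    import llama_cpp

    print(version("llama_cpp_python"))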

InTheZ commented 1 year ago

Not the OP, but your fix worked on my Mac, and it wasn't working an hour ago.

remixer-dec commented 1 year ago

Successfully built llama-cpp-python
Successfully installed llama-cpp-python-0.2.5

Generation also works fine, thanks for the quick fix!

jzavala-gonzalez commented 1 year ago

@jzavala-gonzalez @primemp pushed a fix in v0.2.5 can you test that out?

Looks to be working. Thank you!!

uogbuji commented 1 year ago

I'm seeing an error that I think is closer to @icecoldt369's, which seems to be an upstream one. I didn't expect v0.2.5 to help, and indeed it doesn't. Even though this other one is an upstream issue, is it worth also tracking here?

abetlen commented 1 year ago

@uogbuji yes, could you open a new issue with the error log and a reference to the upstream issue just so I can track them separately. I think @icecoldt369 opened one up already in llama.cpp

primemp commented 1 year ago

@abetlen This worked like a charm! Ran the following command:

    CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install -U git+https://github.com/abetlen/llama-cpp-python.git --no-cache-dir

I'm amazed by the amount of time and effort you put in, which led to this fast solution! It's very much appreciated! Cheers

primemp commented 1 year ago

Forgot to close the issue. Thanks again!

abetlen commented 1 year ago

@primemp no problem, happy to help here. I made some recent changes to the build system, so I wanted to crush the resulting bugs as soon as possible.

uogbuji commented 1 year ago

@uogbuji yes, could you open a new issue with the error log and a reference to the upstream issue just so I can track them separately. I think @icecoldt369 opened one up already in llama.cpp

Just saw this. I'll have a look.

uogbuji commented 1 year ago

Actually, it now builds for me, and instead I'm dealing with pydantic / fastapi DLL hell…

terfani commented 1 year ago

@abetlen I went through all of the above but still get the same error, both when installing llama.cpp and llama-cpp-python. I also went through the developer installation and removed the Apple-specific block, since I use macOS (not an M1 Mac but an Intel one), but I still get the same error. The main part of the output, I guess, is:

    [4/13] /Library/Developer/CommandLineTools/usr/bin/cc -DACCELERATE_LAPACK_ILP64 -DACCELERATE_NEW_LAPACK -DGGML_USE_ACCELERATE -DGGML_USE_K_QUANTS -DGGML_USE_METAL -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE=600 -I/Users/llama-cpp-python/vendor/llama.cpp/. -F/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/System/Library/Frameworks -O3 -DNDEBUG -std=gnu11 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -fPIC -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Werror=implicit-function-declaration -Wdouble-promotion -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wunreachable-code-break -Wunreachable-code-return -march=native -MD -MT vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-metal.m.o -MF vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-metal.m.o.d -o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-metal.m.o -c /Users/llama-cpp-python/vendor/llama.cpp/ggml-metal.m
    FAILED: vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-metal.m.o
    /Library/Developer/CommandLineTools/usr/bin/cc -DACCELERATE_LAPACK_ILP64 -DACCELERATE_NEW_LAPACK -DGGML_USE_ACCELERATE -DGGML_USE_K_QUANTS -DGGML_USE_METAL -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE=600 -I/Users//llama-cpp-python/vendor/llama.cpp/. -F/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/System/Library/Frameworks -O3 -DNDEBUG -std=gnu11 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -fPIC -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Werror=implicit-function-declaration -Wdouble-promotion -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wunreachable-code-break -Wunreachable-code-return -march=native -MD -MT vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-metal.m.o -MF vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-metal.m.o.d -o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-metal.m.o -c /Users//llama-cpp-python/vendor/llama.cpp/ggml-metal.m
    /Users//llama-cpp-python/vendor/llama.cpp/ggml-metal.m:659:5: error: use of undeclared identifier 'MTLComputePassDescriptor'
        MTLComputePassDescriptor * edesc = MTLComputePassDescriptor.computePassDescriptor;
        ^
    /Users//llama-cpp-python/vendor/llama.cpp/ggml-metal.m:659:32: error: use of undeclared identifier 'edesc'
        MTLComputePassDescriptor * edesc = MTLComputePassDescriptor.computePassDescriptor;
        ^
    /Users//llama-cpp-python/vendor/llama.cpp/ggml-metal.m:659:40: error: use of undeclared identifier 'MTLComputePassDescriptor'
        MTLComputePassDescriptor * edesc = MTLComputePassDescriptor.computePassDescriptor;
        ^
    /Users//llama-cpp-python/vendor/llama.cpp/ggml-metal.m:664:5: error: use of undeclared identifier 'edesc'
        edesc.dispatchType = has_concur ? MTLDispatchTypeConcurrent : MTLDispatchTypeSerial;
        ^
    /Users//llama-cpp-python/vendor/llama.cpp/ggml-metal.m:677:61: warning: instance method '-computeCommandEncoderWithDescriptor:' not found (return type defaults to 'id') [-Wobjc-method-access]
        ctx->command_encoders[i] = [ctx->command_buffers[i] computeCommandEncoderWithDescriptor: edesc];
        ^~~~~~~~~~~
    /Users//llama-cpp-python/vendor/llama.cpp/ggml-metal.m:677:98: error: use of undeclared identifier 'edesc'
        ctx->command_encoders[i] = [ctx->command_buffers[i] computeCommandEncoderWithDescriptor: edesc];
        ^
    /Users//llama-cpp-python/vendor/llama.cpp/ggml-metal.m:959:61: error: use of undeclared identifier 'MTLGPUFamilyApple7'
        [ctx->device supportsFamily:MTLGPUFamilyApple7] &&
        ^
    1 warning and 6 errors generated.

dattalldood commented 1 year ago

I was having a similar issue to @icecoldt369's:

error: use of undeclared identifier 'MTLGPUFamilyApple7'
[ctx->device supportsFamily:MTLGPUFamilyApple7] &&

I was able to build and use llama.cpp normally after removing those references. I hard-coded them to be on, since I know my processor is new enough to support the operations (M2 MacBook Air).

I don't know enough about C++ to propose a fix, but I think the issue is that, for some reason, running make doesn't pick up whatever header defines MTLGPUFamilyApple7.

https://gist.github.com/dattalldood/7048a4c9ed64ba54a32e118d3ddfca47
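
One thing worth checking: the failing log above is compiling against MacOSX10.15.sdk, and as far as I can tell MTLComputePassDescriptor and MTLGPUFamilyApple7 only appear in the macOS 11+ SDK headers, so an older SDK can't compile ggml-metal.m. A quick way to compare your OS version with the SDK your command line tools actually use (a sketch; assumes xcrun is installed):

    import platform
    import subprocess

    # Compare the running macOS version with the SDK the CommandLineTools expose.
    print("macOS:", platform.mac_ver()[0])
    print("SDK:  ", subprocess.run(
        ["xcrun", "--show-sdk-version"],
        capture_output=True, text=True,
    ).stdout.strip())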

jim-bo commented 1 year ago

I have the same issue as well, on a 2019 Mac with an Intel i7:

error: use of undeclared identifier 'MTLGPUFamilyApple7'
          [ctx->device supportsFamily:MTLGPUFamilyApple7] &&
nktice commented 4 months ago

I had issues too, and found this thread while searching, so I thought I'd post...

This repository has submodules that need to be checked out, so when cloning with git, use the following command to get everything:

git clone --recurse-submodules https://github.com/abetlen/llama-cpp-python.git

I was then able to install it as follows, with the issues resolved:

cd llama-cpp-python
pip install . 

Depending on your use case you may want other options, but that worked for me.
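
If you're not sure whether the submodule actually came down, here's a minimal check (illustrative only; assumes you run it from the directory containing the clone):

    import pathlib

    # llama.cpp is vendored as a git submodule; an empty directory here
    # usually means --recurse-submodules was missed during cloning.
    vendor = pathlib.Path("llama-cpp-python/vendor/llama.cpp")
    if vendor.is_dir() and any(vendor.iterdir()):
        print("submodule populated")
    else:
        print("submodule missing - run: git submodule update --init --recursive")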

HyunsuYu commented 4 months ago

In my case, I fixed the issue by installing the CUDA Toolkit and adding CUDA_PATH to my environment variables.

HyunsuYu commented 4 months ago

#208 says that line 55 of the file llama-cpp-python/llama_cpp/llama_cpp.py, which loads the llama.cpp DLL, trips over a ctypes bug found on Windows. In fact, it is easy to see that there is a problem with that line by looking at the log during installation:

return ctypes.CDLL(str(_lib_path))

The writer says that this problem is solved by modifying the code above as follows:

ctypes.CDLL(str(_lib_path), winmode=0)

However, in my own experience, this alone did not solve it. If you look at the rest of the code in question, you can see that CUDA_PATH is referenced as an environment variable when the function runs on the Windows platform:

import ctypes
import os
import pathlib
import sys
from typing import List

def _load_shared_library(lib_base_name: str):
    # Construct the paths to the possible shared library names
    _base_path = pathlib.Path(os.path.abspath(os.path.dirname(__file__)))
    # Searching for the library in the current directory under the name "libllama" (default name
    # for llamacpp) and "llama" (default name for this repo)
    _lib_paths: List[pathlib.Path] = []
    # Determine the file extension based on the platform
    if sys.platform.startswith("linux"):
        _lib_paths += [
            _base_path / f"lib{lib_base_name}.so",
        ]
    elif sys.platform == "darwin":
        _lib_paths += [
            _base_path / f"lib{lib_base_name}.so",
            _base_path / f"lib{lib_base_name}.dylib",
        ]
    elif sys.platform == "win32":
        _lib_paths += [
            _base_path / f"{lib_base_name}.dll",
        ]
    else:
        raise RuntimeError("Unsupported platform")

    if "LLAMA_CPP_LIB" in os.environ:
        lib_base_name = os.environ["LLAMA_CPP_LIB"]
        _lib = pathlib.Path(lib_base_name)
        _base_path = _lib.parent.resolve()
        _lib_paths = [_lib.resolve()]

    cdll_args = dict()  # type: ignore
    # Add the library directory to the DLL search path on Windows (if needed)
    if sys.platform == "win32" and sys.version_info >= (3, 8):
        os.add_dll_directory(str(_base_path))
        if "CUDA_PATH" in os.environ:
            os.add_dll_directory(os.path.join(os.environ["CUDA_PATH"], "bin"))
            os.add_dll_directory(os.path.join(os.environ["CUDA_PATH"], "lib"))
        cdll_args["winmode"] = ctypes.RTLD_GLOBAL

    # Try to load the shared library, handling potential errors
    for _lib_path in _lib_paths:
        if _lib_path.exists():
            try:
                return ctypes.CDLL(str(_lib_path), **cdll_args)
            except Exception as e:
                raise RuntimeError(f"Failed to load shared library '{_lib_path}': {e}")

    raise FileNotFoundError(
        f"Shared library with base name '{lib_base_name}' not found"
    )

So I installed the CUDA Toolkit so that CUDA_PATH would be added to the environment variables normally:

https://developer.nvidia.com/cuda-downloads

After doing this, llama-cpp-python installed normally in my case. You can also install it through the .whl files on the Releases page.
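
To confirm the variable is actually visible to Python before reinstalling, a minimal check (illustrative only):

    import os
    import pathlib

    # _load_shared_library (above) adds CUDA_PATH\bin and CUDA_PATH\lib to the
    # DLL search path on Windows, so both should exist after installing the toolkit.
    cuda_path = os.environ.get("CUDA_PATH")
    if cuda_path is None:
        print("CUDA_PATH is not set")
    else:
        for sub in ("bin", "lib"):
            p = pathlib.Path(cuda_path) / sub
            print(p, "exists" if p.is_dir() else "MISSING")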