abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License
8.1k stars 964 forks

Installation via pip fails for ROCm / AMD cards #646

Open Francesco215 opened 1 year ago

Francesco215 commented 1 year ago

Expected Behavior

I have a machine with an AMD GPU (Radeon RX 7900 XT). I tried to install this library as described in the README by running

CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python

Current Behavior

The installation fails; however, when I simply run pip install llama-cpp-python, it works.

Environment and Context

To make the issue reproducible, I made a Docker container with this Dockerfile (adapted from the llama.cpp repo):

ARG UBUNTU_VERSION=22.04

# This needs to generally match the container host's environment.
ARG ROCM_VERSION=5.6

# Target the ROCm build image
ARG BASE_ROCM_DEV_CONTAINER=rocm/dev-ubuntu-${UBUNTU_VERSION}:${ROCM_VERSION}-complete

FROM ${BASE_ROCM_DEV_CONTAINER} as build

# Unless otherwise specified, we make a fat build.
# List from https://github.com/ggerganov/llama.cpp/pull/1087#issuecomment-1682807878
# This is mostly tied to rocBLAS supported archs.
ARG ROCM_DOCKER_ARCH=\
    gfx803 \
    gfx900 \
    gfx906 \
    gfx908 \
    gfx90a \
    gfx1010 \
    gfx1030 \
    gfx1100 \
    # this is my rocm arch
    gfx1101 \
    gfx1102

# Set GPU architecture targets
ENV GPU_TARGETS=${ROCM_DOCKER_ARCH}
# Enable ROCm
ENV CC=/opt/rocm/llvm/bin/clang
ENV CXX=/opt/rocm/llvm/bin/clang++
ENV LLAMA_HIPBLAS=on

RUN apt-get update && apt-get -y install cmake protobuf-compiler aria2 git

System Info:

CPU: 13th Gen Intel(R) Core(TM) i5-13400F
GPU: Radeon RX 7900 XT
OS: Ubuntu 22.04.1
Python 3.10.6, Make 4.3, g++ 11.3.0

Failure Information (for bugs)

The installation failed; here is the output when running CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python:

root@8bebff5da3f1:/# CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
Collecting llama-cpp-python
  Downloading llama_cpp_python-0.1.82.tar.gz (1.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 3.0 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: typing-extensions>=4.5.0 in /usr/local/lib/python3.10/dist-packages (from llama-cpp-python) (4.7.1)
Requirement already satisfied: numpy>=1.20.0 in /usr/local/lib/python3.10/dist-packages (from llama-cpp-python) (1.25.2)
Collecting diskcache>=5.6.1
  Downloading diskcache-5.6.1-py3-none-any.whl (45 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.6/45.6 KB 7.1 MB/s eta 0:00:00
Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [390 lines of output]

      --------------------------------------------------------------------------------
      -- Trying 'Ninja' generator
      --------------------------------
      ---------------------------
      ----------------------
      -----------------
      ------------
      -------
      --
      CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required):
        Compatibility with CMake < 3.5 will be removed from a future version of
        CMake.

        Update the VERSION argument <min> value or use a ...<max> suffix to tell
        CMake that the project does not need compatibility with older versions.

      Not searching for unused variables given on the command line.

      -- The C compiler identification is Clang 16.0.0
      -- Detecting C compiler ABI info
      -- Detecting C compiler ABI info - done
      -- Check for working C compiler: /opt/rocm/llvm/bin/clang - skipped
      -- Detecting C compile features
      -- Detecting C compile features - done
      -- The CXX compiler identification is Clang 16.0.0
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: /opt/rocm/llvm/bin/clang++ - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      -- Configuring done (0.6s)
      -- Generating done (0.0s)
      -- Build files have been written to: /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/_cmake_test_compile/build
      --
      -------
      ------------
      -----------------
      ----------------------
      ---------------------------
      --------------------------------
      -- Trying 'Ninja' generator - success
      --------------------------------------------------------------------------------

      Configuring Project
        Working directory:
          /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/_skbuild/linux-x86_64-3.10/cmake-build
        Command:
          /tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/cmake/data/bin/cmake /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888 -G Ninja -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/ninja/data/bin/ninja --no-warn-unused-cli -DCMAKE_INSTALL_PREFIX:PATH=/tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/_skbuild/linux-x86_64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.6 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/usr/bin/python3 -DPYTHON_INCLUDE_DIR:PATH=/usr/include/python3.10 -DPython_EXECUTABLE:PATH=/usr/bin/python3 -DPython_ROOT_DIR:PATH=/usr -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/usr/include/python3.10 -DPython3_EXECUTABLE:PATH=/usr/bin/python3 -DPython3_ROOT_DIR:PATH=/usr -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/usr/include/python3.10 -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/ninja/data/bin/ninja -DLLAMA_HIPBLAS=on -DCMAKE_BUILD_TYPE:STRING=Release -DLLAMA_HIPBLAS=on

      Not searching for unused variables given on the command line.
      -- The C compiler identification is Clang 16.0.0
      -- The CXX compiler identification is Clang 16.0.0
      -- Detecting C compiler ABI info
      -- Detecting C compiler ABI info - done
      -- Check for working C compiler: /opt/rocm/llvm/bin/clang - skipped
      -- Detecting C compile features
      -- Detecting C compile features - done
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: /opt/rocm/llvm/bin/clang++ - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      -- Found Git: /usr/bin/git (found version "2.34.1")
      fatal: not a git repository (or any of the parent directories): .git
      fatal: not a git repository (or any of the parent directories): .git
      CMake Warning at vendor/llama.cpp/CMakeLists.txt:118 (message):
        Git repository not found; to enable automatic generation of build info,
        make sure Git is installed and the project is a Git repository.

      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
      -- Found Threads: TRUE
      CMake Deprecation Warning at /opt/rocm/lib/cmake/hip/hip-config.cmake:20 (cmake_minimum_required):
        Compatibility with CMake < 3.5 will be removed from a future version of
        CMake.

        Update the VERSION argument <min> value or use a ...<max> suffix to tell
        CMake that the project does not need compatibility with older versions.
      Call Stack (most recent call first):
        vendor/llama.cpp/CMakeLists.txt:366 (find_package)

      -- hip::amdhip64 is SHARED_LIBRARY
      -- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS
      -- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS - Success
      CMake Deprecation Warning at /opt/rocm/lib/cmake/hip/hip-config.cmake:20 (cmake_minimum_required):
        Compatibility with CMake < 3.5 will be removed from a future version of
        CMake.

        Update the VERSION argument <min> value or use a ...<max> suffix to tell
        CMake that the project does not need compatibility with older versions.
      Call Stack (most recent call first):
        /tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/cmake/data/share/cmake-3.27/Modules/CMakeFindDependencyMacro.cmake:76 (find_package)
        /opt/rocm/lib/cmake/hipblas/hipblas-config.cmake:90 (find_dependency)
        vendor/llama.cpp/CMakeLists.txt:367 (find_package)

      -- hip::amdhip64 is SHARED_LIBRARY
      -- HIP and hipBLAS found
      -- CMAKE_SYSTEM_PROCESSOR: x86_64
      -- x86 detected
      -- Configuring done (0.6s)
      -- Generating done (0.0s)
      -- Build files have been written to: /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/_skbuild/linux-x86_64-3.10/cmake-build
      [1/12] Building C object vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o
      [2/12] Building CXX object vendor/llama.cpp/common/CMakeFiles/common.dir/console.cpp.o
      [3/12] Building CXX object vendor/llama.cpp/common/CMakeFiles/common.dir/grammar-parser.cpp.o
      [4/12] Building C object vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/k_quants.c:186:11: warning: variable 'sum_x' set but not used [-Wunused-but-set-variable]
          float sum_x = 0;
                ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/k_quants.c:187:11: warning: variable 'sum_x2' set but not used [-Wunused-but-set-variable]
          float sum_x2 = 0;
                ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/k_quants.c:182:14: warning: unused function 'make_qkx1_quants' [-Wunused-function]
      static float make_qkx1_quants(int n, int nmax, const float * restrict x, uint8_t * restrict L, float * restrict the_min,
                   ^
      3 warnings generated.
      [5/12] Building CXX object vendor/llama.cpp/common/CMakeFiles/common.dir/common.cpp.o
      [6/12] Building C object vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:2413:5: warning: implicit conversion increases floating-point precision: 'float' to 'ggml_float' (aka 'double') [-Wdouble-promotion]
          GGML_F16_VEC_REDUCE(sumf, sum);
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:2045:37: note: expanded from macro 'GGML_F16_VEC_REDUCE'
      #define GGML_F16_VEC_REDUCE         GGML_F32Cx8_REDUCE
                                          ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:2035:33: note: expanded from macro 'GGML_F32Cx8_REDUCE'
      #define GGML_F32Cx8_REDUCE      GGML_F32x8_REDUCE
                                      ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:1981:11: note: expanded from macro 'GGML_F32x8_REDUCE'
          res = _mm_cvtss_f32(_mm_hadd_ps(t1, t1));                     \
              ~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:3456:9: warning: implicit conversion increases floating-point precision: 'float' to 'ggml_float' (aka 'double') [-Wdouble-promotion]
              GGML_F16_VEC_REDUCE(sumf[k], sum[k]);
              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:2045:37: note: expanded from macro 'GGML_F16_VEC_REDUCE'
      #define GGML_F16_VEC_REDUCE         GGML_F32Cx8_REDUCE
                                          ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:2035:33: note: expanded from macro 'GGML_F32Cx8_REDUCE'
      #define GGML_F32Cx8_REDUCE      GGML_F32x8_REDUCE
                                      ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:1981:11: note: expanded from macro 'GGML_F32x8_REDUCE'
          res = _mm_cvtss_f32(_mm_hadd_ps(t1, t1));                     \
              ~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:596:23: warning: unused function 'mul_sum_i8_pairs' [-Wunused-function]
      static inline __m128i mul_sum_i8_pairs(const __m128i x, const __m128i y) {
                            ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:627:19: warning: unused function 'hsum_i32_4' [-Wunused-function]
      static inline int hsum_i32_4(const __m128i a) {
                        ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:692:23: warning: unused function 'packNibbles' [-Wunused-function]
      static inline __m128i packNibbles( __m256i bytes )
                            ^
      5 warnings generated.
      [7/12] Linking C static library vendor/llama.cpp/libggml_static.a
      [8/12] Building CXX object vendor/llama.cpp/CMakeFiles/llama.dir/llama.cpp.o
      [9/12] Building CXX object vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      [10/12] Linking CXX shared library vendor/llama.cpp/libggml_shared.so
      FAILED: vendor/llama.cpp/libggml_shared.so
      : && /opt/rocm/llvm/bin/clang++ -fPIC -O3 -DNDEBUG   -shared -Wl,-soname,libggml_shared.so -o vendor/llama.cpp/libggml_shared.so vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o  -Wl,-rpath,/opt/rocm-5.6.0/lib:/opt/rocm/lib:  --hip-link  --offload-arch=gfx900  --offload-arch=gfx906  --offload-arch=gfx908  --offload-arch=gfx90a  --offload-arch=gfx1030  /opt/rocm-5.6.0/lib/libhipblas.so.1.0.50600  /opt/rocm-5.6.0/lib/librocblas.so.3.0.50600  /opt/rocm-5.6.0/llvm/lib/clang/16.0.0/lib/linux/libclang_rt.builtins-x86_64.a  /opt/rocm/lib/libamdhip64.so.5.6.50600 && :
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32 cannot be used against symbol '__gxx_personality_v0'; recompile with -fPIC
      >>> defined in /usr/lib/gcc/x86_64-linux-gnu/11/libstdc++.so
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x7343)

      ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
      >>> defined in /lib/x86_64-linux-gnu/libc.so.6
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x7361)

      ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
      >>> defined in /lib/x86_64-linux-gnu/libc.so.6
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x7399)

      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
      >>> defined in /lib/x86_64-linux-gnu/libc.so.6
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: too many errors emitted, stopping now (use --error-limit=0 to see all errors)
      clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
      [11/12] Linking CXX shared library vendor/llama.cpp/libllama.so
      FAILED: vendor/llama.cpp/libllama.so
      : && /opt/rocm/llvm/bin/clang++ -fPIC -O3 -DNDEBUG   -shared -Wl,-soname,libllama.so -o vendor/llama.cpp/libllama.so vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o vendor/llama.cpp/CMakeFiles/llama.dir/llama.cpp.o  -Wl,-rpath,/opt/rocm-5.6.0/lib:/opt/rocm/lib:  --hip-link  --offload-arch=gfx900  --offload-arch=gfx906  --offload-arch=gfx908  --offload-arch=gfx90a  --offload-arch=gfx1030  /opt/rocm-5.6.0/lib/libhipblas.so.1.0.50600  /opt/rocm-5.6.0/lib/librocblas.so.3.0.50600  /opt/rocm-5.6.0/llvm/lib/clang/16.0.0/lib/linux/libclang_rt.builtins-x86_64.a  /opt/rocm/lib/libamdhip64.so.5.6.50600 && :
      ld.lld: error: relocation R_X86_64_32 cannot be used against symbol '__gxx_personality_v0'; recompile with -fPIC
      >>> defined in /usr/lib/gcc/x86_64-linux-gnu/11/libstdc++.so
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x9E3F)

      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x9E5D)

      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x9E95)

      ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
      >>> defined in /lib/x86_64-linux-gnu/libc.so.6
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
      >>> defined in /lib/x86_64-linux-gnu/libc.so.6
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
      >>> defined in /lib/x86_64-linux-gnu/libc.so.6
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: too many errors emitted, stopping now (use --error-limit=0 to see all errors)
      clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
      ninja: build stopped: subcommand failed.
      Traceback (most recent call last):
        File "/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/skbuild/setuptools_wrap.py", line 674, in setup
          cmkr.make(make_args, install_target=cmake_install_target, env=env)
        File "/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/skbuild/cmaker.py", line 697, in make
          self.make_impl(clargs=clargs, config=config, source_dir=source_dir, install_target=install_target, env=env)
        File "/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/skbuild/cmaker.py", line 742, in make_impl
          raise SKBuildError(msg)

      An error occurred while building with CMake.
        Command:
          /tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/cmake/data/bin/cmake --build . --target install --config Release --
        Install target:
          install
        Source directory:
          /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888
        Working directory:
          /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/_skbuild/linux-x86_64-3.10/cmake-build
      Please check the install target is valid and see CMake's output for more information.

      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects

For reference, here is what happens when I simply run pip install llama-cpp-python:

pip install llama-cpp-python
Collecting llama-cpp-python
  Using cached llama_cpp_python-0.1.82.tar.gz (1.8 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: numpy>=1.20.0 in /usr/local/lib/python3.10/dist-packages (from llama-cpp-python) (1.25.2)
Collecting diskcache>=5.6.1
  Using cached diskcache-5.6.1-py3-none-any.whl (45 kB)
Requirement already satisfied: typing-extensions>=4.5.0 in /usr/local/lib/python3.10/dist-packages (from llama-cpp-python) (4.7.1)
Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... done
  Created wheel for llama-cpp-python: filename=llama_cpp_python-0.1.82-cp310-cp310-linux_x86_64.whl size=593844 sha256=5523b29af1720e7931b4ca3caee8ebb65b502a8640db4f1e6a633eb7d444dff5
  Stored in directory: /root/.cache/pip/wheels/d5/5a/02/e3a3e540045da967de35d1ac2220a194e26e57b120bb46b466
Successfully built llama-cpp-python
Installing collected packages: diskcache, llama-cpp-python
Successfully installed diskcache-5.6.1 llama-cpp-python-0.1.82
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

After installation with this second method the code runs as expected and it utilizes the GPU.
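
For anyone who wants to double-check the GPU path after installing, here is a minimal sketch (the model path is a placeholder; n_gpu_layers just needs to be greater than 0, and the llm_load_tensors lines printed at load time show whether layers were actually offloaded):

from llama_cpp import Llama

# Placeholder model path; point this at any quantized model you have locally.
llm = Llama(model_path="./models/your-model.bin", n_gpu_layers=35)

# If the build picked up ROCm, the startup log reports offloaded layers and VRAM use.
print(llm("Q: What does ROCm stand for? A:", max_tokens=32))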

Steps to Reproduce

Make sure you have an AMD GPU

  1. Build the Docker image from the Dockerfile above: docker build --pull --rm -f "Dockerfile" -t llama-cpp-python-container:latest .
  2. Run it: docker run -it --device=/dev/kfd --device=/dev/dri llama-cpp-python-container bash
  3. Try the two installation methods, CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python and pip install llama-cpp-python (consolidated as a shell session below)
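
For convenience, the same reproduction as a single shell session (a sketch; image name and device flags as in the steps above):

docker build --pull --rm -f "Dockerfile" -t llama-cpp-python-container:latest .
docker run -it --device=/dev/kfd --device=/dev/dri llama-cpp-python-container bash

# inside the container:
CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python   # fails
pip install llama-cpp-python                                                 # succeeds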

Failure Logs

Environment info

llama-cpp-python$ pip list | egrep "uvicorn|fastapi|sse-starlette|numpy"
numpy              1.25.2

I'm not sure where to get the llama.cpp version.

Francesco215 commented 1 year ago

Also, if you want, I could make a pull request that adds a Docker container for AMD users with everything pre-installed, along with some instructions.

abetlen commented 1 year ago

@Francesco215 thanks for reporting this, looks like a llama.cpp linker error when building it as a shared library with the new ROCm support.

      [10/12] Linking CXX shared library vendor/llama.cpp/libggml_shared.so
      FAILED: vendor/llama.cpp/libggml_shared.so
      : && /opt/rocm/llvm/bin/clang++ -fPIC -O3 -DNDEBUG   -shared -Wl,-soname,libggml_shared.so -o vendor/llama.cpp/libggml_shared.so vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o  -Wl,-rpath,/opt/rocm-5.6.0/lib:/opt/rocm/lib:  --hip-link  --offload-arch=gfx900  --offload-arch=gfx906  --offload-arch=gfx908  --offload-arch=gfx90a  --offload-arch=gfx1030  /opt/rocm-5.6.0/lib/libhipblas.so.1.0.50600  /opt/rocm-5.6.0/lib/librocblas.so.3.0.50600  /opt/rocm-5.6.0/llvm/lib/clang/16.0.0/lib/linux/libclang_rt.builtins-x86_64.a  /opt/rocm/lib/libamdhip64.so.5.6.50600 && :
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)

      ld.lld: error: relocation R_X86_64_32 cannot be used against symbol '__gxx_personality_v0'; recompile with -fPIC
      >>> defined in /usr/lib/gcc/x86_64-linux-gnu/11/libstdc++.so
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x7343)

Unfortunately I don't have an AMD card with ROCm support on hand so I can't contribute much more than pointing you in that direction. I would try adding

if (BUILD_SHARED_LIBS)
    set_target_properties(ggml-rocm PROPERTIES POSITION_INDEPENDENT_CODE ON)
endif()

inside this section https://github.com/ggerganov/llama.cpp/blob/44c117f41ee01c5ac8fb86bba041f08d8b87b46d/CMakeLists.txt#L373

and see if that works.
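
One way to test that change directly against llama.cpp, without going through pip, is something like this (a sketch; it assumes ROCm's clang under /opt/rocm and that the shared-library build is what triggers the link error):

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
# apply the POSITION_INDEPENDENT_CODE change to CMakeLists.txt, then configure and build:
CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ \
  cmake -B build -DLLAMA_HIPBLAS=on -DBUILD_SHARED_LIBS=ON
cmake --build build --config Release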

abetlen commented 1 year ago

@Francesco215 also yes would really appreciate that Docker image!

davysson commented 1 year ago

Unfortunately I don't have an AMD card with ROCm support on hand so I can't contribute much more than pointing you in that direction. I would try adding

if (BUILD_SHARED_LIBS)
    set_target_properties(ggml-rocm PROPERTIES POSITION_INDEPENDENT_CODE ON)
endif()

inside this section https://github.com/ggerganov/llama.cpp/blob/44c117f41ee01c5ac8fb86bba041f08d8b87b46d/CMakeLists.txt#L373

and see if that works.

Just here to confirm that this solves the problem. I was having the same issue trying to run on my 6800 XT (also through a ROCm container) and, after changing the CMakeLists.txt the way you suggested, llama.cpp uses the GPU through ROCm as expected:

llm_load_tensors: ggml ctx size =    0.12 MB
llm_load_tensors: using ROCm for GPU acceleration
llm_load_tensors: mem required  =  107.54 MB (+  400.00 MB per state)
llm_load_tensors: offloading 40 repeating layers to GPU
llm_load_tensors: offloading non-repeating layers to GPU
llm_load_tensors: offloading v cache to GPU
llm_load_tensors: offloading k cache to GPU
llm_load_tensors: offloaded 43/43 layers to GPU
llm_load_tensors: VRAM used: 9095 MB

Francesco215 commented 1 year ago

Installation stuff

Not sure if I did the procedure correctly, but for me the problem is still present.

The procedure I did was this:

  1. Clone this repo
  2. Go to the vendor folder and clone the llama.cpp repo
  3. Edit as @abetlen suggested
  4. Return to the llama-cpp-python main folder
  5. Run CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install . (sketched as a single shell session below)
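
Roughly, the whole procedure as one shell session (a sketch of the steps above; the edit in step 3 is the CMakeLists.txt change suggested earlier):

git clone https://github.com/abetlen/llama-cpp-python
cd llama-cpp-python/vendor
git clone https://github.com/ggerganov/llama.cpp
# edit vendor/llama.cpp/CMakeLists.txt as suggested above, then:
cd ..
CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install .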

The error it gives me is this:

/llama-cpp-python# CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install .               
Processing /llama-cpp-python
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting diskcache>=5.6.1
  Using cached diskcache-5.6.3-py3-none-any.whl (45 kB)
Requirement already satisfied: typing-extensions>=4.5.0 in /usr/local/lib/python3.10/dist-packages (from llama-cpp-python==0.1.83) (4.7.1)
Requirement already satisfied: numpy>=1.20.0 in /usr/local/lib/python3.10/dist-packages (from llama-cpp-python==0.1.83) (1.25.2)
Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [105 lines of output]

      --------------------------------------------------------------------------------
      -- Trying 'Ninja' generator
      --------------------------------
      ---------------------------
      ----------------------
      -----------------
      ------------
      -------
      --
      CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required):
        Compatibility with CMake < 3.5 will be removed from a future version of
        CMake.

        Update the VERSION argument <min> value or use a ...<max> suffix to tell
        CMake that the project does not need compatibility with older versions.

      Not searching for unused variables given on the command line.

      -- The C compiler identification is Clang 16.0.0
      -- Detecting C compiler ABI info
      -- Detecting C compiler ABI info - done
      -- Check for working C compiler: /opt/rocm/llvm/bin/clang - skipped
      -- Detecting C compile features
      -- Detecting C compile features - done
      -- The CXX compiler identification is Clang 16.0.0
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: /opt/rocm/llvm/bin/clang++ - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      -- Configuring done (0.4s)
      -- Generating done (0.0s)
      -- Build files have been written to: /llama-cpp-python/_cmake_test_compile/build
      --
      -------
      ------------
      -----------------
      ----------------------
      ---------------------------
      --------------------------------
      -- Trying 'Ninja' generator - success
      --------------------------------------------------------------------------------

      Configuring Project
        Working directory:
          /llama-cpp-python/_skbuild/linux-x86_64-3.10/cmake-build
        Command:
          /tmp/pip-build-env-0wgu91bg/overlay/local/lib/python3.10/dist-packages/cmake/data/bin/cmake /llama-cpp-python -G Ninja -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-0wgu91bg/overlay/local/lib/python3.10/dist-packages/ninja/data/bin/ninja --no-warn-unused-cli -DCMAKE_INSTALL_PREFIX:PATH=/llama-cpp-python/_skbuild/linux-x86_64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.12 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-0wgu91bg/overlay/local/lib/python3.10/dist-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/usr/bin/python3 -DPYTHON_INCLUDE_DIR:PATH=/usr/include/python3.10 -DPython_EXECUTABLE:PATH=/usr/bin/python3 -DPython_ROOT_DIR:PATH=/usr -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/usr/include/python3.10 -DPython3_EXECUTABLE:PATH=/usr/bin/python3 -DPython3_ROOT_DIR:PATH=/usr -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/usr/include/python3.10 -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-0wgu91bg/overlay/local/lib/python3.10/dist-packages/ninja/data/bin/ninja -DLLAMA_HIPBLAS=on -DCMAKE_BUILD_TYPE:STRING=Release -DLLAMA_HIPBLAS=on

      Not searching for unused variables given on the command line.
      -- Found Git: /usr/bin/git (found version "2.34.1")
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
      -- Found Threads: TRUE
      CMake Deprecation Warning at /opt/rocm/lib/cmake/hip/hip-config.cmake:20 (cmake_minimum_required):
        Compatibility with CMake < 3.5 will be removed from a future version of
        CMake.

        Update the VERSION argument <min> value or use a ...<max> suffix to tell
        CMake that the project does not need compatibility with older versions.
      Call Stack (most recent call first):
        vendor/llama.cpp/CMakeLists.txt:366 (find_package)

      -- hip::amdhip64 is SHARED_LIBRARY
      -- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS
      -- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS - Success
      CMake Deprecation Warning at /opt/rocm/lib/cmake/hip/hip-config.cmake:20 (cmake_minimum_required):
        Compatibility with CMake < 3.5 will be removed from a future version of
        CMake.

        Update the VERSION argument <min> value or use a ...<max> suffix to tell
        CMake that the project does not need compatibility with older versions.
      Call Stack (most recent call first):
        /tmp/pip-build-env-0wgu91bg/overlay/local/lib/python3.10/dist-packages/cmake/data/share/cmake-3.27/Modules/CMakeFindDependencyMacro.cmake:76 (find_package)
        /opt/rocm/lib/cmake/hipblas/hipblas-config.cmake:90 (find_dependency)
        vendor/llama.cpp/CMakeLists.txt:367 (find_package)

      -- hip::amdhip64 is SHARED_LIBRARY
      -- HIP and hipBLAS found
      CMake Error at vendor/llama.cpp/CMakeLists.txt:374 (set_target_properties):
        set_target_properties Can not find target to add properties to: ggml-rocm

      -- CMAKE_SYSTEM_PROCESSOR: x86_64
      -- x86 detected
      -- Configuring incomplete, errors occurred!
      Traceback (most recent call last):
        File "/tmp/pip-build-env-0wgu91bg/overlay/local/lib/python3.10/dist-packages/skbuild/setuptools_wrap.py", line 666, in setup
          env = cmkr.configure(
        File "/tmp/pip-build-env-0wgu91bg/overlay/local/lib/python3.10/dist-packages/skbuild/cmaker.py", line 357, in configure
          raise SKBuildError(msg)

      An error occurred while configuring with CMake.
        Command:
          /tmp/pip-build-env-0wgu91bg/overlay/local/lib/python3.10/dist-packages/cmake/data/bin/cmake /llama-cpp-python -G Ninja -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-0wgu91bg/overlay/local/lib/python3.10/dist-packages/ninja/data/bin/ninja --no-warn-unused-cli -DCMAKE_INSTALL_PREFIX:PATH=/llama-cpp-python/_skbuild/linux-x86_64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.12 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-0wgu91bg/overlay/local/lib/python3.10/dist-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/usr/bin/python3 -DPYTHON_INCLUDE_DIR:PATH=/usr/include/python3.10 -DPython_EXECUTABLE:PATH=/usr/bin/python3 -DPython_ROOT_DIR:PATH=/usr -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/usr/include/python3.10 -DPython3_EXECUTABLE:PATH=/usr/bin/python3 -DPython3_ROOT_DIR:PATH=/usr -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/usr/include/python3.10 -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-0wgu91bg/overlay/local/lib/python3.10/dist-packages/ninja/data/bin/ninja -DLLAMA_HIPBLAS=on -DCMAKE_BUILD_TYPE:STRING=Release -DLLAMA_HIPBLAS=on
        Source directory:
          /llama-cpp-python
        Working directory:
          /llama-cpp-python/_skbuild/linux-x86_64-3.10/cmake-build
      Please see CMake's output for more information.

      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects

Some extra notes

Did I miss something?

Docker stuff

As far as the docker container is concerned, I'm having a weird problem.

Basically, if I spin up a container from the Docker image written at the start of the issue and then, once inside, run pip install llama-cpp-python, everything is fine.

If, on the other hand, I put RUN pip install llama-cpp-python as the last line of the Dockerfile, it doesn't work.

@davysson does the same thing happen to you?

davysson commented 1 year ago

@Francesco215 you need to add the change after the "add_library" (line 373), otherwise CMake won't find the library.

It should be like this:

...
        add_library(ggml-rocm OBJECT ggml-cuda.cu ggml-cuda.h)
        if (BUILD_SHARED_LIBS)
            set_target_properties(ggml-rocm PROPERTIES POSITION_INDEPENDENT_CODE ON)
        endif()
....

About the container, I didn't have any problems; the only difference is that I use rocm/rocm-terminal:latest. If it helps, here are my Dockerfile and install script.
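
Since links like this tend to go stale, here is a rough, untested sketch of what such a Dockerfile can look like, assuming the rocm/dev-ubuntu base image from earlier in the thread and a gfx1030 card (adjust AMDGPU_TARGETS for your GPU):

FROM rocm/dev-ubuntu-22.04:5.6-complete

# Build with ROCm's clang, as in the Dockerfile at the top of the issue
ENV CC=/opt/rocm/llvm/bin/clang
ENV CXX=/opt/rocm/llvm/bin/clang++

RUN apt-get update && apt-get install -y python3-pip cmake git

# hipBLAS build of llama-cpp-python; gfx1030 is the RX 6800 XT architecture
RUN CMAKE_ARGS="-DLLAMA_HIPBLAS=on -DAMDGPU_TARGETS=gfx1030" FORCE_CMAKE=1 \
    python3 -m pip install llama-cpp-python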

koalaeagle commented 1 year ago

I am struggling with this too. Unfortunately I can't view that linked Dockerfile or install script (404).

Void-025 commented 1 year ago

I followed the instructions here, but trying to generate anything gives me this error: CUDA error 98 at [...]/llama-cpp-python/vendor/llama.cpp/ggml-cuda.cu:6046: invalid device function. Am I doing something wrong?

abetlen commented 1 year ago

@Francesco215 I got the PR merged with the fix in llama.cpp and it's now in 0.2.6

Francesco215 commented 1 year ago

Thanks!

abetlen commented 1 year ago

@Francesco215 can I close this up then?

teleprint-me commented 1 year ago

My issue is not related to docker, but I've been using this and #695 as a guideline. I haven't been able to build the wheel.

I'm running into issues building for an AMD GPU as well when using the build flag, so I decided to take a look at llama.cpp to see if I could isolate the issue.

I'm beginning to think this is not a llama-cpp-python issue at all.

I'm working on it and if I can find a solution, I'll most likely post to the AMD GPU thread for llama.cpp instead because I think it's out of scope.

A good litmus test might be to compile llama.cpp in isolation and see if the same error is generated once again (this is what I did and I was able to duplicate the issue).

DaniDD commented 1 year ago

I have a problem with the installation too. I have ROCm 5.7 installed and Python 3.10. llama.cpp (without Python) I can build for ROCm, but not the Python version.

CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
Collecting llama-cpp-python
  Using cached llama_cpp_python-0.2.11.tar.gz (3.6 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: typing-extensions>=4.5.0 in /home/user/rocm/lib/python3.10/site-packages (from llama-cpp-python) (4.8.0)
Requirement already satisfied: numpy>=1.20.0 in /home/user/rocm/lib/python3.10/site-packages (from llama-cpp-python) (1.24.0)
Requirement already satisfied: diskcache>=5.6.1 in /home/user/rocm/lib/python3.10/site-packages (from llama-cpp-python) (5.6.3)
Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... error
  error: subprocess-exited-with-error

× Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [114 lines of output]
  scikit-build-core 0.5.1 using CMake 3.27.6 (wheel)
  Configuring CMake...
  loading initial cache file /tmp/tmpzfkz647x/build/CMakeInit.txt
  -- The C compiler identification is GNU 11.4.0
  -- The CXX compiler identification is GNU 11.4.0
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: /usr/bin/cc - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: /usr/bin/c++ - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Found Git: /usr/bin/git (found version "2.34.1")
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
  -- Found Threads: TRUE
  CMake Warning at vendor/llama.cpp/CMakeLists.txt:380 (message):
    Only LLVM is supported for HIP, hint: CC=/opt/rocm/llvm/bin/clang

  CMake Warning at vendor/llama.cpp/CMakeLists.txt:383 (message):
    Only LLVM is supported for HIP, hint: CXX=/opt/rocm/llvm/bin/clang++

  CMake Deprecation Warning at /opt/rocm/lib/cmake/hip/hip-config.cmake:20 (cmake_minimum_required):
    Compatibility with CMake < 3.5 will be removed from a future version of
    CMake.

    Update the VERSION argument <min> value or use a ...<max> suffix to tell
    CMake that the project does not need compatibility with older versions.
  Call Stack (most recent call first):
    vendor/llama.cpp/CMakeLists.txt:386 (find_package)

  CMake Deprecation Warning at /opt/rocm/lib/cmake/hip/hip-config-amd.cmake:21 (cmake_minimum_required):
    Compatibility with CMake < 3.5 will be removed from a future version of
    CMake.

    Update the VERSION argument <min> value or use a ...<max> suffix to tell
    CMake that the project does not need compatibility with older versions.
  Call Stack (most recent call first):
    /opt/rocm/lib/cmake/hip/hip-config.cmake:150 (include)
    vendor/llama.cpp/CMakeLists.txt:386 (find_package)

  -- hip::amdhip64 is SHARED_LIBRARY
  -- /usr/bin/c++: CLANGRT compiler options not supported.
  CMake Deprecation Warning at /opt/rocm/lib/cmake/hip/hip-config.cmake:20 (cmake_minimum_required):
    Compatibility with CMake < 3.5 will be removed from a future version of
    CMake.

    Update the VERSION argument <min> value or use a ...<max> suffix to tell
    CMake that the project does not need compatibility with older versions.
  Call Stack (most recent call first):
    /tmp/pip-build-env-ltem499n/normal/lib/python3.10/site-packages/cmake/data/share/cmake-3.27/Modules/CMakeFindDependencyMacro.cmake:76 (find_package)
    /opt/rocm/lib/cmake/hipblas/hipblas-config.cmake:90 (find_dependency)
    vendor/llama.cpp/CMakeLists.txt:387 (find_package)

  CMake Deprecation Warning at /opt/rocm/lib/cmake/hip/hip-config-amd.cmake:21 (cmake_minimum_required):
    Compatibility with CMake < 3.5 will be removed from a future version of
    CMake.

    Update the VERSION argument <min> value or use a ...<max> suffix to tell
    CMake that the project does not need compatibility with older versions.
  Call Stack (most recent call first):
    /opt/rocm/lib/cmake/hip/hip-config.cmake:150 (include)
    /tmp/pip-build-env-ltem499n/normal/lib/python3.10/site-packages/cmake/data/share/cmake-3.27/Modules/CMakeFindDependencyMacro.cmake:76 (find_package)
    /opt/rocm/lib/cmake/hipblas/hipblas-config.cmake:90 (find_dependency)
    vendor/llama.cpp/CMakeLists.txt:387 (find_package)

  -- hip::amdhip64 is SHARED_LIBRARY
  -- /usr/bin/c++: CLANGRT compiler options not supported.
  -- HIP and hipBLAS found
  -- CMAKE_SYSTEM_PROCESSOR: x86_64
  -- x86 detected
  CMake Warning (dev) at CMakeLists.txt:18 (install):
    Target llama has PUBLIC_HEADER files but no PUBLIC_HEADER DESTINATION.
  This warning is for project developers.  Use -Wno-dev to suppress it.

  CMake Warning (dev) at CMakeLists.txt:27 (install):
    Target llama has PUBLIC_HEADER files but no PUBLIC_HEADER DESTINATION.
  This warning is for project developers.  Use -Wno-dev to suppress it.

  -- Configuring done (0.3s)
  -- Generating done (0.0s)
  -- Build files have been written to: /tmp/tmpzfkz647x/build
  *** Building project with Ninja...
  Change Dir: '/tmp/tmpzfkz647x/build'

  Run Build Command(s): /tmp/pip-build-env-ltem499n/normal/lib/python3.10/site-packages/ninja/data/bin/ninja -v
  [1/13] /usr/bin/c++ -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DGGML_USE_CUBLAS -DGGML_USE_HIPBLAS -DGGML_USE_K_QUANTS -DK_QUANTS_PER_ITERATION=2 -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -isystem /opt/rocm/include -isystem /opt/rocm-5.7.0/include -O3 -DNDEBUG -std=gnu++11 -fPIC -x hip -MD -MT vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o -MF vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o.d -o vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o -c /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/ggml-cuda.cu
  FAILED: vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
  /usr/bin/c++ -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DGGML_USE_CUBLAS -DGGML_USE_HIPBLAS -DGGML_USE_K_QUANTS -DK_QUANTS_PER_ITERATION=2 -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -isystem /opt/rocm/include -isystem /opt/rocm-5.7.0/include -O3 -DNDEBUG -std=gnu++11 -fPIC -x hip -MD -MT vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o -MF vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o.d -o vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o -c /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/ggml-cuda.cu
  c++: error: language hip not recognized
  c++: error: language hip not recognized
  [2/13] cd /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp && /tmp/pip-build-env-ltem499n/normal/lib/python3.10/site-packages/cmake/data/bin/cmake -DMSVC= -DCMAKE_C_COMPILER_VERSION=11.4.0 -DCMAKE_C_COMPILER_ID=GNU -DCMAKE_VS_PLATFORM_NAME= -DCMAKE_C_COMPILER=/usr/bin/cc -P /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/scripts/build-info.cmake
  -- Found Git: /usr/bin/git (found version "2.34.1")
  [3/13] /usr/bin/cc -DGGML_USE_CUBLAS -DGGML_USE_HIPBLAS -DGGML_USE_K_QUANTS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/. -isystem /opt/rocm/include -isystem /opt/rocm-5.7.0/include -O3 -DNDEBUG -std=gnu11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Werror=implicit-function-declaration -Wdouble-promotion -mf16c -mfma -mavx -mavx2 -MD -MT vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o -MF vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o.d -o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o -c /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/ggml-alloc.c
  [4/13] /usr/bin/c++ -DGGML_USE_CUBLAS -DGGML_USE_HIPBLAS -DGGML_USE_K_QUANTS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/common/. -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/. -O3 -DNDEBUG -std=gnu++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wmissing-declarations -Wmissing-noreturn -Wno-array-bounds -Wno-format-truncation -Wextra-semi -mf16c -mfma -mavx -mavx2 -MD -MT vendor/llama.cpp/common/CMakeFiles/common.dir/console.cpp.o -MF vendor/llama.cpp/common/CMakeFiles/common.dir/console.cpp.o.d -o vendor/llama.cpp/common/CMakeFiles/common.dir/console.cpp.o -c /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/common/console.cpp
  [5/13] /usr/bin/c++ -DGGML_USE_CUBLAS -DGGML_USE_HIPBLAS -DGGML_USE_K_QUANTS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/common/. -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/. -O3 -DNDEBUG -std=gnu++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wmissing-declarations -Wmissing-noreturn -Wno-array-bounds -Wno-format-truncation -Wextra-semi -mf16c -mfma -mavx -mavx2 -MD -MT vendor/llama.cpp/common/CMakeFiles/common.dir/grammar-parser.cpp.o -MF vendor/llama.cpp/common/CMakeFiles/common.dir/grammar-parser.cpp.o.d -o vendor/llama.cpp/common/CMakeFiles/common.dir/grammar-parser.cpp.o -c /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/common/grammar-parser.cpp
  [6/13] /usr/bin/cc -DGGML_USE_CUBLAS -DGGML_USE_HIPBLAS -DGGML_USE_K_QUANTS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/. -isystem /opt/rocm/include -isystem /opt/rocm-5.7.0/include -O3 -DNDEBUG -std=gnu11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Werror=implicit-function-declaration -Wdouble-promotion -mf16c -mfma -mavx -mavx2 -MD -MT vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o -MF vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o.d -o vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o -c /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/k_quants.c
  [7/13] /usr/bin/c++ -DGGML_USE_CUBLAS -DGGML_USE_HIPBLAS -DGGML_USE_K_QUANTS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/common/. -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/. -O3 -DNDEBUG -std=gnu++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wmissing-declarations -Wmissing-noreturn -Wno-array-bounds -Wno-format-truncation -Wextra-semi -mf16c -mfma -mavx -mavx2 -MD -MT vendor/llama.cpp/common/CMakeFiles/common.dir/train.cpp.o -MF vendor/llama.cpp/common/CMakeFiles/common.dir/train.cpp.o.d -o vendor/llama.cpp/common/CMakeFiles/common.dir/train.cpp.o -c /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/common/train.cpp
  [8/13] /usr/bin/c++ -DGGML_USE_CUBLAS -DGGML_USE_HIPBLAS -DGGML_USE_K_QUANTS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/common/. -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/. -O3 -DNDEBUG -std=gnu++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wmissing-declarations -Wmissing-noreturn -Wno-array-bounds -Wno-format-truncation -Wextra-semi -mf16c -mfma -mavx -mavx2 -MD -MT vendor/llama.cpp/common/CMakeFiles/common.dir/common.cpp.o -MF vendor/llama.cpp/common/CMakeFiles/common.dir/common.cpp.o.d -o vendor/llama.cpp/common/CMakeFiles/common.dir/common.cpp.o -c /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/common/common.cpp
  [9/13] /usr/bin/cc -DGGML_USE_CUBLAS -DGGML_USE_HIPBLAS -DGGML_USE_K_QUANTS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/. -isystem /opt/rocm/include -isystem /opt/rocm-5.7.0/include -O3 -DNDEBUG -std=gnu11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Werror=implicit-function-declaration -Wdouble-promotion -mf16c -mfma -mavx -mavx2 -MD -MT vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o -MF vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o.d -o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o -c /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/ggml.c
  [10/13] /usr/bin/c++ -DGGML_USE_CUBLAS -DGGML_USE_HIPBLAS -DGGML_USE_K_QUANTS -DLLAMA_BUILD -DLLAMA_SHARED -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -Dllama_EXPORTS -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/. -isystem /opt/rocm/include -isystem /opt/rocm-5.7.0/include -O3 -DNDEBUG -std=gnu++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wmissing-declarations -Wmissing-noreturn -Wno-array-bounds -Wno-format-truncation -Wextra-semi -mf16c -mfma -mavx -mavx2 -MD -MT vendor/llama.cpp/CMakeFiles/llama.dir/llama.cpp.o -MF vendor/llama.cpp/CMakeFiles/llama.dir/llama.cpp.o.d -o vendor/llama.cpp/CMakeFiles/llama.dir/llama.cpp.o -c /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/llama.cpp
  ninja: build stopped: subcommand failed.

  *** CMake build failed
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects

teleprint-me commented 1 year ago

@DaniDD Try looking into issue #695. There are similar reports there as well; they're definitely related.

rongxanh88 commented 9 months ago

I have a similar but slightly different problem. I'm running ROCm 6.0 on Ubuntu 22 and I installed PyTorch with ROCm 5.7. I updated the above install command for llama-cpp-python with the correct references for ROCm 6.0 and a 7900 XT GPU (gfx1100):

CMAKE_ARGS="-D LLAMA_HIPBLAS=ON -D CMAKE_C_COMPILER=/opt/rocm/bin/amdclang -D CMAKE_CXX_COMPILER=/opt/rocm/bin/amdclang++ -D CMAKE_PREFIX_PATH=/opt/rocm -D AMDGPU_TARGETS=gfx1100" FORCE_CMAKE=1 pip install llama-cpp-python==0.2.29 --upgrade --force-reinstall --no-cache-dir

This resulted in the following error:

ERROR: Failed building wheel for llama-cpp-python

which is due to this underlying error: error: unable to find library -lstdc++

I fixed it by installing the GCC 12 libstdc++ packages with:

sudo apt install libstdc++-12-dev
sudo apt install libstdc++-12-doc
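
Putting the pieces together, the sequence that worked for this ROCm 6.0 / gfx1100 setup looks roughly like this (a sketch based on the commands above):

sudo apt install libstdc++-12-dev
CMAKE_ARGS="-D LLAMA_HIPBLAS=ON -D CMAKE_C_COMPILER=/opt/rocm/bin/amdclang -D CMAKE_CXX_COMPILER=/opt/rocm/bin/amdclang++ -D CMAKE_PREFIX_PATH=/opt/rocm -D AMDGPU_TARGETS=gfx1100" \
  FORCE_CMAKE=1 pip install llama-cpp-python==0.2.29 --upgrade --force-reinstall --no-cache-dir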