Francesco215 opened this issue 1 year ago
Also, if you want, I could make a pull request that adds a Docker container for AMD folks with everything pre-installed, along with some instructions.
@Francesco215 thanks for reporting this, looks like a llama.cpp linker error when building it as a shared library with the new ROCm support.
[10/12] Linking CXX shared library vendor/llama.cpp/libggml_shared.so
FAILED: vendor/llama.cpp/libggml_shared.so
: && /opt/rocm/llvm/bin/clang++ -fPIC -O3 -DNDEBUG -shared -Wl,-soname,libggml_shared.so -o vendor/llama.cpp/libggml_shared.so vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o -Wl,-rpath,/opt/rocm-5.6.0/lib:/opt/rocm/lib: --hip-link --offload-arch=gfx900 --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx1030 /opt/rocm-5.6.0/lib/libhipblas.so.1.0.50600 /opt/rocm-5.6.0/lib/librocblas.so.3.0.50600 /opt/rocm-5.6.0/llvm/lib/clang/16.0.0/lib/linux/libclang_rt.builtins-x86_64.a /opt/rocm/lib/libamdhip64.so.5.6.50600 && :
ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
>>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
ld.lld: error: relocation R_X86_64_32 cannot be used against symbol '__gxx_personality_v0'; recompile with -fPIC
>>> defined in /usr/lib/gcc/x86_64-linux-gnu/11/libstdc++.so
>>> referenced by ggml-cuda.cu
>>> vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x7343)
Unfortunately I don't have an AMD card with ROCm support on hand so I can't contribute much more than pointing you in that direction. I would try adding
if (BUILD_SHARED_LIBS)
    set_target_properties(ggml-rocm PROPERTIES POSITION_INDEPENDENT_CODE ON)
endif()
inside this section https://github.com/ggerganov/llama.cpp/blob/44c117f41ee01c5ac8fb86bba041f08d8b87b46d/CMakeLists.txt#L373
and see if that works.
@Francesco215 also yes would really appreciate that Docker image!
Just here to confirm that this solves the problem. I was having the same issue trying to run on my 6800 XT (also through a ROCm container) and, after changing the CMakeLists.txt the way you suggested, llama.cpp uses the GPU through ROCm as expected:
llm_load_tensors: ggml ctx size = 0.12 MB
llm_load_tensors: using ROCm for GPU acceleration
llm_load_tensors: mem required = 107.54 MB (+ 400.00 MB per state)
llm_load_tensors: offloading 40 repeating layers to GPU
llm_load_tensors: offloading non-repeating layers to GPU
llm_load_tensors: offloading v cache to GPU
llm_load_tensors: offloading k cache to GPU
llm_load_tensors: offloaded 43/43 layers to GPU
llm_load_tensors: VRAM used: 9095 MB
Not sure if I did the procedure correctly, but for me the problem is still present.
The procedure I did was this:
CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install .
The error it gives me is this:
/llama-cpp-python# CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install .
Processing /llama-cpp-python
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Collecting diskcache>=5.6.1
Using cached diskcache-5.6.3-py3-none-any.whl (45 kB)
Requirement already satisfied: typing-extensions>=4.5.0 in /usr/local/lib/python3.10/dist-packages (from llama-cpp-python==0.1.83) (4.7.1)
Requirement already satisfied: numpy>=1.20.0 in /usr/local/lib/python3.10/dist-packages (from llama-cpp-python==0.1.83) (1.25.2)
Building wheels for collected packages: llama-cpp-python
Building wheel for llama-cpp-python (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [105 lines of output]
--------------------------------------------------------------------------------
-- Trying 'Ninja' generator
--------------------------------
---------------------------
----------------------
-----------------
------------
-------
--
CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required):
Compatibility with CMake < 3.5 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
Not searching for unused variables given on the command line.
-- The C compiler identification is Clang 16.0.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/rocm/llvm/bin/clang - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- The CXX compiler identification is Clang 16.0.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/rocm/llvm/bin/clang++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done (0.4s)
-- Generating done (0.0s)
-- Build files have been written to: /llama-cpp-python/_cmake_test_compile/build
--
-------
------------
-----------------
----------------------
---------------------------
--------------------------------
-- Trying 'Ninja' generator - success
--------------------------------------------------------------------------------
Configuring Project
Working directory:
/llama-cpp-python/_skbuild/linux-x86_64-3.10/cmake-build
Command:
/tmp/pip-build-env-0wgu91bg/overlay/local/lib/python3.10/dist-packages/cmake/data/bin/cmake /llama-cpp-python -G Ninja -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-0wgu91bg/overlay/local/lib/python3.10/dist-packages/ninja/data/bin/ninja --no-warn-unused-cli -DCMAKE_INSTALL_PREFIX:PATH=/llama-cpp-python/_skbuild/linux-x86_64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.12 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-0wgu91bg/overlay/local/lib/python3.10/dist-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/usr/bin/python3 -DPYTHON_INCLUDE_DIR:PATH=/usr/include/python3.10 -DPython_EXECUTABLE:PATH=/usr/bin/python3 -DPython_ROOT_DIR:PATH=/usr -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/usr/include/python3.10 -DPython3_EXECUTABLE:PATH=/usr/bin/python3 -DPython3_ROOT_DIR:PATH=/usr -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/usr/include/python3.10 -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-0wgu91bg/overlay/local/lib/python3.10/dist-packages/ninja/data/bin/ninja -DLLAMA_HIPBLAS=on -DCMAKE_BUILD_TYPE:STRING=Release -DLLAMA_HIPBLAS=on
Not searching for unused variables given on the command line.
-- Found Git: /usr/bin/git (found version "2.34.1")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
CMake Deprecation Warning at /opt/rocm/lib/cmake/hip/hip-config.cmake:20 (cmake_minimum_required):
Compatibility with CMake < 3.5 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
Call Stack (most recent call first):
vendor/llama.cpp/CMakeLists.txt:366 (find_package)
-- hip::amdhip64 is SHARED_LIBRARY
-- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS
-- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS - Success
CMake Deprecation Warning at /opt/rocm/lib/cmake/hip/hip-config.cmake:20 (cmake_minimum_required):
Compatibility with CMake < 3.5 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
Call Stack (most recent call first):
/tmp/pip-build-env-0wgu91bg/overlay/local/lib/python3.10/dist-packages/cmake/data/share/cmake-3.27/Modules/CMakeFindDependencyMacro.cmake:76 (find_package)
/opt/rocm/lib/cmake/hipblas/hipblas-config.cmake:90 (find_dependency)
vendor/llama.cpp/CMakeLists.txt:367 (find_package)
-- hip::amdhip64 is SHARED_LIBRARY
-- HIP and hipBLAS found
CMake Error at vendor/llama.cpp/CMakeLists.txt:374 (set_target_properties):
set_target_properties Can not find target to add properties to: ggml-rocm
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring incomplete, errors occurred!
Traceback (most recent call last):
File "/tmp/pip-build-env-0wgu91bg/overlay/local/lib/python3.10/dist-packages/skbuild/setuptools_wrap.py", line 666, in setup
env = cmkr.configure(
File "/tmp/pip-build-env-0wgu91bg/overlay/local/lib/python3.10/dist-packages/skbuild/cmaker.py", line 357, in configure
raise SKBuildError(msg)
An error occurred while configuring with CMake.
Command:
/tmp/pip-build-env-0wgu91bg/overlay/local/lib/python3.10/dist-packages/cmake/data/bin/cmake /llama-cpp-python -G Ninja -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-0wgu91bg/overlay/local/lib/python3.10/dist-packages/ninja/data/bin/ninja --no-warn-unused-cli -DCMAKE_INSTALL_PREFIX:PATH=/llama-cpp-python/_skbuild/linux-x86_64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.12 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-0wgu91bg/overlay/local/lib/python3.10/dist-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/usr/bin/python3 -DPYTHON_INCLUDE_DIR:PATH=/usr/include/python3.10 -DPython_EXECUTABLE:PATH=/usr/bin/python3 -DPython_ROOT_DIR:PATH=/usr -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/usr/include/python3.10 -DPython3_EXECUTABLE:PATH=/usr/bin/python3 -DPython3_ROOT_DIR:PATH=/usr -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/usr/include/python3.10 -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-0wgu91bg/overlay/local/lib/python3.10/dist-packages/ninja/data/bin/ninja -DLLAMA_HIPBLAS=on -DCMAKE_BUILD_TYPE:STRING=Release -DLLAMA_HIPBLAS=on
Source directory:
/llama-cpp-python
Working directory:
/llama-cpp-python/_skbuild/linux-x86_64-3.10/cmake-build
Please see CMake's output for more information.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
If I just run
pip install .
everything goes fine, but the code from that plain
pip install .
doesn't work. I did modify the vendor/llama.cpp/CMakeLists.txt file. Did I miss something?
As far as the Docker container is concerned, I'm having a weird problem.
Basically, if I spin up a container from the Docker image written at the start of the issue, and then, once it is running, I run
pip install llama-cpp-python
everything is fine.
If, on the other hand, I put
RUN pip install llama-cpp-python
as the last line of the Dockerfile, it doesn't work.
@davysson does the same thing happen to you?
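In shell terms the contrast is roughly this (just a sketch; <rocm-base-image> is a placeholder for whatever ROCm base image the Dockerfile uses):

# Works: start a container by hand, then install inside it
docker run -it --device=/dev/kfd --device=/dev/dri <rocm-base-image> bash
# ...and inside that running container:
pip install llama-cpp-python    # succeeds

# Fails: the very same install baked into the image, i.e. the Dockerfile ends with
#   RUN pip install llama-cpp-python
docker build -t llama-cpp-python-container .
# (--device only applies to docker run, not docker build, so /dev/kfd and
# /dev/dri are not visible while the image is being built; not sure whether
# that is what matters here)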
@Francesco215 you need to add the change after the "add_library" (line 373), otherwise CMake won't find the library.
It should be like this:
...
add_library(ggml-rocm OBJECT ggml-cuda.cu ggml-cuda.h)
if (BUILD_SHARED_LIBS)
    set_target_properties(ggml-rocm PROPERTIES POSITION_INDEPENDENT_CODE ON)
endif()
...
About the container, I didn't have any problem; the only difference is that I use rocm/rocm-terminal:latest. If it helps, here are my Dockerfile and install script.
I am struggling with this too. Unfortunately I can't view that linked Dockerfile or install script (404).
I followed the instructions here but trying to generate anything gives me this error:
CUDA error 98 at [...]/llama-cpp-python/vendor/llama.cpp/ggml-cuda.cu:6046: invalid device function
Am I doing something wrong?
@Francesco215 I got the PR merged with the fix in llama.cpp and it's now in 0.2.6
Thanks!
@Francesco215 can I close this up then?
My issue is not related to docker, but I've been using this and #695 as a guideline. I haven't been able to build the wheel.
I'm running into issues building for AMD GPUs as well when using the flag, and decided to take a look at llama.cpp to see if I could isolate the issue.
I'm beginning to think this is not a llama-cpp-python issue at all.
I'm working on it and if I can find a solution, I'll most likely post to the AMD GPU thread for llama.cpp instead because I think it's out of scope.
A good litmus test might be to compile llama.cpp in isolation and see if the same error is generated once again (this is what I did and I was able to duplicate the issue).
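Roughly something along these lines (an untested sketch; the compiler paths and the gfx target are assumptions, adjust them for your card):

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
# point CMake at ROCm's clang, the same compiler llama-cpp-python needs
CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ \
  cmake -B build -DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS=gfx1030 -DBUILD_SHARED_LIBS=ON
cmake --build build --config Release
# if the same "recompile with -fPIC" linker error shows up here too,
# the problem is in llama.cpp itself rather than in llama-cpp-python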
I have a problem with the installation too. I have ROCm 5.7 installed and Python 3.10. I can build llama.cpp for ROCm on its own (without Python), but not the Python version.
CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
Collecting llama-cpp-python
Using cached llama_cpp_python-0.2.11.tar.gz (3.6 MB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Installing backend dependencies ... done
Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: typing-extensions>=4.5.0 in /home/user/rocm/lib/python3.10/site-packages (from llama-cpp-python) (4.8.0)
Requirement already satisfied: numpy>=1.20.0 in /home/user/rocm/lib/python3.10/site-packages (from llama-cpp-python) (1.24.0)
Requirement already satisfied: diskcache>=5.6.1 in /home/user/rocm/lib/python3.10/site-packages (from llama-cpp-python) (5.6.3)
Building wheels for collected packages: llama-cpp-python
Building wheel for llama-cpp-python (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [114 lines of output]
scikit-build-core 0.5.1 using CMake 3.27.6 (wheel)
Configuring CMake...
loading initial cache file /tmp/tmpzfkz647x/build/CMakeInit.txt
-- The C compiler identification is GNU 11.4.0
-- The CXX compiler identification is GNU 11.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.34.1")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
CMake Warning at vendor/llama.cpp/CMakeLists.txt:380 (message):
Only LLVM is supported for HIP, hint: CC=/opt/rocm/llvm/bin/clang
CMake Warning at vendor/llama.cpp/CMakeLists.txt:383 (message):
Only LLVM is supported for HIP, hint: CXX=/opt/rocm/llvm/bin/clang++
CMake Deprecation Warning at /opt/rocm/lib/cmake/hip/hip-config.cmake:20 (cmake_minimum_required):
Compatibility with CMake < 3.5 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
Call Stack (most recent call first):
vendor/llama.cpp/CMakeLists.txt:386 (find_package)
CMake Deprecation Warning at /opt/rocm/lib/cmake/hip/hip-config-amd.cmake:21 (cmake_minimum_required):
Compatibility with CMake < 3.5 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
Call Stack (most recent call first):
/opt/rocm/lib/cmake/hip/hip-config.cmake:150 (include)
vendor/llama.cpp/CMakeLists.txt:386 (find_package)
-- hip::amdhip64 is SHARED_LIBRARY
-- /usr/bin/c++: CLANGRT compiler options not supported.
CMake Deprecation Warning at /opt/rocm/lib/cmake/hip/hip-config.cmake:20 (cmake_minimum_required):
Compatibility with CMake < 3.5 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
Call Stack (most recent call first):
/tmp/pip-build-env-ltem499n/normal/lib/python3.10/site-packages/cmake/data/share/cmake-3.27/Modules/CMakeFindDependencyMacro.cmake:76 (find_package)
/opt/rocm/lib/cmake/hipblas/hipblas-config.cmake:90 (find_dependency)
vendor/llama.cpp/CMakeLists.txt:387 (find_package)
CMake Deprecation Warning at /opt/rocm/lib/cmake/hip/hip-config-amd.cmake:21 (cmake_minimum_required):
Compatibility with CMake < 3.5 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
Call Stack (most recent call first):
/opt/rocm/lib/cmake/hip/hip-config.cmake:150 (include)
/tmp/pip-build-env-ltem499n/normal/lib/python3.10/site-packages/cmake/data/share/cmake-3.27/Modules/CMakeFindDependencyMacro.cmake:76 (find_package)
/opt/rocm/lib/cmake/hipblas/hipblas-config.cmake:90 (find_dependency)
vendor/llama.cpp/CMakeLists.txt:387 (find_package)
-- hip::amdhip64 is SHARED_LIBRARY
-- /usr/bin/c++: CLANGRT compiler options not supported.
-- HIP and hipBLAS found
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
CMake Warning (dev) at CMakeLists.txt:18 (install):
Target llama has PUBLIC_HEADER files but no PUBLIC_HEADER DESTINATION.
This warning is for project developers. Use -Wno-dev to suppress it.
CMake Warning (dev) at CMakeLists.txt:27 (install):
Target llama has PUBLIC_HEADER files but no PUBLIC_HEADER DESTINATION.
This warning is for project developers. Use -Wno-dev to suppress it.
-- Configuring done (0.3s)
-- Generating done (0.0s)
-- Build files have been written to: /tmp/tmpzfkz647x/build
*** Building project with Ninja...
Change Dir: '/tmp/tmpzfkz647x/build'
Run Build Command(s): /tmp/pip-build-env-ltem499n/normal/lib/python3.10/site-packages/ninja/data/bin/ninja -v
[1/13] /usr/bin/c++ -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DGGML_USE_CUBLAS -DGGML_USE_HIPBLAS -DGGML_USE_K_QUANTS -DK_QUANTS_PER_ITERATION=2 -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -isystem /opt/rocm/include -isystem /opt/rocm-5.7.0/include -O3 -DNDEBUG -std=gnu++11 -fPIC -x hip -MD -MT vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o -MF vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o.d -o vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o -c /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/ggml-cuda.cu
FAILED: vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
/usr/bin/c++ -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DGGML_USE_CUBLAS -DGGML_USE_HIPBLAS -DGGML_USE_K_QUANTS -DK_QUANTS_PER_ITERATION=2 -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -isystem /opt/rocm/include -isystem /opt/rocm-5.7.0/include -O3 -DNDEBUG -std=gnu++11 -fPIC -x hip -MD -MT vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o -MF vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o.d -o vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o -c /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/ggml-cuda.cu
c++: error: language hip not recognized
c++: error: language hip not recognized
[2/13] cd /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp && /tmp/pip-build-env-ltem499n/normal/lib/python3.10/site-packages/cmake/data/bin/cmake -DMSVC= -DCMAKE_C_COMPILER_VERSION=11.4.0 -DCMAKE_C_COMPILER_ID=GNU -DCMAKE_VS_PLATFORM_NAME= -DCMAKE_C_COMPILER=/usr/bin/cc -P /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/scripts/build-info.cmake
-- Found Git: /usr/bin/git (found version "2.34.1")
[3/13] /usr/bin/cc -DGGML_USE_CUBLAS -DGGML_USE_HIPBLAS -DGGML_USE_K_QUANTS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/. -isystem /opt/rocm/include -isystem /opt/rocm-5.7.0/include -O3 -DNDEBUG -std=gnu11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Werror=implicit-function-declaration -Wdouble-promotion -mf16c -mfma -mavx -mavx2 -MD -MT vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o -MF vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o.d -o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o -c /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/ggml-alloc.c
[4/13] /usr/bin/c++ -DGGML_USE_CUBLAS -DGGML_USE_HIPBLAS -DGGML_USE_K_QUANTS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/common/. -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/. -O3 -DNDEBUG -std=gnu++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wmissing-declarations -Wmissing-noreturn -Wno-array-bounds -Wno-format-truncation -Wextra-semi -mf16c -mfma -mavx -mavx2 -MD -MT vendor/llama.cpp/common/CMakeFiles/common.dir/console.cpp.o -MF vendor/llama.cpp/common/CMakeFiles/common.dir/console.cpp.o.d -o vendor/llama.cpp/common/CMakeFiles/common.dir/console.cpp.o -c /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/common/console.cpp
[5/13] /usr/bin/c++ -DGGML_USE_CUBLAS -DGGML_USE_HIPBLAS -DGGML_USE_K_QUANTS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/common/. -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/. -O3 -DNDEBUG -std=gnu++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wmissing-declarations -Wmissing-noreturn -Wno-array-bounds -Wno-format-truncation -Wextra-semi -mf16c -mfma -mavx -mavx2 -MD -MT vendor/llama.cpp/common/CMakeFiles/common.dir/grammar-parser.cpp.o -MF vendor/llama.cpp/common/CMakeFiles/common.dir/grammar-parser.cpp.o.d -o vendor/llama.cpp/common/CMakeFiles/common.dir/grammar-parser.cpp.o -c /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/common/grammar-parser.cpp
[6/13] /usr/bin/cc -DGGML_USE_CUBLAS -DGGML_USE_HIPBLAS -DGGML_USE_K_QUANTS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/. -isystem /opt/rocm/include -isystem /opt/rocm-5.7.0/include -O3 -DNDEBUG -std=gnu11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Werror=implicit-function-declaration -Wdouble-promotion -mf16c -mfma -mavx -mavx2 -MD -MT vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o -MF vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o.d -o vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o -c /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/k_quants.c
[7/13] /usr/bin/c++ -DGGML_USE_CUBLAS -DGGML_USE_HIPBLAS -DGGML_USE_K_QUANTS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/common/. -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/. -O3 -DNDEBUG -std=gnu++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wmissing-declarations -Wmissing-noreturn -Wno-array-bounds -Wno-format-truncation -Wextra-semi -mf16c -mfma -mavx -mavx2 -MD -MT vendor/llama.cpp/common/CMakeFiles/common.dir/train.cpp.o -MF vendor/llama.cpp/common/CMakeFiles/common.dir/train.cpp.o.d -o vendor/llama.cpp/common/CMakeFiles/common.dir/train.cpp.o -c /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/common/train.cpp
[8/13] /usr/bin/c++ -DGGML_USE_CUBLAS -DGGML_USE_HIPBLAS -DGGML_USE_K_QUANTS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/common/. -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/. -O3 -DNDEBUG -std=gnu++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wmissing-declarations -Wmissing-noreturn -Wno-array-bounds -Wno-format-truncation -Wextra-semi -mf16c -mfma -mavx -mavx2 -MD -MT vendor/llama.cpp/common/CMakeFiles/common.dir/common.cpp.o -MF vendor/llama.cpp/common/CMakeFiles/common.dir/common.cpp.o.d -o vendor/llama.cpp/common/CMakeFiles/common.dir/common.cpp.o -c /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/common/common.cpp
[9/13] /usr/bin/cc -DGGML_USE_CUBLAS -DGGML_USE_HIPBLAS -DGGML_USE_K_QUANTS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/. -isystem /opt/rocm/include -isystem /opt/rocm-5.7.0/include -O3 -DNDEBUG -std=gnu11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Werror=implicit-function-declaration -Wdouble-promotion -mf16c -mfma -mavx -mavx2 -MD -MT vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o -MF vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o.d -o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o -c /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/ggml.c
[10/13] /usr/bin/c++ -DGGML_USE_CUBLAS -DGGML_USE_HIPBLAS -DGGML_USE_K_QUANTS -DLLAMA_BUILD -DLLAMA_SHARED -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -Dllama_EXPORTS -I/tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/. -isystem /opt/rocm/include -isystem /opt/rocm-5.7.0/include -O3 -DNDEBUG -std=gnu++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wmissing-declarations -Wmissing-noreturn -Wno-array-bounds -Wno-format-truncation -Wextra-semi -mf16c -mfma -mavx -mavx2 -MD -MT vendor/llama.cpp/CMakeFiles/llama.dir/llama.cpp.o -MF vendor/llama.cpp/CMakeFiles/llama.dir/llama.cpp.o.d -o vendor/llama.cpp/CMakeFiles/llama.dir/llama.cpp.o -c /tmp/pip-install-_x9a148g/llama-cpp-python_07818dae4b08428abdb523d8d3aa67c2/vendor/llama.cpp/llama.cpp
ninja: build stopped: subcommand failed.
*** CMake build failed
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
@DaniDD Try looking into issue #695. Similar reports there as well. They're definitely related.
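For what it's worth, the CMake warning in that log already hints at a likely cause: the HIP source is being compiled with /usr/bin/c++ (GCC), which doesn't understand -x hip. Pointing the build at ROCm's clang, roughly like this, might help (untested sketch):

CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ \
  CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 \
  pip install llama-cpp-python --no-cache-dir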
I have a similar but slightly different problem. I'm running ROCm 6.0 on Ubuntu 22 and I installed PyTorch with ROCm 5.7. I updated the above install command for llama-cpp-python with the correct references for ROCm 6.0 and a 7900 XT GPU (gfx1100):
CMAKE_ARGS="-D LLAMA_HIPBLAS=ON -D CMAKE_C_COMPILER=/opt/rocm/bin/amdclang -D CMAKE_CXX_COMPILER=/opt/rocm/bin/amdclang++ -D CMAKE_PREFIX_PATH=/opt/rocm -D AMDGPU_TARGETS=gfx1100" FORCE_CMAKE=1 pip install llama-cpp-python==0.2.29 --upgrade --force-reinstall --no-cache-dir
This resulted in the following error:
ERROR: Failed building wheel for llama-cpp-python
which is caused by this underlying error:
error: unable to find library -lstdc++
I fixed it by installing the latest g++ libraries with
sudo apt install libstdc++-12-dev
sudo apt install libstdc++-12-doc
Expected Behavior
I have a machine with an AMD GPU (Radeon RX 7900 XT). I tried to install this library as written in the README by running
CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
Current Behavior
The installation fails, however when I simply run
pip install llama-cpp-python
it works.
Environment and Context
To make the issue reproducible, I made a Docker container with this Dockerfile (adapted from the llama.cpp repo).
System Info:
CPU: 13th Gen Intel(R) Core(TM) i5-13400F
GPU: Radeon RX 7900 XT
Ubuntu 22.04.1
Python 3.10.6
Make 4.3
g++ 11.3.0
Failure Information (for bugs)
The installation failed; here is the output when running
CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
For reference, here is what happens when I simply run
pip install llama-cpp-python
After installation with this second method, the code runs as expected and it utilizes the GPU.
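A quick way to see from the load log whether layers are actually being offloaded via ROCm (a sketch only; the model path and layer count are placeholders):

python3 -c "from llama_cpp import Llama; Llama(model_path='/models/model.gguf', n_gpu_layers=43, verbose=True)"
# if ROCm offload is active, the load log includes a line like:
#   llm_load_tensors: using ROCm for GPU acceleration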
Steps to Reproduce
Make sure you have an AMD GPU
docker build --pull --rm -f "Dockerfile" -t llama-cpp-python-container:latest .
docker run -it --device=/dev/kfd --device=/dev/dri llama-cpp-python-container bash
CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
and
pip install llama-cpp-python
Failure Logs
Environment info
I'm not sure where to get the llama.cpp version.