Closed by 9600- 2 months ago
I am able to build llama_cpp_python_cuda-0.2.81
using CMAKE_ARGS="-DLLAVA_BUILD=OFF".
However, model loads fail with the following error in TGW:
Exception: Cannot import 'llama_cpp_cuda' because 'llama_cpp' is already imported. See issue #1575 in llama-cpp-python. Please restart the server before attempting to use a different version of llama-cpp-python.
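For illustration, the kind of guard that produces this error can be sketched as follows (hypothetical function and logic; the actual check inside llama-cpp-python/TGW may differ):

```python
import sys

def import_backend(name: str, conflicting: str):
    """Refuse to load one llama.cpp binding while a conflicting one is resident.

    Python caches extension modules in sys.modules, so two builds of the
    same native library cannot be safely swapped inside one process.
    """
    if conflicting in sys.modules:
        raise RuntimeError(
            f"Cannot import {name!r} because {conflicting!r} is already "
            "imported. Restart the server before switching versions."
        )
    return __import__(name)
```

Because the conflict is detected against the already-populated module cache, the only remedy is the one the error suggests: restart the process.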
I have the same issue with an RTX 4090 card + Ubuntu 20.04 + CUDA 12.1.
@congson1293 what type of CPU are you running?
@9600- The bug exists on both Intel and AMD CPUs.
Prerequisites
Please answer the following questions for yourself before submitting an issue.
Expected Behavior
The GGUF model loads, or a clear error is reported.
Current Behavior
I can build llama_cpp_python_cuda-0.2.81 successfully using CMAKE_ARGS="-DLLAVA_BUILD=OFF", but receive the following error in TGW during model loads. The error message references #1575. Switching back to llama_cpp_python_cuda-0.2.79 resolves the issue.
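To confirm which binding version TGW actually loaded after switching wheels, a small hedged sketch (it assumes the imported module exposes a `__version__` attribute, as llama-cpp-python does):

```python
import importlib

def loaded_backend_version(module_name: str = "llama_cpp") -> str:
    """Return the __version__ of whichever binding is importable, or 'unknown'."""
    mod = importlib.import_module(module_name)
    return getattr(mod, "__version__", "unknown")
```

Calling this right after TGW starts makes it easy to verify the downgrade to 0.2.79 actually took effect.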
Environment and Context
Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.
$ lscpu
Failure Information (for bugs)
Please help provide information about the failure if this is a bug. If it is not a bug, please remove the rest of this template.
Steps to Reproduce
Please provide detailed steps for reproducing the issue. We are not sitting in front of your screen, so the more detail the better.
Illegal instruction (core dumped)
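An `Illegal instruction (core dumped)` crash during load often means the wheel was compiled with CPU extensions (AVX2 or AVX512, for example) that the host CPU lacks. A hedged diagnostic sketch for Linux, reading the kernel's reported feature flags:

```python
from pathlib import Path

def cpu_flags(cpuinfo_path: str = "/proc/cpuinfo") -> set:
    """Parse the 'flags' line of /proc/cpuinfo into a set of feature names."""
    for line in Path(cpuinfo_path).read_text().splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()

# Example: check whether the host supports AVX2 before using an AVX2-built wheel.
# "avx2" in cpu_flags()
```

If a required flag is missing, rebuilding the wheel on the target machine (or with the extension disabled) is the usual fix.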
To see build error:
Failure Logs
When trying to load model:
When trying to build: