Closed melindmi closed 8 months ago
In case someone else encounters the same issue, the problem was caused by an nvcc version incompatible with the GPU driver version.
When installing with pip install ctransformers[cuda],
precompiled libs built for CUDA 12.2 are used, but in my case I needed CUDA version 12.0.
If I instead built from source with CT_CUBLAS=1 pip install ctransformers --no-binary ctransformers,
the CUDA compiler path defaulted to /usr/bin/, which in my case contained an older version of nvcc.
The solution was to install the right CUDA version in a different path and then install ctransformers with:
CMAKE_ARGS="-DCMAKE_CUDA_COMPILER=/path_to_cuda/bin/nvcc" CT_CUBLAS=1 pip install ctransformers --no-binary ctransformers
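The root cause above is a toolkit newer than what the driver supports: PTX compiled by a newer nvcc cannot be loaded by an older driver. A minimal sketch of that check, using the version numbers from this thread (the comparison rule is an assumption based on this error, not an official NVIDIA compatibility table):

```python
# Sketch: check whether the CUDA toolkit used to build a library is likely
# compatible with the CUDA version the driver reports (as shown by nvidia-smi).
# The rule "toolkit must not be newer than the driver's CUDA version" is an
# assumption matching the 'unsupported toolchain' error, not an official table.

def parse_version(v: str) -> tuple:
    """Turn a version string like '12.0' into a comparable tuple (12, 0)."""
    return tuple(int(part) for part in v.split("."))

def toolkit_compatible(toolkit_cuda: str, driver_cuda: str) -> bool:
    """True if the toolkit version does not exceed the driver's CUDA version."""
    return parse_version(toolkit_cuda) <= parse_version(driver_cuda)

# Driver 525.125.06 reports CUDA 12.0; the prebuilt wheels target CUDA 12.2.
print(toolkit_compatible("12.2", "12.0"))  # False -> triggers CUDA error 222
print(toolkit_compatible("12.0", "12.0"))  # True  -> matching toolkit works
```

This is why pointing CMAKE_CUDA_COMPILER at an nvcc whose version matches (or is older than) the driver's reported CUDA version resolves the error.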
Hi, I am trying to use the llama-2-70b-chat.Q5_K_M.gguf model with ctransformers on GPU but I get this error:
CUDA error 222 at /home/runner/work/ctransformers/ctransformers/models/ggml/ggml-cuda.cu:6045: the provided PTX was compiled with an unsupported toolchain.
My torch version is '2.1.0+cu121' and the GPU Driver Version: 525.125.06, supporting CUDA Version: 12.0. The code:
llm = AutoModelForCausalLM.from_pretrained("../llama", model_file="llama-2-13b-chat.q5_K_M.gguf", model_type="llama", gpu_layers=50, temperature=1, context_length=4096)
Can someone suggest a fix for this?