marella / ctransformers

Python bindings for Transformer models implemented in C/C++ using the GGML library.

CUDA error 222 at D:\a\ctransformers\ctransformers\models\ggml\ggml-cuda.cu:6045: the provided PTX was compiled with an unsupported toolchain. #165

Open AnhNgDo opened 1 year ago

AnhNgDo commented 1 year ago

Hello, I'm trying to use ctransformers as below:

from ctransformers import AutoModelForCausalLM

# Set gpu_layers to the number of layers to offload to GPU. Set to 0 if no GPU acceleration is available on your system.
llm = AutoModelForCausalLM.from_pretrained("TheBloke/zephyr-7B-alpha-GGUF", model_file="zephyr-7b-alpha.Q4_K_M.gguf", model_type="mistral", gpu_layers=50)

print(llm("AI is going to"))

Got this error: CUDA error 222 at D:\a\ctransformers\ctransformers\models\ggml\ggml-cuda.cu:6045: the provided PTX was compiled with an unsupported toolchain.

Not sure where to go from here :( Any help will be much appreciated!
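Edit: in case it helps anyone, error 222 is cudaErrorUnsupportedPtxVersion, which usually means the installed NVIDIA driver is older than the CUDA toolkit that compiled the PTX shipped in the prebuilt wheel. A minimal sketch to check the driver version, assuming nvidia-smi is on PATH:

import subprocess

# Query the installed NVIDIA driver version.
# CUDA error 222 (cudaErrorUnsupportedPtxVersion) typically means this driver
# is older than the toolkit that compiled the PTX in the wheel.
result = subprocess.run(
    ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
)
print("Driver version:", result.stdout.strip())

# The header of plain `nvidia-smi` output also shows "CUDA Version: X.Y",
# the newest CUDA runtime this driver can execute PTX for.
print(subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout)

If the driver's supported CUDA version is older than the wheel's toolkit, updating the GPU driver should fix it; alternatively, building from source against the local toolkit (the README documents CT_CUBLAS=1 pip install ctransformers --no-binary ctransformers) avoids the mismatch.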

AlexBlack2202 commented 1 year ago

I have the same problem, and it is not fixed.

My guess is that the GGUF format doesn't support GPU.

parth-verma7 commented 10 months ago

> I have the same problem, and it is not fixed.
>
> My guess is that the GGUF format doesn't support GPU.

Did you find a solution?