Open harrypale opened 1 year ago
@harrypale Can you try to build llama.cpp directly with CMake using the same setup and post those build logs? I may be able to help.
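For anyone following along, a rough sketch of such a CMake build with the cuBLAS backend enabled is below; `LLAMA_CUBLAS` is the toggle llama.cpp used at the time, but option names have changed between versions, so check the project's README for your checkout:

```shell
# Assumed steps: build llama.cpp standalone with the cuBLAS backend
# so the CUDA compile/link output is visible in the logs.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build && cd build
cmake .. -DLLAMA_CUBLAS=ON          # flag name may differ by version
cmake --build . --config Release
```

Posting the `cmake ..` configure output is usually the most informative part, since it shows which CUDA toolkit and target architectures were detected.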
I ran into the same problem with almost the same setup (I'm running a GTX 970 4GB):
cuBLAS error 15 at C:\Users\$USER\AppData\Local\Temp\pip-install-odq455g9\llama-cpp-python_6254bfd1892540b2aada837341f178f9\vendor\llama.cpp\ggml-cuda.cu:7594: the requested functionality is not supported.
Installation completes without a problem, but when I send a prompt to privateGPT it crashes and produces this error. Help is greatly appreciated!
UPDATE: A friend of mine found the source of the problem: our GPUs don't support the half-precision (FP16) floating-point operations that llama.cpp's CUDA backend requires here. Our hardware's CUDA compute capability is 5.2, while privateGPT was tested on 8.6. I guess this means GPU support is over for us :(
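The capability gap described above can be sketched as a simple check. This is only an illustration: the table values are taken from NVIDIA's published compute-capability list, and the 5.3 threshold is the point where native FP16 arithmetic first appears; the function and constant names are made up for this example:

```python
# Native FP16 (half-precision) arithmetic requires CUDA compute
# capability 5.3 or higher; GTX 960/970 (Maxwell, sm_52) fall short.
FP16_MIN_CAPABILITY = (5, 3)

# Assumed values from NVIDIA's compute-capability tables.
KNOWN_CAPABILITIES = {
    "GTX 960": (5, 2),   # Maxwell
    "GTX 970": (5, 2),   # Maxwell
    "RTX 3060": (8, 6),  # Ampere
}

def supports_fp16(capability):
    """True if a (major, minor) capability can run native FP16 math."""
    return capability >= FP16_MIN_CAPABILITY

print(supports_fp16(KNOWN_CAPABILITIES["GTX 960"]))   # False
print(supports_fp16(KNOWN_CAPABILITIES["RTX 3060"]))  # True
```

Tuple comparison handles the major/minor ordering for free, which is why the capability is kept as a pair rather than a float.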
Prerequisites
Please answer the following questions for yourself before submitting an issue.
Expected Behavior
I am trying to run privateGPT with CUDA. Without CUDA it works; with CUDA enabled, llama-cpp-python crashes.
Current Behavior
When I start privateGPT, llama-cpp-python-main gives this error:
cuBLAS error 15 at C:\Users\Harry\Documents\llama-cpp-python-main\vendor\llama.cpp\ggml-cuda.cu:7586: the requested functionality is not supported
Environment and Context
I am on Windows 11 with NVIDIA CUDA Toolkit 12.2, an NVIDIA GeForce GTX 960 4GB, and 32GB RAM. The CPU is an Intel i7-2600K (it should support AVX).
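To confirm what capability the driver reports for this GPU, it can be queried directly; note the `compute_cap` field only exists in reasonably recent `nvidia-smi` versions, so on older drivers the `deviceQuery` sample from the CUDA toolkit is the fallback:

```shell
# Query the GPU name and CUDA compute capability.
# A GTX 960 should report 5.2 (Maxwell).
nvidia-smi --query-gpu=name,compute_cap --format=csv
```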
Failure Information (for bugs)
cuBLAS error 15 at C:\Users\Harry\Documents\llama-cpp-python-main\vendor\llama.cpp\ggml-cuda.cu:7586: the requested functionality is not supported
Steps to Reproduce
Let me know if I can give you more info, thank you.