jllllll / llama-cpp-python-cuBLAS-wheels

Wheels for llama-cpp-python compiled with cuBLAS support

llama-cpp-python-cuda wheel #1

Closed. x90skysn3k closed this issue 1 year ago.

x90skysn3k commented 1 year ago

Is it possible to get a llama-cpp-python-cuda wheel compiled with AVX2=OFF, F16C=OFF, FMA=OFF? I'm running on an E5-2697 v2 with text-generation-webui, and I'm getting illegal instruction errors because my CPU does not support AVX2 or FMA.
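
For reference, on Linux you can confirm which of these instruction sets the CPU actually reports before deciding which flags to disable:

# List the relevant SIMD feature flags from /proc/cpuinfo (Linux).
# Anything missing from the output is unsupported by the CPU.
grep -o -w -E 'avx|avx2|f16c|fma' /proc/cpuinfo | sort -u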

jllllll commented 1 year ago

Yes. I was planning on adding this sometime today.

x90skysn3k commented 1 year ago

That would be awesome. I've been getting illegal instruction crashes due to my older processor.
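
In the meantime, a local build with those instruction sets switched off should avoid the crash. A rough sketch, assuming llama-cpp-python's usual CMAKE_ARGS/FORCE_CMAKE passthrough and the LLAMA_* option names from llama.cpp's CMake of this era:

# Build llama-cpp-python locally with cuBLAS enabled and the
# unsupported instruction sets disabled.
CMAKE_ARGS="-DLLAMA_CUBLAS=on -DLLAMA_AVX2=off -DLLAMA_F16C=off -DLLAMA_FMA=off" \
  FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir --force-reinstall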

x90skysn3k commented 1 year ago

It'd be great to have them for webui under the separate llama_cpp_cuda namespace, with cuBLAS support and AVX2=OFF, F16C=OFF, FMA=OFF.

jllllll commented 1 year ago

Wheels have been built for both the CPU-only and cuBLAS packages:

python -m pip install llama-cpp-python llama-cpp-python-cuda \
  --force-reinstall --no-deps \
  --index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/basic/cpu \
  --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/textgen/basic/cu117

Let me know if there are any issues.
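
As a quick post-install sanity check, something like this should confirm both packages import cleanly (assuming the CUDA wheel exposes the llama_cpp_cuda module name discussed above):

# Both imports should succeed without an illegal instruction crash.
python -c "import llama_cpp, llama_cpp_cuda; print(llama_cpp.__version__)"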