turboderp / exllama

A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
MIT License

doesn't use CUDA_HOME? #293

Open j2l opened 1 year ago

j2l commented 1 year ago

Hello, running python test_benchmark_inference.py -d ./models -p -ppl throws:

/bin/sh: 1: /usr/bin/nvcc: not found
ninja: build stopped: subcommand failed.

but nvcc (Build cuda_11.8.r11.8) is installed:

which nvcc
/usr/local/cuda/bin/nvcc

~/.bashrc

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
export CUDA_HOME=/usr
export PATH="/usr/local/cuda/bin:$PATH"
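Note that build systems which honor CUDA_HOME (PyTorch's cpp_extension is one, assuming that is what exllama's build goes through) typically look for the compiler at $CUDA_HOME/bin/nvcc rather than searching PATH. A minimal sketch of that lookup, with the values from the .bashrc above:

```python
import os.path

def nvcc_path(cuda_home: str) -> str:
    """Path where a CUDA_HOME-based build would expect to find nvcc."""
    return os.path.join(cuda_home, "bin", "nvcc")

# With CUDA_HOME=/usr the build would invoke /usr/bin/nvcc -- the same
# path that appears in the error above -- while the real compiler lives
# under /usr/local/cuda.
print(nvcc_path("/usr"))             # /usr/bin/nvcc
print(nvcc_path("/usr/local/cuda"))  # /usr/local/cuda/bin/nvcc
```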

Other AI projects (Stable Diffusion, ...) use it without problems on this machine, so what is wrong?

Ubuntu 22.04 (Pop!_OS), Nvidia driver 535, RTX 3060, Python 3.10.12