Closed winglian closed 1 year ago
Yes, the quant_cuda extension is needed. You should run `pip install -r requirements.txt` to install it. If it cannot be installed, first run `pip uninstall gptq_llama` and then reinstall it.
It looks like an explicit import is needed. https://github.com/johnsmith0031/alpaca_lora_4bit/pull/34
https://github.com/johnsmith0031/alpaca_lora_4bit/blob/234004ceb5135e092bc9a08a9dbb75eff61f8fd9/matmul_utils_4bit.py#L3
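A minimal sketch of what the linked PR's "explicit import" fix looks like, assuming the compiled extension module is named `quant_cuda` (the guard flag and the error message below are illustrative, not from the repo):

```python
# Explicitly import the compiled quant_cuda extension instead of relying on
# an indirect import, and fail with an actionable message if it is missing.
try:
    import quant_cuda  # CUDA extension built when installing gptq_llama (assumption)
    HAS_QUANT_CUDA = True
except ImportError:
    HAS_QUANT_CUDA = False
    print(
        "quant_cuda extension not found; run `pip install -r requirements.txt`, "
        "or `pip uninstall gptq_llama` and reinstall."
    )
```

Guarding the import this way keeps the module loadable on machines where the extension failed to build, while surfacing the install instructions from this thread.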
It looks like this change requires additional documentation or setup steps to get quant_cuda installed, since it wasn't required before.