Closed: nepeee closed this issue 1 year ago
Maybe you can try the latest version. If it still cannot compile, remove the half-precision support from the source code, since it isn't currently used.
What GPU? This sounds like an issue I ran into with GPTQ on an older GPU (M40); there's a fix merged into GPTQ, see: https://github.com/qwopqwop200/GPTQ-for-LLaMa/pull/58
So I suggest pulling the latest version of GPTQ-for-LLaMa and trying the install again.
I tried the new version but got the same error. Then I found this, which fixed my issue: https://github.com/johnsmith0031/alpaca_lora_4bit/issues/9
I got 4.16s/it on a Tesla P40 with the 13B llama.
Thx for the help!
That is basically the fix for the M40.. but we do have half-precision support on compute capability 6.1.
I get an error while trying to run `python setup_cuda.py install` from GPTQ-for-LLaMa after copying the modified kernel files: `error: no instance of overloaded function "atomicAdd" matches the argument list`
I have CUDA 11.7 installed and I'm running Windows.
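For context, that error usually means the kernel calls `atomicAdd` on a type the target architecture doesn't support natively: `atomicAdd(__half*, __half)` only exists as a hardware intrinsic on sm_70 and newer, so compiling for an M40 (sm_52) or P40 (sm_61) fails. The usual workaround (a sketch of the general technique, not the exact patch from the linked issue; the helper name `atomicAddHalf` is mine) is a CAS loop on the 32-bit word containing the half:

```cuda
#include <cuda_fp16.h>

// CAS-based fallback for half-precision atomicAdd on GPUs older than sm_70,
// where the native atomicAdd(__half*, __half) intrinsic is unavailable.
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ < 700
__device__ __forceinline__ void atomicAddHalf(__half *address, __half val) {
    // atomicCAS operates on aligned 32-bit words, so locate the word that
    // contains this 16-bit half and note whether it is the high or low part.
    unsigned int *base = (unsigned int *)((size_t)address & ~(size_t)2);
    bool high = ((size_t)address & 2) != 0;
    unsigned int old = *base, assumed;
    do {
        assumed = old;
        __half_raw raw;
        raw.x = high ? (unsigned short)(assumed >> 16)
                     : (unsigned short)(assumed & 0xffffu);
        // Do the add in float so it also works on pre-sm_53 GPUs,
        // which lack half-precision arithmetic instructions entirely.
        __half sum = __float2half_rn(__half2float(__half(raw)) +
                                     __half2float(val));
        raw = sum;
        unsigned int updated =
            high ? ((assumed & 0x0000ffffu) | ((unsigned int)raw.x << 16))
                 : ((assumed & 0xffff0000u) | raw.x);
        old = atomicCAS(base, assumed, updated);
    } while (assumed != old);  // retry if another thread changed the word
}
#endif
```

Alternatively, as suggested above, simply stripping the half-precision code paths from the kernels avoids the problem entirely when they aren't used.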