johnsmith0031 / alpaca_lora_4bit

MIT License

The atomic add doesn't work on compute 6.1 #9

Closed Ph0rk0z closed 1 year ago

Ph0rk0z commented 1 year ago

I get an error when I try to compile.

/home/mint/text-generation-webui/repositories/GPTQ-for-LLaMa/quant_cuda_kernel.cu(548): error: no instance of overloaded function "atomicAdd" matches the argument list
            argument types are: (__half *, __half)

/home/mint/text-generation-webui/repositories/GPTQ-for-LLaMa/quant_cuda_kernel.cu(621): error: no instance of overloaded function "atomicAdd" matches the argument list
            argument types are: (__half *, __half)

I found the lines the error came from, but it seems the kernel isn't picking up the atomicAdd function you defined at the top of the file.

Ph0rk0z commented 1 year ago

If I rename the function I get:

error: argument of type "__half *" is incompatible with parameter of type "c10::Half *"

and now it built: https://pastebin.com/aykdsJJT

Ph0rk0z commented 1 year ago

my dummy patch has worked.

Loading Model ...
Converted as Half.
Loaded the model in 4.71 seconds.
Fitting 4bit scales and zeros to half
 I think the meaning of life is to have a nice time, and that’s what we did on tour.
In your opinion
1.5773606300354004
gururise commented 1 year ago

> my dummy patch has worked.
>
> Loading Model ...
> Converted as Half.
> Loaded the model in 4.71 seconds.
> Fitting 4bit scales and zeros to half
> I think the meaning of life is to have a nice time, and that’s what we did on tour.
> In your opinion
> 1.5773606300354004

Nice! If it tests fine, are you planning on submitting a PR?

Ph0rk0z commented 1 year ago

I could. Should it only be for compute capability < 7.0?

For < 6.0 it also needs the double-precision one from here

I really want to make a fully merged GPTQ with this and GPT-J/GPT-X handling.

Ph0rk0z commented 1 year ago

So I made one, for better or worse: https://github.com/johnsmith0031/alpaca_lora_4bit/pull/14