IST-DASLab / gptq

Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
https://arxiv.org/abs/2210.17323
Apache License 2.0

quant_cuda_kernel.cu(212): error: identifier "__hfma2" is undefined #23

Open HueCheng1021 opened 1 year ago

HueCheng1021 commented 1 year ago

An error is reported when compiling the quant_cuda kernel.

In my case, nvcc reports: Cuda compilation tools, release 12.0, V12.0.140

efrantar commented 1 year ago

Our kernels were developed with CUDA 11.4. However, __hfma2 still seems to exist in the newest CUDA API, so unfortunately I am not sure what is causing the error. If you don't need our fastest FP16 kernels (e.g. if you aren't on an A100, for which they were actually developed), you could try commenting out the corresponding code in quant_cuda.cpp and quant_cuda_kernel.cu and using the FP32 version by omitting the --faster-kernel option.
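For anyone hitting the same error, it may help to confirm whether the local toolkit differs from the CUDA 11.4 release mentioned above. The following is a minimal sketch, not part of the repository: the helper names `parse_nvcc_release` and `differs_from_developed` are hypothetical, and the only assumption is that `nvcc --version` prints a release line in the form quoted earlier in this thread. A mismatch does not by itself explain the `__hfma2` error; it only tells you that you are building outside the tested setup.

```python
import re

# CUDA version the maintainers say the kernels were developed with (from this thread).
DEVELOPED_WITH = (11, 4)


def parse_nvcc_release(version_text):
    """Return (major, minor) parsed from `nvcc --version` output, or None if not found."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def differs_from_developed(version_text, developed=DEVELOPED_WITH):
    """True if a release was found and it is not the version the kernels were developed with."""
    release = parse_nvcc_release(version_text)
    return release is not None and release != developed


# The release line reported at the top of this issue.
reported = "Cuda compilation tools, release 12.0, V12.0.140"
print(parse_nvcc_release(reported))      # (12, 0)
print(differs_from_developed(reported))  # True
```

To check a live installation, pass in the text returned by `subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout` instead of the hard-coded string.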