Open HueCheng1021 opened 1 year ago
Our kernels were developed with CUDA 11.4. However, this function still seems to exist in the newest CUDA API, so I am unfortunately not sure what's causing the error. If you don't need our fastest FP16 kernels (e.g. if you aren't on an A100, for which they were actually developed), you could perhaps try commenting out the corresponding code in quant_cuda.cpp and quant_cuda_kernel.cu and using the FP32 version (omitting the option --faster-kernel).
An error is reported when compiling the quant_cuda kernel.
In my case, nvcc reports: Cuda compilation tools, release 12.0, V12.0.140
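
For anyone comparing toolkits, here is a minimal sketch (not part of the repository) that pulls the release number out of the version line quoted above and compares it with the CUDA 11.4 the kernels were developed with. The helper name `parse_nvcc_release` and the constant `DEVELOPED_WITH` are made up for illustration, and the regex assumes the version text contains a "release X.Y" fragment like the one in this thread.

```python
import re

# CUDA version the kernels were developed with, as stated in this thread.
DEVELOPED_WITH = (11, 4)


def parse_nvcc_release(version_text):
    """Return (major, minor) from a 'release X.Y' fragment, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Version line as quoted in the report above.
reported = "Cuda compilation tools, release 12.0, V12.0.140"
version = parse_nvcc_release(reported)

if version is not None and version != DEVELOPED_WITH:
    print(
        f"Local toolkit is CUDA {version[0]}.{version[1]}; "
        f"kernels were developed with CUDA {DEVELOPED_WITH[0]}.{DEVELOPED_WITH[1]}."
    )
```

A mismatch does not by itself explain the compile error; it only confirms that the build is running against a different toolkit than the one the kernels were written for.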