casper-hansen / AutoAWQ

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
https://casper-hansen.github.io/AutoAWQ/
MIT License
1.71k stars, 204 forks

CUDA error: no kernel image is available for execution on the device #557

Open AragornHorse opened 2 months ago

AragornHorse commented 2 months ago

I encountered the following error while running the quantized Qwen-72B model:

out = awq_ext.gemm_forward_cuda(
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

How can I solve it?

casper-hansen commented 2 months ago

Hi @AragornHorse, I will need more details about your environment and hardware to determine what the issue is.
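For anyone hitting the same error, the details requested here could be gathered with a short script along the following lines. This is only a sketch: the function name `collect_env_report` and the list of packages it inspects are illustrative assumptions, not part of AutoAWQ. It relies on the standard library plus PyTorch's public `torch.cuda` helpers, and it degrades gracefully when PyTorch is not installed.

```python
import platform
from importlib import metadata


def collect_env_report():
    """Gather version and GPU details that are useful in a bug report.

    Hypothetical helper for illustration; not an AutoAWQ API.
    """
    report = {"python": platform.python_version(), "os": platform.platform()}

    # Installed package versions; "not installed" if the distribution is missing.
    for dist in ("autoawq", "torch", "transformers"):
        try:
            report[dist] = metadata.version(dist)
        except metadata.PackageNotFoundError:
            report[dist] = "not installed"

    report["gpus"] = []
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing more to query.
        return report

    # CUDA toolkit version that this PyTorch build was compiled against.
    report["torch_cuda"] = torch.version.cuda
    if torch.cuda.is_available():
        for i in range(torch.cuda.device_count()):
            major, minor = torch.cuda.get_device_capability(i)
            report["gpus"].append(
                {"name": torch.cuda.get_device_name(i), "capability": f"{major}.{minor}"}
            )
    return report


if __name__ == "__main__":
    for key, value in collect_env_report().items():
        print(f"{key}: {value}")
```

Pasting the printed output into the issue, together with the full stack trace, should cover the environment and hardware information asked for above.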

zlwzlwzlw commented 1 month ago

Hi, I have the same problem when running Qwen2-7B-AWQ.

Hardware: Ubuntu 22.04, V100 GPU

Environment: autoawq==0.2.6, torch==2.3.1

Ed3ward commented 1 month ago

I ran into the same problem when running codellama-7b-AWQ. Hardware: V100. Environment: awq==0.2.6, torch==2.4.0
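Both follow-up reports mention a V100, which has CUDA compute capability 7.0 (Volta). The "no kernel image is available" error generally means that the compiled extension contains no kernels built for the architecture of the GPU in use. A minimal sketch of a capability check is shown below. The threshold `ASSUMED_MIN_CAPABILITY` and the helper `is_supported` are assumptions for illustration, not values or functions taken from this thread, and the actual requirement should be confirmed against the AutoAWQ README.

```python
# Assumption: minimum compute capability targeted by the prebuilt kernels.
# (7, 5) is a placeholder to verify against the AutoAWQ README.
ASSUMED_MIN_CAPABILITY = (7, 5)


def is_supported(capability, minimum=ASSUMED_MIN_CAPABILITY):
    """Return True if a (major, minor) capability meets the assumed minimum."""
    return tuple(capability) >= tuple(minimum)


if __name__ == "__main__":
    # A V100 reports compute capability (7, 0).
    print("V100 meets assumed minimum:", is_supported((7, 0)))

    try:
        import torch
    except ImportError:
        torch = None

    # Check the local GPU as well, if PyTorch and a CUDA device are present.
    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability(0)
        print(f"Local GPU capability {cap}: supported = {is_supported(cap)}")
```

If the local GPU falls below the real minimum, the usual options are building the kernels from source for that architecture, where the project supports it, or running on a newer GPU.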