mit-han-lab / llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
MIT License

Fix illegal memory access of GEMV kernel #201

Open xutianming opened 3 months ago

xutianming commented 3 months ago

The dynamic shared memory of the GEMV kernel is not allocated when the kernel is launched, which causes an illegal memory access error.

This pull request fixes the issue by specifying the shared memory size when launching the GEMV kernel.
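For context, here is a minimal, self-contained sketch of the launch pattern involved. It is not the llm-awq kernel: `gemv_rowwise`, the matrix sizes, and the reduction scheme are hypothetical and chosen only to illustrate the mechanism. A kernel that declares an `extern __shared__` array receives no shared memory unless the byte count is passed as the third launch parameter, so any access to that array is out of bounds.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical GEMV kernel (NOT the llm-awq kernel): y = A * x, one block per row.
// The reduction buffer lives in dynamic shared memory, so its size is set at launch.
__global__ void gemv_rowwise(const float* A, const float* x, float* y, int cols) {
    extern __shared__ float partial[];  // sized by the 3rd launch parameter

    const int row = blockIdx.x;
    float acc = 0.0f;
    // Each thread accumulates a strided slice of the row.
    for (int c = threadIdx.x; c < cols; c += blockDim.x) {
        acc += A[row * cols + c] * x[c];
    }
    partial[threadIdx.x] = acc;
    __syncthreads();

    // Tree reduction in shared memory; assumes blockDim.x is a power of two.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (threadIdx.x < stride) {
            partial[threadIdx.x] += partial[threadIdx.x + stride];
        }
        __syncthreads();
    }
    if (threadIdx.x == 0) {
        y[row] = partial[0];
    }
}

int main() {
    const int rows = 4, cols = 1024, threads = 128;  // illustrative sizes
    std::vector<float> hA(rows * cols, 1.0f), hx(cols, 2.0f), hy(rows, 0.0f);

    float *dA = nullptr, *dx = nullptr, *dy = nullptr;
    cudaMalloc(&dA, hA.size() * sizeof(float));
    cudaMalloc(&dx, hx.size() * sizeof(float));
    cudaMalloc(&dy, hy.size() * sizeof(float));
    cudaMemcpy(dA, hA.data(), hA.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dx, hx.data(), hx.size() * sizeof(float), cudaMemcpyHostToDevice);

    // Broken form: gemv_rowwise<<<rows, threads>>>(dA, dx, dy, cols);
    // With no third parameter, zero bytes of dynamic shared memory are allocated.

    // Fixed form: request one float of shared memory per thread.
    const size_t smem_bytes = threads * sizeof(float);
    gemv_rowwise<<<rows, threads, smem_bytes>>>(dA, dx, dy, cols);

    // Launch errors and in-kernel faults surface here.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(hy.data(), dy, hy.size() * sizeof(float), cudaMemcpyDeviceToHost);
    for (int r = 0; r < rows; ++r) {
        std::printf("y[%d] = %f\n", r, hy[r]);
    }

    cudaFree(dA);
    cudaFree(dx);
    cudaFree(dy);
    return 0;
}
```

The byte count passed at launch has to match whatever indexing the kernel performs on the shared array, so it is usually derived from the same block size and element type the kernel uses. Running the broken form under `compute-sanitizer` is one way to confirm that the fault comes from shared memory accesses.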