mit-han-lab / llm-awq
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
MIT License · 2.08k stars · 150 forks
How to support a custom module like MLA in DeepSeek-V2?
#187 · Open
robert-lee2016 opened 1 month ago