mit-han-lab / llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
MIT License

Can AWQ support 3-bit, 2-bit, or 8-bit quantization? #172

Open ArlanCooper opened 2 months ago

ArlanCooper commented 2 months ago

I see that AWQ currently only supports 4-bit quantization. Can it also support 2-bit, 3-bit, or 8-bit quantization?

GilesBathgate commented 2 months ago

> Efficient and accurate low-bit weight quantization (INT3/4) for LLMs, supporting instruction-tuned models and multi-modal LMs.

:thinking:
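
The repo description quoted above advertises INT3/4, so 3-bit should go through the same scale-search pipeline; if I recall the README correctly, the entry script exposes a `--w_bit` flag for this (treat that flag name as an assumption, check the current README). For intuition on why the bit-width generalizes, here is a minimal, self-contained sketch of group-wise asymmetric pseudo-quantization at an arbitrary bit-width, in the spirit of AWQ's weight quantizer; the function name, defaults, and printout are illustrative, not the library's actual API:

```python
import torch

def pseudo_quantize_tensor(w: torch.Tensor, n_bit: int = 3,
                           group_size: int = 128) -> torch.Tensor:
    """Round-trip (quantize then dequantize) a weight tensor to n_bit
    integers using asymmetric per-group min/max scaling. Illustrative
    sketch only, not llm-awq's exact implementation."""
    orig_shape = w.shape
    w = w.reshape(-1, group_size)              # quantize in groups along the input dim
    w_max = w.amax(dim=1, keepdim=True)
    w_min = w.amin(dim=1, keepdim=True)
    q_max = 2 ** n_bit - 1                     # e.g. 7 levels above zero for 3-bit
    scale = (w_max - w_min).clamp(min=1e-5) / q_max
    zero = (-w_min / scale).round()            # asymmetric zero point per group
    w_q = (w / scale + zero).round().clamp(0, q_max)
    return ((w_q - zero) * scale).reshape(orig_shape)

# Compare reconstruction error across bit-widths on a random weight matrix.
w = torch.randn(4096, 4096)
for n_bit in (2, 3, 4, 8):
    err = (w - pseudo_quantize_tensor(w, n_bit)).abs().mean()
    print(f"INT{n_bit}: mean abs error {err:.4f}")
```

Note that pseudo-quantization like this only measures accuracy impact; actual memory and speed gains additionally require packed-weight dequantization kernels for the chosen bit-width, which is a separate question from whether the AWQ search itself supports it.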