mit-han-lab / llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Add support for INT3 quantization #115

Open crisb-7 opened 12 months ago

crisb-7 commented 12 months ago

Hello. Thank you for the amazing work.

Is there any possibility of, or interest in, adding support for INT3 quantization in the near future? It would be interesting to quantize and test models in INT3 to compare inference speed and quality against the already implemented INT4 models.
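For reference, a minimal sketch of what simulated ("fake") group-wise INT3 quantization could look like, assuming a per-group asymmetric min-max scheme; `pseudo_quantize_int3` and its parameters are illustrative and not part of the llm-awq API:

```python
import torch

def pseudo_quantize_int3(w: torch.Tensor, group_size: int = 128) -> torch.Tensor:
    """Simulated (fake) group-wise asymmetric INT3 quantization.

    Illustrative sketch, not the repo's API: each group of `group_size`
    weights is mapped to 3-bit levels (0..7) with a per-group scale and
    zero point, then dequantized back to float so model quality can be
    evaluated without dedicated INT3 kernels.
    """
    out_features, in_features = w.shape
    assert in_features % group_size == 0, "in_features must be divisible by group_size"
    w = w.reshape(-1, group_size)

    max_int = (1 << 3) - 1  # 7: highest 3-bit level
    w_max = w.amax(dim=1, keepdim=True)
    w_min = w.amin(dim=1, keepdim=True)
    scale = (w_max - w_min).clamp(min=1e-5) / max_int
    zero = (-w_min / scale).round()

    # Quantize, clamp to the 3-bit range, then dequantize.
    q = torch.clamp((w / scale + zero).round(), 0, max_int)
    w_dq = (q - zero) * scale
    return w_dq.reshape(out_features, in_features)
```

Replacing each linear layer's weights this way (e.g. `layer.weight.data = pseudo_quantize_int3(layer.weight.data)`) would cover the quality side of the comparison; measuring real inference speed would additionally require packed 3-bit GPU kernels.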

Thanks in advance. :)

andrew89982018 commented 12 months ago

I NEED INT3 TOO!

andrew89982018 commented 11 months ago

Is the team still active? There has been no response to this issue.