AIoT-MLSys-Lab / Efficient-LLMs-Survey

[TMLR 2024] Efficient Large Language Models: A Survey
https://arxiv.org/abs/2312.03863

Add ICLR24 spotlight paper OmniQuant. #31

Closed. ChenMnZ closed this issue 7 months ago.

ChenMnZ commented 7 months ago

OmniQuant is a post-training quantization (PTQ) method that supports both weight-only quantization and weight-activation quantization.
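
For readers unfamiliar with the two settings, here is a minimal sketch of the difference. It uses plain uniform min-max rounding, which is an assumption chosen for illustration and is not OmniQuant's actual procedure; the helper names (`fake_quantize`, `dot`) and the toy values are hypothetical.

```python
def fake_quantize(values, n_bits):
    """Uniform min-max quantization followed by dequantization.

    Illustrative only: a generic rounding scheme, not OmniQuant itself.
    """
    lo, hi = min(values), max(values)
    levels = 2 ** n_bits - 1
    # Fall back to a dummy scale when all values are identical.
    scale = (hi - lo) / levels if hi > lo else 1.0
    return [round((v - lo) / scale) * scale + lo for v in values]


def dot(xs, ws):
    """Inner product of two equal-length lists."""
    return sum(x * w for x, w in zip(xs, ws))


# Toy values (hypothetical), standing in for one row of a linear layer.
weights = [0.12, -0.53, 0.98, -0.07]
activations = [1.5, -2.0, 0.25, 3.0]

# Full-precision reference output.
y_reference = dot(activations, weights)

# Weight-only quantization (e.g. W4A16): only the weights are quantized.
y_weight_only = dot(activations, fake_quantize(weights, 4))

# Weight-activation quantization (e.g. W4A4): both operands are quantized.
y_weight_act = dot(fake_quantize(activations, 4), fake_quantize(weights, 4))

print(y_reference, y_weight_only, y_weight_act)
```

Weight-only quantization mainly reduces model size and memory traffic, while weight-activation quantization also lets the matrix multiply itself run in low precision, at the cost of additional rounding error on the activations.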