OpenGVLab / OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
MIT License

[New Feature] Request: MLA support in activation smoothing #86

Open RanchiZhao opened 2 months ago

RanchiZhao commented 2 months ago

Will MLA (Multi-head Latent Attention), as used in DeepSeek-V2 (https://huggingface.co/deepseek-ai/DeepSeek-V2-Chat and https://arxiv.org/abs/2405.04434), be supported by the activation smoothing method?
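
For context, here is a minimal sketch of what activation smoothing does for a plain linear layer. It assumes a SmoothQuant-style per-channel scaling rule with a fixed `alpha`; it is not OmniQuant's actual code, and `matmul` and `smooth` are hypothetical helper names used only for illustration. The open question is whether the same equivalent transformation can be folded into MLA's low-rank key/value projections.

```python
# Illustrative only: SmoothQuant-style smoothing for Y = X @ W.
# Assumption: every input channel has a nonzero activation and weight maximum.

def matmul(X, W):
    """Multiply a (tokens x in) matrix by an (in x out) matrix, both as nested lists."""
    return [[sum(x * w for x, w in zip(row, col)) for col in zip(*W)] for row in X]

def smooth(X, W, alpha=0.5):
    """Move activation outliers into the weights with a per-input-channel scale s."""
    n_in = len(W)
    act_max = [max(abs(row[j]) for row in X) for j in range(n_in)]
    w_max = [max(abs(w) for w in W[j]) for j in range(n_in)]
    # s_j = max|X_j|^alpha / max|W_j|^(1 - alpha)
    s = [act_max[j] ** alpha / w_max[j] ** (1 - alpha) for j in range(n_in)]
    X_s = [[row[j] / s[j] for j in range(n_in)] for row in X]  # X * diag(s)^-1
    W_s = [[w * s[j] for w in W[j]] for j in range(n_in)]      # diag(s) * W
    return X_s, W_s, s

# Channel 1 of X is an outlier channel.
X = [[0.5, 40.0, -1.0], [-0.25, -60.0, 2.0]]
W = [[1.0, -2.0], [0.5, 0.25], [-1.0, 4.0]]

X_s, W_s, s = smooth(X, W)
Y = matmul(X, W)        # original output
Y_s = matmul(X_s, W_s)  # same output, computed from smoothed operands
```

The output is unchanged up to floating-point error, while the outlier channel in `X_s` has a much smaller range than in `X`, which is what makes the activations easier to quantize.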