jy-yuan / KIVI

KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
https://arxiv.org/abs/2402.02750
MIT License

Support for ChatGLM3 #9

Open redscv opened 3 weeks ago

redscv commented 3 weeks ago

Great work! What's your suggestion if I would like to test it on ChatGLM3?

jy-yuan commented 2 weeks ago

Thanks for your interest! You can modify the modeling_chatglm.py file distributed with the model on Hugging Face, following our implementations in llama_kivi.py or mistral_kivi.py.
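To give a rough idea of what such a change involves, here is a minimal, hedged sketch of asymmetric low-bit KV-cache quantization (per-channel for keys, per-token for values, with a full-precision residual window). The helper names and the exact integration point in modeling_chatglm.py are assumptions for illustration, not the actual KIVI kernels; the real integration should mirror the cache handling in llama_kivi.py / mistral_kivi.py.

```python
# Illustrative sketch only -- not the KIVI CUDA kernels. Helper names
# (asym_quantize, quantize_kv) are hypothetical.
import torch


def asym_quantize(x: torch.Tensor, n_bits: int = 2, dim: int = -1):
    """Asymmetric (scale + zero-point) quantization along `dim`."""
    qmax = 2 ** n_bits - 1
    xmin = x.amin(dim=dim, keepdim=True)
    xmax = x.amax(dim=dim, keepdim=True)
    scale = (xmax - xmin).clamp(min=1e-5) / qmax
    zero_point = (-xmin / scale).round()
    q = torch.clamp((x / scale).round() + zero_point, 0, qmax)
    return q.to(torch.uint8), scale, zero_point


def asym_dequantize(q, scale, zero_point):
    """Recover an approximate full-precision tensor from quantized storage."""
    return (q.float() - zero_point) * scale


def quantize_kv(key_states, value_states, residual_length: int = 32):
    """Quantize older KV entries, keep the most recent tokens in full precision.

    Expects tensors of shape [batch, heads, seq_len, head_dim] and assumes
    seq_len > residual_length.
    """
    k_full = key_states[:, :, -residual_length:, :]
    v_full = value_states[:, :, -residual_length:, :]
    k_old = key_states[:, :, :-residual_length, :]
    v_old = value_states[:, :, :-residual_length, :]
    # keys: per-channel quantization (statistics taken across the token axis)
    k_q, k_scale, k_zp = asym_quantize(k_old, n_bits=2, dim=-2)
    # values: per-token quantization (statistics taken across the channel axis)
    v_q, v_scale, v_zp = asym_quantize(v_old, n_bits=2, dim=-1)
    return (k_q, k_scale, k_zp, k_full), (v_q, v_scale, v_zp, v_full)
```

In ChatGLM3's attention forward, the place where the past key/value cache is concatenated and stored is where this kind of quantize/dequantize step would slot in, analogous to how the llama_kivi.py and mistral_kivi.py attention classes manage their quantized caches.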

We are planning to extend KIVI to more models and benchmark datasets. Meanwhile, we welcome contributions from the research community.