jy-yuan / KIVI

KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
https://arxiv.org/abs/2402.02750
MIT License

Support for ChatGLM3 #9

Open redscv opened 3 weeks ago

redscv commented 3 weeks ago

Great work! What's your suggestion if I would like to test it on ChatGLM3?

jy-yuan commented 2 weeks ago

Thanks for your interest! You can modify the modeling_chatglm.py file distributed with the model on Hugging Face, following our implementations in llama_kivi.py or mistral_kivi.py.
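To give a rough idea of what such a change involves, here is a minimal, hedged sketch of asymmetric low-bit KV-cache quantization (per-channel for keys, per-token for values, with a full-precision residual window). The helper names and the exact integration point in modeling_chatglm.py are assumptions for illustration, not the actual KIVI kernels; the real integration should mirror the cache handling in llama_kivi.py / mistral_kivi.py.

```python
# Illustrative sketch only -- not the KIVI CUDA kernels. Helper names
# (asym_quantize, quantize_kv) are hypothetical.
import torch


def asym_quantize(x: torch.Tensor, n_bits: int = 2, dim: int = -1):
    """Asymmetric (scale + zero-point) quantization along `dim`."""
    qmax = 2 ** n_bits - 1
    xmin = x.amin(dim=dim, keepdim=True)
    xmax = x.amax(dim=dim, keepdim=True)
    scale = (xmax - xmin).clamp(min=1e-5) / qmax
    zero_point = (-xmin / scale).round()
    q = torch.clamp((x / scale).round() + zero_point, 0, qmax)
    return q.to(torch.uint8), scale, zero_point


def asym_dequantize(q, scale, zero_point):
    """Recover an approximate full-precision tensor from quantized storage."""
    return (q.float() - zero_point) * scale


def quantize_kv(key_states, value_states, residual_length: int = 32):
    """Quantize older KV entries, keep the most recent tokens in full precision.

    Expects tensors of shape [batch, heads, seq_len, head_dim] and assumes
    seq_len > residual_length.
    """
    k_full = key_states[:, :, -residual_length:, :]
    v_full = value_states[:, :, -residual_length:, :]
    k_old = key_states[:, :, :-residual_length, :]
    v_old = value_states[:, :, :-residual_length, :]
    # keys: per-channel quantization (statistics taken across the token axis)
    k_q, k_scale, k_zp = asym_quantize(k_old, n_bits=2, dim=-2)
    # values: per-token quantization (statistics taken across the channel axis)
    v_q, v_scale, v_zp = asym_quantize(v_old, n_bits=2, dim=-1)
    return (k_q, k_scale, k_zp, k_full), (v_q, v_scale, v_zp, v_full)
```

In ChatGLM3's attention forward, the place where the past key/value cache is concatenated and stored is where this kind of quantize/dequantize step would slot in, analogous to how the llama_kivi.py and mistral_kivi.py attention classes manage their quantized caches.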

We are planning to extend KIVI to more models and benchmark datasets. Meanwhile, we welcome contributions from the research community.