jy-yuan / KIVI

KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
https://arxiv.org/abs/2402.02750
MIT License

Which file do I need to run to obtain the results in Figure 4? #11

Closed Felixvillas closed 5 days ago

Felixvillas commented 3 weeks ago

I want to compare the memory usage and throughput of 2-bit KIVI against the 16-bit baseline. How can I test them?

jy-yuan commented 2 weeks ago

Please refer to mem_spd_test.py

Felixvillas commented 2 weeks ago

> Please refer to mem_spd_test.py

thank you, I'll try it!
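
For readers who want a rough idea of what such a measurement involves, here is a minimal sketch. It is not the contents of mem_spd_test.py, which this thread does not reproduce. It assumes PyTorch is installed for the GPU memory numbers and falls back to timing only when CUDA is unavailable. The names `measure_generation`, `generate_fn`, `batch_size`, and `new_tokens` are hypothetical and chosen for illustration.

```python
import time

try:
    import torch  # optional; only used for GPU memory statistics
except ImportError:
    torch = None


def measure_generation(generate_fn, batch_size, new_tokens):
    """Time one call to generate_fn and report throughput and peak GPU memory.

    generate_fn is a hypothetical zero-argument callable that runs a single
    generation pass (for example, a small wrapper around a model's generate call).
    """
    use_cuda = torch is not None and torch.cuda.is_available()
    if use_cuda:
        # Clear the peak counter so it reflects this run only.
        torch.cuda.reset_peak_memory_stats()
        torch.cuda.synchronize()

    start = time.perf_counter()
    generate_fn()
    if use_cuda:
        # CUDA kernels are asynchronous; wait for them before stopping the clock.
        torch.cuda.synchronize()
    elapsed = time.perf_counter() - start

    peak_gb = torch.cuda.max_memory_allocated() / 1024**3 if use_cuda else None
    return {
        "seconds": elapsed,
        "tokens_per_second": batch_size * new_tokens / elapsed,
        "peak_memory_gb": peak_gb,
    }
```

Running the same function once with the 16-bit baseline and once with the 2-bit configuration, using identical batch size and generation length, gives two result dictionaries that can be compared directly. A warm-up call before measuring usually makes the timing more stable.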