jy-yuan / KIVI
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
https://arxiv.org/abs/2402.02750
MIT License · 121 stars · 10 forks
Issues
#16 CUDA version (hensiesp32, closed 3 days ago, 4 comments)
#15 Why is model inference slow when KIVI is applied to Mistral-7B-Instruct-v0.2? (lichongod, opened 6 days ago, 6 comments)
#14 Where is the falcon_kivi? (Felixvillas, opened 1 week ago, 2 comments)
#13 Which commit of lm-eval-harness is the lmeval branch based on? (condy0919, closed 3 days ago, 3 comments)
#12 An error occurred while using evaluate.load("act_match") (Felixvillas, closed 5 days ago, 1 comment)
#11 Which file do I need to run to obtain the results in Figure 4? (Felixvillas, closed 5 days ago, 2 comments)
#10 Evaluation not supported with ROCm (ym-guan, opened 3 weeks ago, 1 comment)
#9 Support for ChatGLM3 (redscv, opened 3 weeks ago, 1 comment)
#8 Provide an accuracy testing interface? (ascendpoet, closed 2 weeks ago, 1 comment)
#7 Discrepancy in reproduced results for LLaMA2 on the "qmsum" and "qasper" tasks (ilur98, closed 1 month ago, 2 comments)
#6 With or without weight quantization? (deephanson94, closed 2 weeks ago, 4 comments)
#5 [fix] add the missing comma in pyproject.toml to enable correct pip i… (wln20, closed 1 month ago, 1 comment)
#4 Integrate KIVI into inference frameworks? (andakai, closed 2 weeks ago, 1 comment)
#3 LlamaConfig.attention_dropout does not exist in transformers==4.35.2 (RalphMao, closed 1 month ago, 1 comment)
#2 Could you please open-source the code for calculating and visualizing the KV cache statistics? (wln20, closed 2 months ago, 3 comments)
#1 Can this be used with any autoregressive model? (hello-fri-end, closed 1 month ago, 1 comment)