jy-yuan / KIVI
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
https://arxiv.org/abs/2402.02750
MIT License · 121 stars · 10 forks
Issues
#16 CUDA version (hensiesp32, closed 3 days ago, 4 comments)
#15 Why is model inference slow when KIVI is applied to Mistral-7B-Instruct-v0.2? (lichongod, opened 6 days ago, 6 comments)
#14 Where is the falcon_kivi? (Felixvillas, opened 1 week ago, 2 comments)
#13 Which commit of lm-eval-harness is the lmeval branch based on? (condy0919, closed 3 days ago, 3 comments)
#12 An error occurred while using evaluate.load("act_match") (Felixvillas, closed 5 days ago, 1 comment)
#11 Which file do I need to run to obtain the results in Figure 4? (Felixvillas, closed 5 days ago, 2 comments)
#10 Evaluation not supported with ROCm (ym-guan, opened 3 weeks ago, 1 comment)
#9 Support for ChatGLM3 (redscv, opened 3 weeks ago, 1 comment)
#8 Provide an accuracy testing interface? (ascendpoet, closed 2 weeks ago, 1 comment)
#7 Discrepancy in reproduced results for LLaMA2 on the "qmsum" and "qasper" tasks (ilur98, closed 1 month ago, 2 comments)
#6 With or without weight quantization? (deephanson94, closed 2 weeks ago, 4 comments)
#5 [fix] add the missing comma in pyproject.toml to enable correct pip i… (wln20, closed 1 month ago, 1 comment)
#4 Integrate KIVI into inference frameworks? (andakai, closed 2 weeks ago, 1 comment)
#3 LlamaConfig.attention_dropout does not exist in transformers==4.35.2 (RalphMao, closed 1 month ago, 1 comment)
#2 Could you please open-source the code for calculating and visualizing the KV cache statistics? (wln20, closed 2 months ago, 3 comments)
#1 Can this be used with any autoregressive model? (hello-fri-end, closed 1 month ago, 1 comment)