whyNLP / LCKV
Layer-Condensed KV Cache: 10× larger batch sizes with fewer parameters and less computation, yielding a dramatic speedup with better task performance. Accepted to ACL 2024.
https://arxiv.org/abs/2405.10637
139 stars · 6 forks
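The core idea named in the description is that decoder layers share a KV cache rather than each layer keeping its own. The snippet below is a minimal, hypothetical sketch of cross-layer KV sharing in PyTorch, not the repo's actual implementation; all shapes and names are illustrative assumptions. See the paper for the real LCKV architecture and training scheme.

```python
# Hypothetical sketch (not the repo's code): every decoder layer attends
# to ONE shared KV cache instead of keeping its own, so cache memory
# shrinks by roughly a factor of num_layers -- the source of the larger
# batch sizes the description mentions.
import torch
import torch.nn.functional as F

def shared_kv_attention(q, shared_k, shared_v):
    # Standard scaled dot-product attention; the keys/values come from a
    # cache reused by all layers. Shapes: (batch, heads, len, head_dim).
    return F.scaled_dot_product_attention(q, shared_k, shared_v)

batch, heads, kv_len, head_dim, num_layers = 2, 8, 128, 64, 22
shared_k = torch.randn(batch, heads, kv_len, head_dim)  # one cache ...
shared_v = torch.randn(batch, heads, kv_len, head_dim)  # ... for all layers

# Decoding one token: each layer computes its own queries but reads the
# same shared keys and values.
outputs = [
    shared_kv_attention(torch.randn(batch, heads, 1, head_dim),
                        shared_k, shared_v)
    for _ in range(num_layers)
]
print(outputs[0].shape)  # torch.Size([2, 8, 1, 64])
```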
Issues
#12 · Question about supporting other settings of cross-layer KV cache sharing · by ChenHong30 · opened 1 week ago · 3 comments
#11 · Guidance on Fine-Tuning Llama in LCKV Framework · by Mostafa-Emad77 · opened 1 week ago · 7 comments
#10 · What specific Python version does this project run on? · by ChenHong30 · closed 1 week ago · 14 comments
#9 · SDPA not implemented error · by SpoSer23 · closed 2 weeks ago · 9 comments
#8 · Merge code for better ways of initializing weights · by why-in-Shanghaitech · closed 1 month ago · 0 comments
#7 · Merge the code for grouping · by why-in-Shanghaitech · closed 1 month ago · 0 comments
#6 · Update new work · by why-in-Shanghaitech · closed 1 month ago · 0 comments
#5 · Inquiry Regarding Generated Model Text Output · by alvi75 · closed 4 months ago · 1 comment
#4 · Support gradient checkpointing for lckv · by why-in-Shanghaitech · opened 5 months ago · 0 comments
#3 · Question about gradient checkpointing? · by 311dada · closed 5 months ago · 2 comments
#2 · Support CLA · by why-in-Shanghaitech · closed 6 months ago · 0 comments
#1 · feat: support dialog attention · by why-in-Shanghaitech · closed 6 months ago · 0 comments