CryVeck/QuaRot
Code for the NeurIPS 2024 paper: QuaRot, end-to-end 4-bit inference of large language models.
https://arxiv.org/abs/2404.00456
Apache License 2.0
Fix token_wise rotation size when it differs from the value size
#3
Closed
CryVeck closed 1 day ago
CryVeck commented 1 day ago
Use the correct hidden size for the k projection, so the token-wise rotation is sized correctly when it differs from the value size.
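For readers unfamiliar with the issue, here is a minimal sketch of the general idea. It is not the code from this PR. It assumes the rotation size should be read from the k projection's own output width rather than borrowed from the value projection. `ProjShape`, `rotation_size`, and all of the numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjShape:
    """Hypothetical (out_features, in_features) shape of a linear projection."""
    out_features: int
    in_features: int


def rotation_size(proj: ProjShape, num_heads: int) -> int:
    """Per-head rotation size derived from the projection's own output width."""
    if proj.out_features % num_heads != 0:
        raise ValueError("output width must divide evenly across heads")
    return proj.out_features // num_heads


# Illustrative numbers only: k and v projections with different widths.
k_proj = ProjShape(out_features=1536, in_features=4096)
v_proj = ProjShape(out_features=1024, in_features=4096)
num_kv_heads = 8

k_rot = rotation_size(k_proj, num_kv_heads)  # 192
v_rot = rotation_size(v_proj, num_kv_heads)  # 128
```

With these assumed shapes, reusing `v_rot` for the k projection would give a rotation of the wrong size (128 instead of 192), which is the kind of mismatch the title describes.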