Aaronhuang-778/BiLLM
(ICML 2024) BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
https://arxiv.org/abs/2402.04291
MIT License · 154 stars · 12 forks
Issues
#15 · scale factor and bit storing calculation · kaizizzzzzz · opened 3 weeks ago · 1 comment
#14 · Issue with code replication · Devy99 · opened 3 weeks ago · 2 comments
#13 · Question about 1-bit compression with combined binary masks · pprp · opened 1 month ago · 0 comments
#12 · Any Plan for Multi-GPUs Support? · pprp · opened 2 months ago · 0 comments
#11 · question about weight storage when infer · DamonsJ · opened 2 months ago · 0 comments
#10 · Looking forward to supporting Mixtral_8x7b MoE · Gierry · opened 3 months ago · 1 comment
#9 · inference · shyget · opened 3 months ago · 0 comments
#8 · fix requirements.txt · earsaxcs · closed 4 months ago · 0 comments
#7 · Inference · diff7 · opened 4 months ago · 0 comments
#6 · I have a question about the paper. · shampooooo · closed 4 months ago · 3 comments
#5 · Possible Error in the Paper · shampooooo · closed 4 months ago · 0 comments
#4 · Do you quantize the LM head, embedding, and layernorms or just the weights? · tsengalb99 · opened 4 months ago · 0 comments
#3 · Update README.md · eltociear · closed 4 months ago · 1 comment
#2 · Model weight access? · BarfingLemurs · closed 4 months ago · 1 comment
#1 · Request: please consider evaluating pareto-optimality of BiLLM · justheuristic · opened 4 months ago · 3 comments