mit-han-lab / lmquant
Apache License 2.0 · 102 stars · 5 forks
Issues
#18 will you support quantize the embedding layer and lm_head layer? · by geqian-9192, opened 1 week ago · 1 comment
#17 [Bug] RuntimeError: Boolean value of Tensor with more than one value is ambiguous · by ChenMnZ, opened 1 month ago · 2 comments
#16 GPTQ LLAMA 2 7B Question · by XiaohanFei, opened 1 month ago · 0 comments
#15 Why Rotate Again in main.py · by RanchiZhao, opened 1 month ago · 1 comment
#14 MLA supported · by RanchiZhao, opened 2 months ago · 0 comments
#13 Group shape error · by LuckyLYM, opened 2 months ago · 0 comments
#12 Questions about rotation · by Kyeong-Joong, opened 2 months ago · 0 comments
#11 The question about fusing smooth factor for the model used GQA/MQA. · by shhn1, opened 2 months ago · 0 comments
#10 How to rotate GQA with bias? · by mxjmtxrm, closed 3 months ago · 0 comments
#9 [Minor] Fix typo in installation guide · by ys-2020, closed 3 months ago · 0 comments
#8 Unable to reproduce RTN results in paper · by Golden-Wang, opened 3 months ago · 0 comments
#7 QoQ-g128 Llama3-70B-Instruct Results · by ethxnp, opened 3 months ago · 0 comments
#6 evaluate kv4 quantization accuracy · by SherrySwift, closed 3 months ago · 2 comments
#5 AssertionError: The smooth scale contains NaN. · by ethxnp, opened 3 months ago · 1 comment
#4 evaluate accuracy · by cyLi-Tiger, closed 3 months ago · 5 comments
#3 The smooth-attention scale of QoQ is not the same as described in the paper · by yanghaihui, closed 3 months ago · 0 comments
#2 Question about config file · by mxjmtxrm, closed 3 months ago · 16 comments
#1 [major] reorg model config · by synxlin, closed 4 months ago · 0 comments