AniZpZ / AutoSmoothQuant
An easy-to-use package for implementing SmoothQuant for LLMs
MIT License · 82 stars · 7 forks
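For context on what the package implements: SmoothQuant migrates quantization difficulty from activations to weights by rescaling each input channel before int8 quantization. The sketch below is a minimal NumPy illustration of that core idea (per-channel scales s_j = max|X_j|^alpha / max|W_j|^(1-alpha), as in the SmoothQuant paper); the function name, toy statistics, and shapes are hypothetical, not AutoSmoothQuant's actual API.

```python
import numpy as np

def smooth_scales(act_absmax, weight_absmax, alpha=0.5):
    """Per-input-channel smoothing factors: s_j = max|X_j|^alpha / max|W_j|^(1-alpha)."""
    return np.power(act_absmax, alpha) / np.power(weight_absmax, 1.0 - alpha)

# Toy per-channel absolute maxima (hypothetical calibration statistics).
act_absmax = np.array([8.0, 0.5, 2.0])   # channel 0 is an activation outlier
w_absmax = np.array([0.5, 2.0, 1.0])

s = smooth_scales(act_absmax, w_absmax, alpha=0.5)

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3)) * act_absmax        # activations, shape (tokens, in_ch)
W = rng.standard_normal((3, 2)) * w_absmax[:, None]  # weights, shape (in_ch, out_ch)

# Migrate difficulty: divide activation channels by s, absorb s into the weights.
X_smooth = X / s
W_smooth = W * s[:, None]

# The matmul result is mathematically unchanged, but X_smooth has
# flattened outlier channels and is easier to quantize to int8.
assert np.allclose(X @ W, X_smooth @ W_smooth)
```

The `alpha` knob trades how much of the outlier magnitude moves into the weights; issue #12 below ("add smooth strength param") appears to expose exactly this kind of strength parameter.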
Issues
#26 Can we use W8A8B8O8Linear in LLaMA model? · peilin-chen · opened 4 weeks ago · 0 comments
#25 smoothquant fp8 support · huangtingwei9988 · opened 2 months ago · 0 comments
#24 Inference error after quantizing llama2-7b-chat · AlexMa0 · opened 3 months ago · 5 comments
#23 Is int4 quantization supported? · AlexMa0 · opened 4 months ago · 1 comment
#22 Smoothquant for Phi2 and Qwen2 · ponytaill · opened 5 months ago · 1 comment
#21 Smoothquant for Phi2 · ponytaill · closed 5 months ago · 5 comments
#20 When quantizing baichuan2-7B, I encountered an error: AttributeError: 'RMSNorm' object has no attribute 'epsilon'. · Flyipig · opened 6 months ago · 0 comments
#19 baichuan2 inference with vllm · AGI-player · closed 6 months ago · 2 comments
#18 add model evaluation · AniZpZ · closed 7 months ago · 0 comments
#17 Maybe some problem you need to take care · DonliFly · closed 7 months ago · 1 comment
#16 name 'position_ids' is not defined · LMX-xin · opened 7 months ago · 1 comment
#15 libcudart.so.11.0 error · huangtingwei9988 · closed 5 months ago · 1 comment
#14 Tensor shape error when loading quant Llama-2-70B · MingLin-home · opened 8 months ago · 3 comments
#13 Any plan for supporting Mistral 7B model? · leocnj · opened 8 months ago · 1 comment
#12 add smooth strength param · AniZpZ · closed 8 months ago · 0 comments
#11 Cannot quantize to int8 - torch TypeError · AlpinDale · opened 9 months ago · 2 comments
#10 Difference between W8A8BFP32OFP32LinearWithQuantScale and W8A8BFP32OFP32Linear · Hongbosherlock · opened 9 months ago · 2 comments
#9 Question about per-token quant · Hongbosherlock · opened 9 months ago · 2 comments
#8 Update README for inference · AniZpZ · closed 9 months ago · 0 comments
#7 Support Mixtral and Baichuan 7B · AniZpZ · closed 9 months ago · 0 comments
#6 Is torch-int still required when using this library? · xyfZzz · closed 9 months ago · 2 comments
#5 What is the format of the calibration set? · B-201 · closed 9 months ago · 1 comment
#4 Inference speed after quantization · jundolc · opened 9 months ago · 7 comments
#3 Update README.md · AniZpZ · closed 9 months ago · 0 comments
#2 fix llama2 bug · HandH1998 · closed 9 months ago · 0 comments
#1 Add examples · AniZpZ · closed 9 months ago · 0 comments