AniZpZ / AutoSmoothQuant
An easy-to-use package for implementing SmoothQuant for LLMs
MIT License · 82 stars · 7 forks
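For context on what the package implements: SmoothQuant migrates quantization difficulty from activations to weights by rescaling each input channel before int8 quantization. The sketch below is a minimal NumPy illustration of that core idea (per-channel scales s_j = max|X_j|^alpha / max|W_j|^(1-alpha), as in the SmoothQuant paper); the function name, toy statistics, and shapes are hypothetical, not AutoSmoothQuant's actual API.

```python
import numpy as np

def smooth_scales(act_absmax, weight_absmax, alpha=0.5):
    """Per-input-channel smoothing factors: s_j = max|X_j|^alpha / max|W_j|^(1-alpha)."""
    return np.power(act_absmax, alpha) / np.power(weight_absmax, 1.0 - alpha)

# Toy per-channel absolute maxima (hypothetical calibration statistics).
act_absmax = np.array([8.0, 0.5, 2.0])   # channel 0 is an activation outlier
w_absmax = np.array([0.5, 2.0, 1.0])

s = smooth_scales(act_absmax, w_absmax, alpha=0.5)

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3)) * act_absmax        # activations, shape (tokens, in_ch)
W = rng.standard_normal((3, 2)) * w_absmax[:, None]  # weights, shape (in_ch, out_ch)

# Migrate difficulty: divide activation channels by s, absorb s into the weights.
X_smooth = X / s
W_smooth = W * s[:, None]

# The matmul result is mathematically unchanged, but X_smooth has
# flattened outlier channels and is easier to quantize to int8.
assert np.allclose(X @ W, X_smooth @ W_smooth)
```

The `alpha` knob trades how much of the outlier magnitude moves into the weights; issue #12 below ("add smooth strength param") appears to expose exactly this kind of strength parameter.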
Issues
#26 Can we use W8A8B8O8Linear in LLaMA model? · peilin-chen · opened 4 weeks ago · 0 comments
#25 smoothquant fp8 support · huangtingwei9988 · opened 2 months ago · 0 comments
#24 Inference error after quantizing llama2-7b-chat · AlexMa0 · opened 3 months ago · 5 comments
#23 Is int4 quantization supported? · AlexMa0 · opened 4 months ago · 1 comment
#22 Smoothquant for Phi2 and Qwen2 · ponytaill · opened 5 months ago · 1 comment
#21 Smoothquant for Phi2 · ponytaill · closed 5 months ago · 5 comments
#20 When quantizing baichuan2-7B, I encountered an error: AttributeError: 'RMSNorm' object has no attribute 'epsilon'. · Flyipig · opened 6 months ago · 0 comments
#19 baichuan2 inference with vllm · AGI-player · closed 6 months ago · 2 comments
#18 add model evaluation · AniZpZ · closed 7 months ago · 0 comments
#17 Maybe some problem you need to take care · DonliFly · closed 7 months ago · 1 comment
#16 name 'position_ids' is not defined · LMX-xin · opened 7 months ago · 1 comment
#15 libcudart.so.11.0 error · huangtingwei9988 · closed 5 months ago · 1 comment
#14 Tensor shape error when loading quant Llama-2-70B · MingLin-home · opened 8 months ago · 3 comments
#13 Any plan for supporting Mistral 7B model? · leocnj · opened 8 months ago · 1 comment
#12 add smooth strength param · AniZpZ · closed 8 months ago · 0 comments
#11 Cannot quantize to int8 - torch TypeError · AlpinDale · opened 9 months ago · 2 comments
#10 Difference between W8A8BFP32OFP32LinearWithQuantScale and W8A8BFP32OFP32Linear · Hongbosherlock · opened 9 months ago · 2 comments
#9 Question about per-token quant · Hongbosherlock · opened 9 months ago · 2 comments
#8 Update README for inference · AniZpZ · closed 9 months ago · 0 comments
#7 Support Mixtral and Baichuan 7B · AniZpZ · closed 9 months ago · 0 comments
#6 Is torch-int still required when using this library? · xyfZzz · closed 9 months ago · 2 comments
#5 What is the format of the calibration set? · B-201 · closed 9 months ago · 1 comment
#4 Inference speed after quantization · jundolc · opened 9 months ago · 7 comments
#3 Update README.md · AniZpZ · closed 9 months ago · 0 comments
#2 fix llama2 bug · HandH1998 · closed 9 months ago · 0 comments
#1 Add examples · AniZpZ · closed 9 months ago · 0 comments