mit-han-lab / smoothquant

[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
https://arxiv.org/abs/2211.10438
MIT License
1.26k stars, 150 forks

Inquiry about Int8 BMM overflow #84

Open luzai opened 7 months ago

luzai commented 7 months ago

Thank you for your great work and elegant idea! I'm wondering what happens if the Int8 BMM overflows: does the result wrap around or saturate?
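To make the two behaviours in question concrete, here is a minimal pure-Python sketch. It is not the kernel used in this repository. The helper names (`wrap_int8`, `saturate_int8`, `dot_wide`) are hypothetical, and it assumes the products are accumulated in a wider integer so that only the final narrowing to Int8 can overflow.

```python
INT8_MIN, INT8_MAX = -128, 127

def wrap_int8(x: int) -> int:
    """Two's-complement wraparound: keep the low 8 bits, reinterpret as signed."""
    return ((x - INT8_MIN) % 256) + INT8_MIN

def saturate_int8(x: int) -> int:
    """Saturation: clamp to the representable Int8 range."""
    return max(INT8_MIN, min(INT8_MAX, x))

def dot_wide(a, b) -> int:
    """Dot product accumulated in a wide integer (assumed, for illustration)."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical inputs chosen so the accumulated value exceeds the Int8 range.
a = [127, 127, 127, 127]
b = [127, 127, 127, 127]

acc = dot_wide(a, b)            # 4 * 127 * 127 = 64516
wrapped = wrap_int8(acc)        # low 8 bits of 64516, read as a signed value
saturated = saturate_int8(acc)  # clamped to INT8_MAX

print(f"accumulator={acc} wrapped={wrapped} saturated={saturated}")
```

With wrapping, the narrowed value bears no obvious relation to the true result, whereas saturation at least keeps the sign and the largest representable magnitude.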