mit-han-lab / smoothquant

[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
https://arxiv.org/abs/2211.10438
MIT License

Setting quantize_output=True makes accuracy drop to 0 #72

Open lonleyodd opened 5 months ago

lonleyodd commented 5 months ago

Hi, thanks for your nice work. I tried setting the parameter quantize_output to True when quantizing the model, as in the following signature: def from_float(module, weight_quant='per_channel', act_quant='per_token', quantize_output=True), but the accuracy drops to 0. Is there something wrong with my setup, or does the SmoothQuant method not support quantizing the output?
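To make the question concrete, here is a minimal sketch of what I assume quantize_output=True does to a layer's output with act_quant='per_token': symmetric absmax fake quantization applied row by row. The function name fake_quant_per_token and the pure-Python form are hypothetical and only for illustration; this is not the repository's code, and I have not confirmed that it is the cause of the accuracy drop.

```python
def fake_quant_per_token(rows, n_bits=8):
    """Quantize-dequantize each row (token) with its own absmax scale.

    Hypothetical helper for illustration; assumes symmetric signed integers.
    """
    q_max = 2 ** (n_bits - 1) - 1  # 127 for 8 bits
    out = []
    for row in rows:
        # One scale per token, clamped so an all-zero row does not divide by zero.
        scale = max(max(abs(v) for v in row), 1e-5) / q_max
        # Round to the integer grid, then map back to floating point.
        out.append([round(v / scale) * scale for v in row])
    return out


# A token whose output contains one large outlier: the step size becomes
# about 0.79, so the small entries are rounded away to zero.
outlier_row = [[100.0, 0.3, -0.2]]
dequantized = fake_quant_per_token(outlier_row)
```

In this toy case the two small values are lost entirely after quantization, which is the kind of effect I suspect might be happening on the layer outputs, although I have not verified that on the real model.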