ModelTC / MQBench

Model Quantization Benchmark
Apache License 2.0

Large accuracy gap between the PoT-scale fake-quantized model and the deployed model #235

Closed wangshankun closed 1 year ago

wangshankun commented 1 year ago

Running the .pth model with the PyTorch runtime gives 79% top-1 accuracy, while the quantized onnx_qnn model run with ONNX Runtime gives only 65% top-1.

When I set PoT scale to false during post-training quantization, the exported onnx_qnn deployment model's ONNX Runtime results come very close to PyTorch's.
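For context on why this flag matters: a power-of-two (PoT) scale restricts the quantization step to the form 2^k, so the scale an observer picks gets snapped to the nearest power of two, coarsening the step (or clipping the range) relative to an unconstrained float scale. A minimal illustrative sketch in plain Python, not MQBench code; `pot_round` and `quant_error` are hypothetical helpers:

```python
import math

def pot_round(scale: float) -> float:
    # Snap a float scale to the nearest power of two in log2 space,
    # which is effectively what a pot_scale=True qscheme enforces.
    return 2.0 ** round(math.log2(scale))

def quant_error(x, scale, qmin=-128, qmax=127):
    # Mean absolute error of symmetric int8 fake quantization.
    total = 0.0
    for v in x:
        q = max(qmin, min(qmax, round(v / scale)))
        total += abs(q * scale - v)
    return total / len(x)

# Toy tensor spanning [-3, 3]; the MinMax scale for symmetric int8 is 3/127.
x = [i * 3.0 / 100 for i in range(-100, 101)]
s_float = 3.0 / 127          # ~0.0236
s_pot = pot_round(s_float)   # snaps to 2**-5 = 0.03125, a coarser step

print(quant_error(x, s_float), quant_error(x, s_pot))
```

On this toy data the snapped scale yields a larger mean error than the float scale; attributing the 79% → 65% top-1 drop would still require a per-layer sensitivity analysis on the real model.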

Tracin commented 1 year ago

Whether to enable PoT scale depends on the target deployment hardware.

wangshankun commented 1 year ago

> Whether to enable PoT scale depends on the target deployment hardware.

I verified this on our hardware simulator, and the hardware does support PoT scale. But even with ONNX Runtime on CPU, there shouldn't be such a large error, should there?

Tracin commented 1 year ago

Please provide more details of your experimental setup.

wangshankun commented 1 year ago

Sorry, my company has blocked cloud drives, so I can't share the model. It's just the plain ResNet-50 model with the onnx_qnn backend, with only PoT scale turned on:

    BackendType.ONNX_QNN:   dict(qtype='affine',     # noqa: E241
                                 w_qscheme=QuantizeScheme(symmetry=True, per_channel=False, pot_scale=True, bit=8),
                                 a_qscheme=QuantizeScheme(symmetry=True, per_channel=False, pot_scale=True, bit=8),
                                 default_weight_quantize=LearnableFakeQuantize,
                                 default_act_quantize=LearnableFakeQuantize,
                                 default_weight_observer=MinMaxObserver,
                                 default_act_observer=MinMaxObserver),
wangshankun commented 1 year ago

> Please provide more details of your experimental setup.

With PoT scale enabled, PTQ accuracy is poor. QAT is barely acceptable, and only if per-channel quantization is also enabled.
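A hedged sketch of what that per-channel QAT variant might look like, mirroring the backend params snippet in the earlier comment (same identifiers; this fragment is illustrative and not verified against a specific MQBench version):

```python
# Hypothetical per-channel variant of the ONNX_QNN params shown above.
# Weights are quantized per output channel; activations stay per-tensor,
# which is the usual hardware constraint.
BackendType.ONNX_QNN: dict(
    qtype='affine',
    w_qscheme=QuantizeScheme(symmetry=True, per_channel=True,   # per-channel weights
                             pot_scale=True, bit=8),
    a_qscheme=QuantizeScheme(symmetry=True, per_channel=False,  # per-tensor activations
                             pot_scale=True, bit=8),
    default_weight_quantize=LearnableFakeQuantize,  # learnable scales for QAT
    default_act_quantize=LearnableFakeQuantize,
    default_weight_observer=MinMaxObserver,
    default_act_observer=MinMaxObserver),
```

Per-channel weight scales give each output channel its own power-of-two step, which reduces the rounding penalty that a single snapped per-tensor scale imposes.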

github-actions[bot] commented 1 year ago

This issue has not received any updates in 120 days. Please reply to this issue if it is still unresolved!