Large accuracy gap between the fake-quantized model and the deployed model with PoT scale

ModelTC / MQBench

Model Quantization Benchmark

Apache License 2.0


Large accuracy gap between the fake-quantized model and the deployed model with PoT scale #235

Closed wangshankun closed 1 year ago

wangshankun commented 1 year ago

Running the pth model with the PyTorch runtime gives 79% top-1 accuracy, while the quantized onnx_qnn model run with ONNX Runtime gives only 65% top-1.

When I set PoT scale to false for post-training quantization, the exported onnx_qnn deployment model's ONNX Runtime results come out very close to the PyTorch results.

Tracin commented 1 year ago

Whether to enable PoT scale depends on the specific deployment hardware.

wangshankun commented 1 year ago

> Whether to enable PoT scale depends on the specific deployment hardware.

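I verified this on my hardware simulator, and the hardware does support PoT scale. Even so, running ONNX Runtime on CPU shouldn't produce an error this large, should it?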
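For readers unfamiliar with the term, here is a minimal pure-Python sketch of what a power-of-two (PoT) scale means for symmetric 8-bit quantization. It is not MQBench code: the helper names (`minmax_scale`, `to_pot`, `fake_quantize`) are hypothetical, and rounding the scale up to the next power of two is an assumption, since the thread does not say which rounding rule the library or the target hardware uses.

```python
import math

def minmax_scale(values, bit=8):
    """Symmetric per-tensor scale derived from the largest absolute value."""
    qmax = 2 ** (bit - 1) - 1
    return max(abs(v) for v in values) / qmax

def to_pot(scale):
    """Round a scale UP to the next power of two (assumed rounding rule)."""
    return 2.0 ** math.ceil(math.log2(scale))

def fake_quantize(values, scale, bit=8):
    """Quantize to signed integers, clamp, then dequantize back to float."""
    qmin, qmax = -(2 ** (bit - 1)), 2 ** (bit - 1) - 1
    return [max(qmin, min(qmax, round(v / scale))) * scale for v in values]

def mean_abs_error(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

values = [-1.9, -0.73, -0.02, 0.11, 0.58, 1.27]
float_scale = minmax_scale(values)   # unconstrained scale
pot_scale = to_pot(float_scale)      # scale constrained to 2**k

err_float = mean_abs_error(values, fake_quantize(values, float_scale))
err_pot = mean_abs_error(values, fake_quantize(values, pot_scale))
print(f"float scale={float_scale:.6f}  mean abs error={err_float:.6f}")
print(f"PoT scale  ={pot_scale:.6f}  mean abs error={err_pot:.6f}")
```

Under this rounding rule the PoT scale can be up to twice the unconstrained one, so the quantization step is coarser. The sketch only shows the constraint itself; it does not by itself explain the 79% versus 65% gap reported above.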

Tracin commented 1 year ago

Please provide more details about your experimental setup.

wangshankun commented 1 year ago

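Sorry, my company blocks cloud drives, so I can't share the model. It is just the plain ResNet-50 model with the onnx_qnn backend; the only change is that PoT scale is turned on: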

    BackendType.ONNX_QNN:   dict(qtype='affine',     # noqa: E241
                                 w_qscheme=QuantizeScheme(symmetry=True, per_channel=False, pot_scale=True, bit=8),
                                 a_qscheme=QuantizeScheme(symmetry=True, per_channel=False, pot_scale=True, bit=8),
                                 default_weight_quantize=LearnableFakeQuantize,
                                 default_act_quantize=LearnableFakeQuantize,
                                 default_weight_observer=MinMaxObserver,
                                 default_act_observer=MinMaxObserver),

wangshankun commented 1 year ago

> Please provide more details about your experimental setup.

With PoT scale enabled, PTQ results are poor. QAT is barely acceptable, and even then only with per-channel quantization enabled.
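One commonly cited reason per-channel quantization helps is that a single per-tensor scale is dictated by the widest channel, which leaves narrow channels with very few usable levels. The toy sketch below illustrates that effect with PoT scales. The weights are made-up numbers, the helper names are hypothetical, and the round-up PoT rule is an assumption; nothing here is taken from MQBench or from the reporter's model.

```python
import math

def pot_scale(values, bit=8):
    """Symmetric min-max scale rounded UP to a power of two (assumed rule)."""
    qmax = 2 ** (bit - 1) - 1
    scale = max(abs(v) for v in values) / qmax
    return 2.0 ** math.ceil(math.log2(scale))

def fake_quantize(values, scale, bit=8):
    """Quantize to signed integers, clamp, then dequantize back to float."""
    qmin, qmax = -(2 ** (bit - 1)), 2 ** (bit - 1) - 1
    return [max(qmin, min(qmax, round(v / scale))) * scale for v in values]

def mean_abs_error(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

# Toy weights: two output channels with very different value ranges.
weights = [
    [2.0, -1.1, 0.45],       # wide channel
    [0.02, 0.013, -0.007],   # narrow channel
]
flat = [w for channel in weights for w in channel]

# Per-tensor: one PoT scale shared by every channel.
tensor_scale = pot_scale(flat)
per_tensor = fake_quantize(flat, tensor_scale)

# Per-channel: each channel gets its own PoT scale.
per_channel = []
for channel in weights:
    per_channel.extend(fake_quantize(channel, pot_scale(channel)))

per_tensor_err = mean_abs_error(flat, per_tensor)
per_channel_err = mean_abs_error(flat, per_channel)
print(f"per-tensor  mean abs error: {per_tensor_err:.6f}")
print(f"per-channel mean abs error: {per_channel_err:.6f}")
```

In this toy case the shared scale is 2**-5, so two of the three narrow-channel weights collapse to zero, while the per-channel scale of 2**-12 keeps all of them distinct.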

github-actions[bot] commented 1 year ago

This issue has not received any updates in 120 days. Please reply to this issue if it is still unresolved!