microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

Improve Range update for Relu/Clip #21251

Closed · f2013519 closed this 1 month ago

f2013519 commented 3 months ago

How do we specify fused operator patterns like (Conv + Relu) in the quantization config? I see such options are available in PyTorch, but not in ONNX Runtime's static quantization (quantize_static).

Right now I see different scales at the outputs of Conv and Relu, which is not suitable for us, as it will require an additional requantize step.
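
For context, the quantization call looks roughly like the following. This is only a minimal sketch: the model path, input name, shapes, and the calibration reader are illustrative placeholders, not our actual setup.

```python
import numpy as np
from onnxruntime.quantization import (
    CalibrationDataReader,
    QuantFormat,
    QuantType,
    quantize_static,
)

class RandomDataReader(CalibrationDataReader):
    """Feeds a few random batches for calibration (placeholder for real data)."""
    def __init__(self, samples):
        self._it = iter(samples)

    def get_next(self):
        return next(self._it, None)

# "input" is an assumed model input name; shapes are illustrative.
samples = [{"input": np.random.rand(1, 3, 224, 224).astype(np.float32)} for _ in range(8)]

quantize_static(
    "model.onnx",        # illustrative paths
    "model.quant.onnx",
    RandomDataReader(samples),
    quant_format=QuantFormat.QDQ,
    activation_type=QuantType.QInt8,
    weight_type=QuantType.QInt8,
)
# The resulting QDQ graph carries separate scale/zero-point pairs for the Conv
# output and the Relu output, which is where the extra requantize comes from.
```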

Thanks!

xadupre commented 3 months ago

If you need to fuse operators in a custom way, you can use this tool: https://onnxscript.ai/tutorial/rewriter/rewrite_patterns.html (you should install the development version).
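
For reference, the rewriter in that tutorial works from a source pattern and a replacement pattern. A rough sketch of what a Conv+Relu fusion rule could look like is below; the exact API may differ in the development version, and targeting the com.microsoft FusedConv op here is only an illustration, not a recommendation.

```python
import onnx
from onnxscript import rewriter
from onnxscript.rewriter import pattern

def conv_relu_pattern(op, x, w, b):
    # Match Conv (with bias) followed immediately by Relu.
    return op.Relu(op.Conv(x, w, b))

def conv_relu_replacement(op, x, w, b):
    # Replace the pair with a single fused node; FusedConv with an
    # "activation" attribute lives in the com.microsoft domain.
    return op.FusedConv(x, w, b, activation="Relu", _domain="com.microsoft")

rule = pattern.RewriteRule(conv_relu_pattern, conv_relu_replacement)

model = onnx.load("model.onnx")  # illustrative path
rewritten = rewriter.rewrite(model, pattern_rewrite_rules=[rule])
onnx.save(rewritten, "model.fused.onnx")
```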

f2013519 commented 2 months ago

I do not necessarily need a custom op, but rather a way to specify that Conv and Relu should not have different scales, as this can introduce noise.

Although it would be good to have a standard fused op like Conv-Relu. Any reason this is not supported yet?

f2013519 commented 2 months ago

Upon further investigation, the issue is for symmetric quantization only.

For some reason, the current code does not assign the output range of Clip/Relu back to its input. We do not need the full input range, since anything outside the Clip/Relu output range is discarded by the Clip/Relu layer anyway.

This limitation causes issues for backends that support fused Conv-Relu with symmetric activations.
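
To make the scale mismatch concrete, here is a toy sketch with made-up calibration ranges for symmetric int8 activations:

```python
def symmetric_scale(rmin, rmax, qmax=127):
    # Symmetric int8: the scale is set by the largest magnitude in the range.
    return max(abs(rmin), abs(rmax)) / qmax

conv_out = (-8.0, 5.0)   # hypothetical calibrated range of the Conv output
relu_out = (0.0, 5.0)    # the same tensor after Relu

print(symmetric_scale(*conv_out))  # ~0.0630, scale currently given to the Conv output
print(symmetric_scale(*relu_out))  # ~0.0394, scale given to the Relu output

# With two different scales, a fused Conv+Relu kernel has to requantize its
# result. If the Relu output range is assigned back to the Conv output (the
# negative part is discarded by Relu anyway), both tensors share the ~0.0394
# scale and the extra requantize step disappears.
```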

This should be fixed by my PR #21573.

Please review, @xadupre @yufenglee, and let me know if this makes sense.

Thanks!

f2013519 commented 1 month ago

Closed by #21573