microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

Support for ConvTranspose quantization? #8988

Open mikeseven opened 3 years ago

mikeseven commented 3 years ago

It seems ConvTranspose is missing from the quantized operators, both in the quantization tool and in the runtime.

Is it scheduled to be added? If so, when?

Thanks

yufenglee commented 3 years ago

@mikeseven, we don't currently have quantization support for ConvTranspose. What is the architecture of your model that uses ConvTranspose?

stale[bot] commented 2 years ago

This issue has been automatically marked as stale due to inactivity and will be closed in 7 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

ulfhanebutte commented 1 year ago

@yufenglee, since ConvTranspose is increasingly used as a way to "upsample" in image processing (think of hourglass-type networks with a ResNet backbone), the operator should be supported in the quantization flow. We took the latest onnxruntime version 1.13.1 and ran our small unit test (shown below as Netron visualizations), and we see clearly that ConvTranspose keeps its fp32 weights and the operation is not quantized. From the quantization perspective, ConvTranspose should not differ from a regular Conv, and the code changes to enable it should be reasonably small.

We also noticed that BatchNormalization is not fused with ConvTranspose, contrary to the Conv case where it is fused during the optimization step. We will open a separate issue about this.

[Netron screenshot: original ONNX graph]
[Netron screenshot: QDQ-instrumented graph]

Symmetric int8 quantization was performed to produce the QDQ ONNX file.
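For context, here is a minimal sketch of how such a symmetric int8 QDQ run can be reproduced with the onnxruntime.quantization tool. The model file names, input name, and input shape are placeholders, not taken from our actual unit test:

```python
import numpy as np
from onnxruntime.quantization import (
    CalibrationDataReader,
    QuantFormat,
    QuantType,
    quantize_static,
)

# Placeholder file names -- substitute the model from the reproducer zip.
MODEL_FP32 = "convtranspose_unit_test.onnx"
MODEL_QDQ = "convtranspose_unit_test_qdq.onnx"


class RandomCalibrationReader(CalibrationDataReader):
    """Feeds a few random tensors for calibration (placeholder input name/shape)."""

    def __init__(self, input_name="input", shape=(1, 3, 64, 64), num_samples=8):
        self._data = iter(
            [{input_name: np.random.rand(*shape).astype(np.float32)}
             for _ in range(num_samples)]
        )

    def get_next(self):
        # Return None when calibration data is exhausted.
        return next(self._data, None)


quantize_static(
    MODEL_FP32,
    MODEL_QDQ,
    calibration_data_reader=RandomCalibrationReader(),
    quant_format=QuantFormat.QDQ,
    activation_type=QuantType.QInt8,
    weight_type=QuantType.QInt8,
    # Request symmetric quantization for both activations and weights.
    extra_options={"ActivationSymmetric": True, "WeightSymmetric": True},
)
```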

ulfhanebutte commented 1 year ago

For the BatchNormalization fusion, the following issue has been opened: https://github.com/microsoft/onnxruntime/issues/14270

ulfhanebutte commented 1 year ago

convtrans_reproducer.zip convtrans_reproducer_part2.zip

The above zip files (split due to the 25 MB attachment limit) contain the original ONNX file, the script used to generate the QDQ file, and the resulting QDQ file.
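As a quick way to confirm the behaviour described above (not part of the reproducer scripts), the QDQ output can be inspected programmatically; the file name below is a placeholder for the QDQ file in the zip:

```python
import onnx

# Placeholder path -- point this at the QDQ file from the reproducer zip.
model = onnx.load("convtranspose_unit_test_qdq.onnx")

# Collect outputs of all DequantizeLinear nodes in the graph.
dq_outputs = {
    out
    for node in model.graph.node
    if node.op_type == "DequantizeLinear"
    for out in node.output
}

for node in model.graph.node:
    if node.op_type == "ConvTranspose":
        # If no inputs come from DequantizeLinear, the node was left in fp32.
        quantized_inputs = [inp for inp in node.input if inp in dq_outputs]
        print(f"{node.name}: {len(quantized_inputs)}/{len(node.input)} "
              "inputs fed by DequantizeLinear")
```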

ulfhanebutte commented 1 year ago

Hi @yufenglee, please let us know if you need any additional material to enable quantization of ConvTranspose. Thanks

Kentaro-Mikami commented 11 months ago

Hi @yufenglee, is there any update regarding ConvTranspose quantization support? We would also like to use ConvTranspose quantization. Thanks.

ulfhanebutte commented 10 months ago

Hi @chilo-ms and @yufenglee, can you add the "feature request" label to this issue and give an update on when the community can expect this to be supported? Thanks!