microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

Question about quantization of batch normalization #9938

Open Kentaro-Mikami opened 2 years ago

Kentaro-Mikami commented 2 years ago

Hello, ONNX Runtime development team. We would like to ask a question about quantization of batch normalization. We are using ONNX Runtime 1.9.0 with static quantization.

If we use "Network A", the Batch Normalization layer is fused into the Convolution layer before quantization, and we can quantize the fused convolution layer. [Network A] Convolution - Batch Normalization - Relu

But if we use "Network B", the Batch Normalization layer is not fused into the Convolution layer before quantization. As a result, the parameters of batch normalization are not quantized. (We confirmed that the Batch Normalization parameters remain FP32 after quantization.) [Network B] Convolution - Relu - Batch Normalization
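The difference between the two networks can be illustrated with a small numerical sketch (plain Python, a scalar stand-in for a 1x1 convolution; all values are illustrative, not ONNX Runtime internals). Since an inference-time BN is affine, it folds directly into an adjacent conv, but a Relu in between blocks that fold:

```python
import math

# Hypothetical scalar parameters, for illustration only
w, b = 0.5, 0.1                    # conv weight and bias
gamma, beta = 1.2, -0.3            # BN scale and shift
mean, var, eps = 0.05, 0.8, 1e-5   # BN running statistics

def conv(x):
    return w * x + b

def bn(x):
    return gamma * (x - mean) / math.sqrt(var + eps) + beta

def relu(x):
    return max(0.0, x)

# Network A: Conv -> BN -> Relu. BN is affine, so it folds into the conv:
s = gamma / math.sqrt(var + eps)
w_fused = w * s                    # folded weight
b_fused = (b - mean) * s + beta    # folded bias

x = 0.7
assert abs(relu(bn(conv(x))) - relu(w_fused * x + b_fused)) < 1e-9

# Network B: Conv -> Relu -> BN. The nonlinearity sits between the two
# affine ops, so BN cannot be folded into the conv weights, and it is
# left as a separate (FP32) node after quantization.
print(bn(relu(conv(x))))
```

This is why the optimizer can absorb BN in Network A but leaves it standing, and unquantized, in Network B.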

Do you have any tips for quantizing a Batch Normalization layer? Or do you have plans to support quantization of the Batch Normalization layer?

Thank you for your support.

yufenglee commented 2 years ago

@Kentaro-Mikami, BN usually comes before Relu. Any reason to do activation before BN? As for quantization of BN, it is essentially a requantization. We will add it to our list to support. You're very welcome to contribute, too.
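The "essentially a requantization" point can be sketched as follows (a minimal sketch; the function name and int8 range handling are assumptions, not ONNX Runtime code). Because a standalone BN at inference time is just an affine transform, applying it to an already-quantized tensor amounts to re-expressing the values under a new scale and zero point:

```python
def requantize(q, in_scale, in_zp, out_scale, out_zp, qmin=-128, qmax=127):
    """Map a quantized value from (in_scale, in_zp) to (out_scale, out_zp).

    Hypothetical helper: an affine op such as inference-time BN can be
    absorbed into the output quantization parameters, so quantizing it
    reduces to a requantization step like this one.
    """
    real = (q - in_zp) * in_scale               # dequantize to a real value
    q_new = round(real / out_scale) + out_zp    # quantize with the new params
    return max(qmin, min(qmax, q_new))          # clamp to the int8 range

# Example: the same real value (1.28) expressed under different parameters
q2 = requantize(64, 0.02, 0, 0.04, 10)   # q2 == 42
# (q2 - 10) * 0.04 == 1.28, i.e. the real value is preserved
```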

Kentaro-Mikami commented 2 years ago

@yufenglee , Thank you for your answer. Yes, we also think that BN usually comes before Relu. But we sometimes find networks that use BN not only right after the convolution layer but also in other places (e.g. ResNet v2). And thank you very much for adding BN quantization to the support list.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale due to inactivity and will be closed in 7 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

giamic commented 2 years ago

Hi all, any news on the quantisation of batch norm?

HongzhengYang commented 3 months ago

Hi all, any news on the quantization of batch norm?