Mypathissional opened this issue 2 years ago
The fusion logic for Conv and BatchNormalization is done in the back-to-back optimizer. Could you check whether your model conversion process passes through the code below? https://github.com/onnx/tensorflow-onnx/blob/c67bcfb580be741ece8d9978a9b57bd2ce7367ee/tf2onnx/optimizer/back_to_back_optimizer.py#L191
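For context, that pass does the standard folding of the BatchNormalization parameters into the Conv weights and bias. A rough numpy sketch of the arithmetic (my own sketch, assuming the ONNX Conv weight layout with output channels first, not the actual tf2onnx code behind the link):

import numpy as np

def fold_bn_into_conv(w, b, gamma, beta, mean, var, eps=1e-5):
    # w: [out_channels, in_channels, kH, kW]  (ONNX Conv layout assumed)
    # b: [out_channels] conv bias; use zeros if the conv had no bias
    scale = gamma / np.sqrt(var + eps)          # one factor per output channel
    w_folded = w * scale[:, None, None, None]   # scale each output-channel filter
    b_folded = (b - mean) * scale + beta        # fold BN mean/offset into the bias
    return w_folded, b_folded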
@hwangdeyu After I turned off the back-to-back optimizer at the beginning of the init file, I added a print statement both at the start of _optimize_conv_batchnorm_fusion(g, node, consumer_nodes) and in optimize_graph(graph, catch_errors=True, optimizers=None) in the optimizer init file. It enters optimize_graph but not _optimize_conv_batchnorm_fusion, yet some kind of fusion is still happening, because the node name is changed.
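To double-check whether the fusion actually happened, I can also inspect the exported ONNX file directly (my own quick check, not part of tf2onnx; "model.onnx" is a placeholder for the converted file):

import onnx
from collections import Counter

m = onnx.load("model.onnx")  # placeholder path for the converted model
print(Counter(node.op_type for node in m.graph.node))
print([node.name for node in m.graph.node
       if node.op_type in ("Conv", "BatchNormalization")])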
I guess the problem might not be in the fusing but in the type of batch normalization layer used, which is SyncBatchNormalization. I have prepared a minimal example for it. For the code below, the exported model does not contain the batchnorms:
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    net = tf.keras.Sequential()
    net.add(tf.keras.layers.Conv2D(2, 3))
    net.add(tf.keras.layers.experimental.SyncBatchNormalization())
    net.build((1, 30, 30, 2))
    net.save("~/Desktop/conv_block")
Could this be the problem?
Hi @Mypathissional , I think this is expected behavior for tensorflow-onnx. When I run the convert script, there is no BatchNormalization op at all, even before the optimizers run.
optimizer before: Counter({'Identity': 7, 'Const': 2, 'Transpose': 2, 'Placeholder': 1, 'Conv': 1, 'Mul': 1})
optimizer after: Counter({'Transpose': 2, 'Placeholder': 1, 'Const': 1, 'Conv': 1})
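For reference, a conversion along these lines (a sketch of what I ran, with the path taken from the example above) should surface those before/after counters in the optimizer log:

import logging
import os
import tensorflow as tf
import tf2onnx

logging.basicConfig(level=logging.INFO)  # raise log verbosity; the optimizer reports its op counters in the log

model = tf.keras.models.load_model(os.path.expanduser("~/Desktop/conv_block"))
spec = (tf.TensorSpec((1, 30, 30, 2), tf.float32, name="input"),)
onnx_model, _ = tf2onnx.convert.from_keras(
    model, input_signature=spec, opset=15, output_path="conv_block.onnx")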
However, if we change tf.keras.layers.experimental.SyncBatchNormalization() to tf.keras.layers.BatchNormalization(), the op does show up.
optimizer before: Counter({'Identity': 6, 'Const': 5, 'Transpose': 4, 'Placeholder': 1, 'Conv': 1, 'BatchNormalization': 1})
@hwangdeyu Could you tell me, just for my understanding, what happens when an operation is present in the saved model but has no corresponding ONNX operation? Is that operation just skipped?
I don't know how experimental.SyncBatchNormalization() is implemented in depth. From what I've seen so far, the op is not present in the saved model either.
There is a FusedBatchNormV3 in the tf.keras.layers.BatchNormalization() saved model ops:
['Placeholder', 'Const', 'Const', 'Const', 'Const', 'Const', 'Const', 'Identity', 'NoOp', 'NoOp', 'Conv2D', 'NoOp', 'Identity', 'FusedBatchNormV3', 'Identity', 'Identity', 'Identity']
But this op is missing in the tf.keras.layers.experimental.SyncBatchNormalization() saved model ops:
['Placeholder', 'Const', 'Const', 'Const', 'Const', 'Const', 'Const', 'Const', 'Const', 'Identity', 'NoOp', 'NoOp', 'Conv2D', 'NoOp', 'Identity', 'Mul', 'Identity', 'Identity', 'Identity', 'Identity']
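In case it's useful, this is roughly how one can dump the raw ops from a SavedModel to compare the two cases; it parses saved_model.pb directly and also walks the function library, since Keras puts most ops inside traced functions. The exact list can look slightly different depending on how the graph was frozen.

import os
from tensorflow.core.protobuf import saved_model_pb2

path = os.path.expanduser("~/Desktop/conv_block/saved_model.pb")
sm = saved_model_pb2.SavedModel()
with open(path, "rb") as f:
    sm.ParseFromString(f.read())

graph_def = sm.meta_graphs[0].graph_def
ops = [n.op for n in graph_def.node]
for func in graph_def.library.function:   # Keras traces live in the function library
    ops.extend(n.op for n in func.node_def)
print(ops)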
Describe the bug
Hi, I was converting CenterNet (CenterNet HourGlass104 512x512) from the TensorFlow Object Detection API (https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md) with the back-to-back optimizer turned off, to disable the batchnorm fusion into conv layers, following https://github.com/onnx/tensorflow-onnx/issues/1702. The problem is that even though the back-to-back optimizer is turned off, the convolutions and batchnorms are still fused together. Where else can this optimization occur? Using tensorflow=2.8.0, onnx=1.11.0, tf2onnx=1.9.3/1190aa and opset 15.
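For what it's worth, I turned the pass off by editing the optimizer init file. A runtime alternative would be something like the sketch below, which assumes the registry in tf2onnx/optimizer/__init__.py is a module-level mapping named _optimizers from optimizer name to class (please verify against your installed tf2onnx version):

from tf2onnx import optimizer
from tf2onnx.optimizer.back_to_back_optimizer import BackToBackOptimizer

# Drop the back-to-back pass from the default registry before converting.
# The _optimizers name and the BackToBackOptimizer class are assumptions
# based on the repo layout linked above.
for name, cls in list(optimizer._optimizers.items()):
    if cls is BackToBackOptimizer:
        del optimizer._optimizers[name]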