huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

ONNX export fails for TFSegformerForSemanticSegmentation #21642

Closed OutSorcerer closed 1 year ago

OutSorcerer commented 1 year ago

System Info

Additionally, the versions of some relevant packages are:

transformers @ git+https://github.com/huggingface/transformers@762dda44deed29baab049aac5324b49f134e7536
onnx==1.13.0
onnxruntime==1.14.0
tf2onnx==1.13.0

Who can help?

@gante, @Rocketknight1

Reproduction

  1. Run this notebook (https://github.com/deep-diver/segformer-tf-transformers/blob/main/notebooks/TFSegFormer_ONNX.ipynb) in Colab. ONNX export apparently worked there as of July 25, 2022, but it fails now.

    The error message is

ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.0.0/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.0.1/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.0.2/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.1.0/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.1.1/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.1.2/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.1.3/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.1.4/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.1.5/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.0/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.1/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.2/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.3/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.4/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.5/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.6/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.7/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.8/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.9/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.10/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.11/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.12/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.13/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.14/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.15/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.16/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.17/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.18/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.19/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.20/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.21/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.22/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.23/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.24/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.25/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.26/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.27/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.28/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.29/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.30/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.31/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.32/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.33/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.34/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.35/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.36/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.37/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.38/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.2.39/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.3.0/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.3.1/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.3.2/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Unsupported ops: Counter({'PartitionedCall': 52})
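
For a long log like this one, a small script can tally the unsupported ops per op type. This is a minimal sketch that assumes the log lines have exactly the shape shown above; the function name `count_unsupported_ops` and the shortened `sample_log` are illustrative and not part of tf2onnx.

```python
import re
from collections import Counter

# Matches lines of the form:
# ERROR:tf2onnx.tfonnx:Tensorflow op [<node name>: <op type>] is not supported
UNSUPPORTED = re.compile(r"Tensorflow op \[(?P<node>.+): (?P<op>\w+)\] is not supported")


def count_unsupported_ops(log_text: str) -> Counter:
    """Count unsupported TensorFlow ops by op type in a tf2onnx log."""
    counts = Counter()
    for line in log_text.splitlines():
        match = UNSUPPORTED.search(line)
        if match:
            counts[match.group("op")] += 1
    return counts


# Two lines copied from the log above, plus the summary line (which is skipped).
sample_log = """\
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.0.0/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.0.1/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
ERROR:tf2onnx.tfonnx:Unsupported ops: Counter({'PartitionedCall': 52})
"""

print(count_unsupported_ops(sample_log))
```

Run against the full log, the tally should agree with the `Unsupported ops` summary line that tf2onnx prints at the end.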

Expected behavior

ONNX export of TFSegformerForSemanticSegmentation works.

Rocketknight1 commented 1 year ago

cc @sayakpaul, any idea what the cause might be?

sayakpaul commented 1 year ago

Maybe this issue is better suited for https://github.com/onnx/tensorflow-onnx

Could you try downgrading the tf2onnx version to 1.11.1?

OutSorcerer commented 1 year ago

@sayakpaul

> Could you try downgrading the tf2onnx version to 1.11.1?

I tried this. Unfortunately, the export still fails with the error that PartitionedCall is not supported.

I also tried downgrading TensorFlow, and ONNX export worked with tensorflow==2.8.4. Here is an example: https://colab.research.google.com/gist/OutSorcerer/c8cd27a455091b57d9ea90ab3450035e/tfsegformer_onnx.ipynb

> Maybe this issue is better suited for https://github.com/onnx/tensorflow-onnx

There are already issues there about PartitionedCall support, e.g. https://github.com/onnx/tensorflow-onnx/issues/1864.

However, since export works with a previous version of TensorFlow, it seems that the PartitionedCall operation is not essential for the model to work. It is a low-level operation that TensorFlow adds automatically, so another workaround for new versions of TensorFlow could be to disable its insertion into the operation graph, but I was not able to quickly find a way to do that.

Also, regardless of the error messages printed, an ONNX file is still generated (which obviously fails at inference time), so yet another workaround could be to remove the PartitionedCall nodes from the ONNX file.

sayakpaul commented 1 year ago

Thanks for investigating. With your workaround, does the model work during inference as expected?

If so, I guess we can safely close the issue here?

OutSorcerer commented 1 year ago

> Thanks for investigating. With your workaround, does the model work during inference as expected?

Yes, I reran the cells that compare the outputs of the TF model and the ONNX model, and the outputs match.

> If so, I guess we can safely close the issue here?

Well, from my perspective, ideally one of the workarounds would be applied in transformers so that TFSegformerForSemanticSegmentation works with the most recent releases of TF and other packages. But I also understand that the tf2onnx developers will eventually have to do something about PartitionedCall export, which would solve this issue too.

OutSorcerer commented 1 year ago

In fact, PartitionedCall may not be the root cause of the problem.

I looked at the ONNX file produced with TF 2.11.0 in the notebook above by doing

import onnx

# onnx_model_path is the path of the ONNX file exported in the notebook above
onnx_model = onnx.load(onnx_model_path)
with open("model.txt", "w") as f:
    f.write(str(onnx_model))

It has the following node

node {
  input: "tf_segformer_for_semantic_segmentation/segformer/encoder/block.1.0/mlp/dwconv/Reshape:0"
  input: "tf_segformer_for_semantic_segmentation/segformer/encoder/block.1.0/mlp/dwconv/dwconv/ReadVariableOp:0"
  output: "tf_segformer_for_semantic_segmentation/segformer/encoder/block.1.0/mlp/dwconv/dwconv/PartitionedCall:0"
  name: "tf_segformer_for_semantic_segmentation/segformer/encoder/block.1.0/mlp/dwconv/dwconv/PartitionedCall"
  op_type: "PartitionedCall"
  ...
  attribute {
    name: "f"
    s: "__inference__jit_compiled_convolution_op_6171"
    type: STRING
  }
}

The issue is that the function __inference__jit_compiled_convolution_op_6171 is referenced, but its definition is nowhere to be found. So tf2onnx likely failed to convert that operation in the first place.
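
The same check can be run over the whole text dump instead of reading it by eye. The sketch below is a text-based heuristic that assumes the dump format shown above: it collects every call target stored in an attribute named `f` and reports those that never appear as a `name:` entry anywhere else in the dump. The helper `undefined_call_targets` and the shortened node name in `sample_dump` are illustrative, and whether a successfully converted function would show up under `name:` is an assumption.

```python
import re

# Matches the string attribute named "f", which holds the call target of a
# PartitionedCall node in the text dump (format assumed from the node above).
CALL_TARGET = re.compile(r'name:\s*"f"\s+s:\s*"([^"]+)"')


def undefined_call_targets(dump: str) -> list[str]:
    """Return call targets that are referenced but never appear as a `name:` entry."""
    targets = sorted(set(CALL_TARGET.findall(dump)))
    return [t for t in targets if f'name: "{t}"' not in dump]


# Shortened sample in the same shape as the node shown above
# (the node name is abbreviated for illustration).
sample_dump = '''
node {
  name: "block.1.0/mlp/dwconv/dwconv/PartitionedCall"
  op_type: "PartitionedCall"
  attribute {
    name: "f"
    s: "__inference__jit_compiled_convolution_op_6171"
    type: STRING
  }
}
'''

print(undefined_call_targets(sample_dump))
```

In practice the input would be the contents of model.txt, read with `open("model.txt").read()`.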

There was a similar issue, where one of the tf2onnx contributors said:

> StatefulPartitionedCall is an op that does a simple function call in TF. Our converter doesn't normally have to deal with it since the optimizer we run before conversion automatically inlines most function calls. If it shows up in the optimized graph there is usually some reason that will prevent conversion from working.

I created an issue with the details above in tf2onnx GitHub: https://github.com/onnx/tensorflow-onnx/issues/2127

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.