onnx / tensorflow-onnx

Convert TensorFlow, Keras, Tensorflow.js and Tflite models to ONNX
Apache License 2.0

Cannot export SegFormer from huggingface/transformers with TF >= 2.9.0 #2127


OutSorcerer commented 1 year ago

Exporting TFSegformerForSemanticSegmentation from huggingface/transformers to ONNX used to work with TF 2.8.4; a notebook that reproduces the successful export: https://colab.research.google.com/gist/OutSorcerer/c8cd27a455091b57d9ea90ab3450035e/tfsegformer_onnx.ipynb

With TF >= 2.9.0 the export fails; a notebook that reproduces the problem: https://colab.research.google.com/gist/OutSorcerer/ebc93cd734ecc0e1dee96c8d20e5e9d5/tfsegformer_onnx.ipynb

The error message is:

ERROR:tf2onnx.tfonnx:Tensorflow op [tf_segformer_for_semantic_segmentation/segformer/encoder/block.0.0/mlp/dwconv/dwconv/PartitionedCall: PartitionedCall] is not supported
...
ERROR:tf2onnx.tfonnx:Unsupported ops: Counter({'PartitionedCall': 8})

However, an ONNX file is still produced, which fails at inference time.

---------------------------------------------------------------------------
InvalidGraph                              Traceback (most recent call last)
<ipython-input-32-b73cb7010801> in <module>
----> 1 sess = ort.InferenceSession(onnx_model_path)
      2 ort_outputs = sess.run(None, {"pixel_values": dummy_inputs_numpy})

/usr/local/lib/python3.8/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py in __init__(self, path_or_bytes, sess_options, providers, provider_options, **kwargs)
    358 
    359         try:
--> 360             self._create_inference_session(providers, provider_options, disabled_optimizers)
    361         except ValueError:
    362             if self._enable_fallback:

/usr/local/lib/python3.8/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py in _create_inference_session(self, providers, provider_options, disabled_optimizers)
    395         session_options = self._sess_options if self._sess_options else C.get_default_session_options()
    396         if self._model_path:
--> 397             sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
    398         else:
    399             sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)

InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from mit-b0.onnx failed:This is an invalid model. In Node, ("tf_segformer_for_semantic_segmentation/segformer/encoder/block.0.0/mlp/dwconv/dwconv/PartitionedCall", PartitionedCall, "", -1) : ("tf_segformer_for_semantic_segmentation/segformer/encoder/block.0.0/mlp/dwconv/Reshape:0": tensor(float),"tf_segformer_for_semantic_segmentation/segformer/encoder/block.0.0/mlp/dwconv/dwconv/ReadVariableOp:0": tensor(float),) -> ("tf_segformer_for_semantic_segmentation/segformer/encoder/block.0.0/mlp/dwconv/dwconv/PartitionedCall:0",) , Error No Op registered for PartitionedCall with domain_version of 17

When the model is printed, it contains a node like this:

node {
    input: "tf_segformer_for_semantic_segmentation/segformer/encoder/block.1.0/mlp/dwconv/Reshape:0"
    input: "tf_segformer_for_semantic_segmentation/segformer/encoder/block.1.0/mlp/dwconv/dwconv/ReadVariableOp:0"
    output: "tf_segformer_for_semantic_segmentation/segformer/encoder/block.1.0/mlp/dwconv/dwconv/PartitionedCall:0"
    name: "tf_segformer_for_semantic_segmentation/segformer/encoder/block.1.0/mlp/dwconv/dwconv/PartitionedCall"
    op_type: "PartitionedCall"
    ...
    attribute {
      name: "f"
      s: "__inference__jit_compiled_convolution_op_6171"
      type: STRING
    }
  }

The issue is that the function __inference__jit_compiled_convolution_op_6171 is referenced by the PartitionedCall node, but its definition is nowhere to be found in the exported graph, so it is likely that tf2onnx failed to convert __inference__jit_compiled_convolution_op_6171 in the first place.

edumotya commented 1 year ago

The problem is the grouped convolutions:

https://github.com/huggingface/transformers/blob/a9eee2ffecc874df7dd635b2c6abb246fdb318cc/src/transformers/models/segformer/modeling_tf_segformer.py#L242-L244

The model exports without any errors when standard convolutions (groups=1) are used.
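The groups=1 observation makes sense mathematically: a grouped convolution is equivalent to a standard convolution whose kernel is block-diagonal across channel groups, so in principle the op can always be rewritten in terms of supported standard convolutions. A minimal NumPy sketch of that equivalence (all function names are mine, purely illustrative; kernel layout follows the Keras `(kh, kw, Cin // groups, Cout)` convention):

```python
import numpy as np

def conv2d(x, w):
    """Naive 'valid' 2-D convolution. x: (H, W, Cin), w: (kh, kw, Cin, Cout)."""
    kh, kw, cin, cout = w.shape
    H, W, _ = x.shape
    out = np.zeros((H - kh + 1, W - kw + 1, cout))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # contract the (kh, kw, Cin) patch against every output filter
            out[i, j] = np.tensordot(x[i:i + kh, j:j + kw], w, axes=3)
    return out

def grouped_conv2d(x, w, groups):
    """Grouped convolution: split the channels into groups, convolve each
    group with its own slice of filters, then concatenate the outputs."""
    xs = np.split(x, groups, axis=-1)          # each (H, W, Cin // groups)
    ws = np.split(w, groups, axis=-1)          # each (kh, kw, Cin // g, Cout // g)
    return np.concatenate([conv2d(xi, wi) for xi, wi in zip(xs, ws)], axis=-1)

def expand_to_standard_kernel(w, cin, groups):
    """Embed a grouped kernel into a block-diagonal (kh, kw, Cin, Cout)
    kernel, so a plain groups=1 convolution computes the same result."""
    kh, kw, cin_g, cout = w.shape
    cout_g = cout // groups
    wf = np.zeros((kh, kw, cin, cout))
    for g in range(groups):
        wf[:, :, g * cin_g:(g + 1) * cin_g, g * cout_g:(g + 1) * cout_g] = \
            w[:, :, :, g * cout_g:(g + 1) * cout_g]
    return wf
```

So a possible workaround on the model side is to express the grouped convolution as `groups` standard convolutions over channel splits (or as one standard convolution with the expanded kernel, at the cost of extra zero weights), both of which tf2onnx handles.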

Related to https://github.com/onnx/tensorflow-onnx/issues/2099

Any ideas? Workarounds?

metalMajor commented 1 year ago

I also have this issue :(

Downgrading to TF 2.8.4 is not a nice workaround. Has anyone found a solution yet?