onnx / tensorflow-onnx

Convert TensorFlow, Keras, Tensorflow.js and Tflite models to ONNX
Apache License 2.0
2.32k stars 433 forks source link

Unsupported op ReverseV2 #530

Closed satyajithj closed 5 years ago

satyajithj commented 5 years ago

Converting a graph to onnx results in

Traceback (most recent call last):
  File "/usr/lib64/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib64/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/fuzzybatman/.local/lib/python3.7/site-packages/tf2onnx/convert.py", line 145, in <module>
    main()
  File "/home/fuzzybatman/.local/lib/python3.7/site-packages/tf2onnx/convert.py", line 127, in main
    inputs_as_nchw=args.inputs_as_nchw)
  File "/home/fuzzybatman/.local/lib/python3.7/site-packages/tf2onnx/tfonnx.py", line 786, in process_tf_graph
    mapped_op, unmapped_op = tensorflow_onnx_mapping(g, continue_on_error, ops_mapping)
  File "/home/fuzzybatman/.local/lib/python3.7/site-packages/tf2onnx/tfonnx.py", line 552, in tensorflow_onnx_mapping
    raise ValueError("tensorflow op " + op + " is not supported")
ValueError: tensorflow op ReverseV2 is not supported
lucienwang1009 commented 5 years ago

ReverseV2 is usually used in dynamic_bidirectional_rnn. It should be eliminated by bi-lstm/gru rewriter before onnx op mapping. Put it in another way,

  1. If the unsupported ReverseV2 is in a bidirectional rnn, it should be a bug of bi-lstm/gru rewriter.
  2. Otherwise, it's a new usage for ReverseV2 that we should support in onnx op mapping.

So could you please confirm which case you encountered? If possible, please share the frozen pb file or the snapshot around the ReverseV2 op.

satyajithj commented 5 years ago

I believe it is a new usage. It is used in the Faster R-CNN example in the Tensorpack repo.

The frozen pb file is large (~170 MB). Do you want me to upload to gdrive and share a link? And by snapshot do you mean the graph image using tensorflow?

lucienwang1009 commented 5 years ago

Yes, it's indeed a missing TF op. Unlike ReverseSequence, ReverseV2 can perform on several dims of the input tensor. New ReverseSequence can only deal with the reverse on one dim. We might need multiple ReverseSequence to implement this ReverseV2 op. @nbcsm please add comments.

satyajithj commented 5 years ago

I have a question that is off topic. When I run the tensorflow-onnx script for conversion, I see that it raises a ValueError immediately after it detects an op that is not supported.

Is there already a utility script that analyzes the graph and lists all the unsupported ops that are present?

satyajithj commented 5 years ago

Nvm. I commented out the second half of the function tensorflow_onnx_mapping in tfonnx.py

The result is

ReverseV2
InvertPermutation
CropAndResize
lucienwang1009 commented 5 years ago

I have a question that is off topic. When I run the tensorflow-onnx script for conversion, I see that it raises a ValueError immediately after it detects an op that is not supported.

Is there already a utility script that analyzes the graph and lists all the unsupported ops that are present?

--continue_on_error is the option to ignore all unsupported ops and -v will output verbose log comprising unmapped ops: image

satyajithj commented 5 years ago

Ah. Thanks

satyajithj commented 5 years ago

@lucienwang1009 I'd like to learn and contribute. Where do I get started?

lucienwang1009 commented 5 years ago

@fuzzyBatman Welcome! The idea is adding a new mapping function that uses several ONNX ReverseSequence to simulate TF ReverseV2. As current ONNX ReverseSequence only supports the constant axis to perform reversing, the only case we can handle now is when the axis input of TF Reverse op is constant. Fortunately, it's how Faster R-CNN uses. You can take ReverseSequence mapping as an example to write your own Reverse mapping function: https://github.com/onnx/tensorflow-onnx/blob/a0ab29f166ba4c5143219d8ed739910df9836890/tf2onnx/onnx_opset/tensor.py#L1048 Finally, you are expected to add some unittests about this op in test_backend.py.

satyajithj commented 5 years ago

@lucienwang1009 Thank you! Which branch should I be working on?

lucienwang1009 commented 5 years ago

@fuzzyBatman you can fork our repo and submit PR to master branch.

satyajithj commented 5 years ago

Hey. Is it possible to run tests on a specific op?

lucienwang1009 commented 5 years ago

@fuzzyBatman Suppose you've added the test of ReverseV2, named test_reverse_v2, in test_backend.py, you can run it specifically by python test_backend.py BackendTests.test_reverse_v2. Since ReverseSequence is enabled in onnx opset 10, you might also need to set the opset version by set TF2ONNX_TEST_OPSET=10 (or export TF2ONNX_TEST_OPSET=10 on Linux).

satyajithj commented 5 years ago

Thank you! Another question. I want to discuss a little bit about the implementation.

Assuming a tensor of shape [1, 2, 3, 4], example on this page

With axes = 0, an identity block is all that is needed. With axes = 1, we require an RS block and a Const block (for the seq_len).

With axes = 2, we require a Split, a Const, 2 Reverse and a combine (do not know what Op to use)?

I am trying to generalise this. However, it will work only if the shape is known beforehand (no use of None). Or am I on the wrong track here?

lucienwang1009 commented 5 years ago

@fuzzyBatman You're very close to the general idea. Instead of using Split to reorganize the tensor, you can leverage Transpose to change the axis to be reversed.

Back to your cases, if the batch_axis and time_axis of RS are 1 and 0 respectively, i.e., we reverse the tensor along the axis 0. For different axis of tf.reverse: if axis=0, no additional transformation is needed, directly apply RS on the tensor. if axis=1, transpose the tensor with perm [1, 0, 2, 3] to a tensor of shape [2, 1, 3, 4], and reverse it on axis 0, and then transpose the tensor back with perm [1, 0, 2, 3]. if axis=2, transpose the tensor with the perm of [2, 1, 0, 3]. if axis=3, transpose the tensor with the perm of [3, 1, 2, 0].

It's much harder to deal with the case with more than one axis. I'd like to suggest you handle the scalar axis first.

satyajithj commented 5 years ago

@lucienwang1009 Hey. Thanks for the info. I believe it is done. I have it working for vector axis as well.

Here is the code. Can you try it out? I have added the tests for both constant and vector axis.

satyajithj commented 5 years ago

However, I do have a concern. Should I add a conditional check for if the axis vector is empty? In that case the tensor should be mapped to Identity. Right?

The Reverse op doc does state

The number of dimensions specified in axis may be 0 or more entries

satyajithj commented 5 years ago

Nvm. I added an identity block and added an empty axis test as well.

lucienwang1009 commented 5 years ago

@fuzzyBatman super! only two comments:

  1. axes of ReverseV2 can be negative.
  2. I wonder if we need Transpose after every RS. Can we deal with it at the final step, as Transpose is usually computational intensive? For instance, if the input is a 4-D tensor and axes = [1, 2, 3], we transpose the input by [1, 0, 2, 3], [2, 1, 0, 3] and [3, 1, 2, 0] for axis = 1, 2, 3 respectively. Finally, the tensor order that we can take down is [3, 0, 1, 2], and thus the final Transpose is with perm [1, 2, 3, 0].
satyajithj commented 5 years ago

@lucienwang1009

axes of ReverseV2 can be negative.

Wow. I forgot about that. Can I just flip the negative axis to its positive counterpart?

I wonder if we need Transpose after every RS. Can we deal with it at the final step, as Transpose is usually computational intensive? ...

Clever. I'll look into it.

lucienwang1009 commented 5 years ago

Wow. I forgot about that. Can I just flip the negative axis to its positive counterpart?

Yeah, it can be easily done by adding input rank.

satyajithj commented 5 years ago

Hey. I am having trouble using the transpose nodes. They seem to work only when a pair of axes are being interchanged. Using a permutation like [5, 1, 2, 3, 4, 0] works but not [0, 4, 3, 5, 1, 2].

For example:

OP=Transpose
Name=ReverseV2_Transpose__19
Inputs:
    ReverseV2_ReverseSequence__18:0=ReverseSequence, [4, 1, 2, 3], 1
Outpus:
    ReverseV2:0=[3, 4, 1, 2], 1

The original shape is [1, 2, 3, 4]. I used perm = [3, 0, 1, 2] on [4, 1, 2, 3] as above and it results in [3, 4, 1, 2].

Am I using the transpose block in the wrong manner?

satyajithj commented 5 years ago

Alright. I figured it out with some trial and error. I have pushed the new code. It accounts for

I added a test for a 1D tensor with empty axis (Uses the identity node).

I tried accounting for a 1D tensor with axis = [0], but it results in a runtime error stating rank(input) must be >= 2

satyajithj commented 5 years ago

@lucienwang1009 The ReverseSequence tests have tensors with unknown shapes (like [None, 5, 7, 8, 9]). Is this possible for ReverseV2 as well? The code I wrote will not work in that case.

lucienwang1009 commented 5 years ago

@fuzzyBatman Great!! Could you please submit your change as a PR so that we can talk about the implementation details there? If so, I will close the issue:)

satyajithj commented 5 years ago

PR created.

satyajithj commented 5 years ago

@lucienwang1009 Thank you for discussing and assessing along the way.

I would also like to implement InvertPermutation and CropAndResize. Can I open different issues for discussing those?

lucienwang1009 commented 5 years ago

@fuzzyBatman Contribution is welcome always! Free feel file issues.