conversion of `tf.tensor_scatter_nd_update` fails #2081

Open FabianSchuetze opened 1 year ago

FabianSchuetze commented 1 year ago

I run the following function inside a tf.while_loop. Including the one statement that calls tf.tensor_scatter_nd_update leads to a tf2onnx conversion failure. The function is (excerpt):

      def func(B, keep, out_boxes, out_scores, idx, step):
            i = B[0]
            out_boxes = tf.tensor_scatter_nd_update(
                    out_boxes,
                    [[step, idx, 0], [step, idx, 1], [step, idx, 2],
                     [step, idx, 3], [step, idx, 4]],
                    [boxes[step, i, 0], boxes[step, i, 1], boxes[step, i, 2],
                     boxes[step, i, 3], boxes[step, i, 4]])
            idx += 1
            if tf.size(B) == 1:
                B = tf.zeros([0], dtype='int32')
            else:
                iou = ArIoU(tf.reshape(boxes[bs, i, :], (-1, 5)),
                            tf.gather(boxes[bs], B[1:]), 3.14)
                iou = tf.reshape(iou, (-1,))
                inds = tf.reshape(tf.where(tf.less(iou, threshold)), (-1,))
                B = tf.gather(B, inds + 1)
            return B, keep, out_boxes, out_scores, idx, step

Commenting out this statement

            out_boxes = tf.tensor_scatter_nd_update(
                    out_boxes,
                    [[step, idx, 0], [step, idx, 1], [step, idx, 2],
                     [step, idx, 3], [step, idx, 4]],
                    [boxes[step, i, 0], boxes[step, i, 1], boxes[step, i, 2],
                     boxes[step, i, 3], boxes[step, i, 4]])

decides whether the TensorFlow frozen graph converts to ONNX: with it the conversion fails, without it the conversion succeeds. The stack trace is:

2022-11-14 13:36:19,352 - INFO - Using tensorflow=2.10.0, onnx=1.12.0, tf2onnx=1.13.0/2c1db5
2022-11-14 13:36:19,352 - INFO - Using opset <onnx, 17>
Traceback (most recent call last):
  File "/home/schuetze/.local/lib/python3.8/site-packages/tensorflow/python/framework/importer.py", line 500, in _import_graph_def_internal
    results = c_api.TF_GraphImportGraphDefWithResults(
tensorflow.python.framework.errors_impl.InvalidArgumentError: slice index 0 of dimension 0 out of bounds. for '{{node retinanet_obb/filtered_detections/while/body/_1/while/body/_106/strided_slice}} = StridedSlice[Index=DT_INT32, T=DT_INT32, _output_shapes=[<unknown>], begin_mask=0, ellipsis_mask=0, end_mask=0, new_axis_mask=0, shrink_axis_mask=1](Func/retinanet_obb/filtered_detections/while/body/_1/while/body/_106/input/_181, retinanet_obb/filtered_detections/while/body/_1/while/body/_106/strided_slice/stack, retinanet_obb/filtered_detections/while/body/_1/while/body/_106/strided_slice/stack_1, retinanet_obb/filtered_detections/while/body/_1/while/body/_106/strided_slice/stack_2)' with input shapes: [0], [1], [1], [1] and with computed input tensors: input[1] = <0>, input[2] = <1>, input[3] = <1>.

What I am trying to do is NMS with oriented bounding boxes: I fill the tensor out_boxes at position (step, idx, :) with the predicted bounding box at position (step, i, :), where i is the index of the highest-scoring box that has not been suppressed by other boxes.
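As a reference for the intended behaviour, here is a minimal NumPy sketch of the same greedy fill. It is not the TensorFlow code that fails to convert. The names nms_fill_reference and toy_iou are hypothetical, toy_iou is only a stand-in for ArIoU, and the idx < max_detections guard is an addition that the TensorFlow version does not have.

```python
import numpy as np


def toy_iou(box, others):
    """Stand-in for ArIoU (assumption): 1.0 when the centres coincide, else 0.0."""
    same_centre = np.all(others[:, :2] == box[:, :2], axis=1)
    return same_centre.astype(np.float32)


def nms_fill_reference(boxes, scores, iou_fn, nms_threshold,
                       score_threshold=0.05, max_detections=300):
    """Greedy NMS that writes the kept boxes into fixed-size output arrays."""
    batch_size = boxes.shape[0]
    out_boxes = np.zeros((batch_size, max_detections, 5), dtype=np.float32)
    out_scores = np.zeros((batch_size, max_detections), dtype=np.float32)
    for step in range(batch_size):
        # Candidate indices sorted by descending score, low scores dropped.
        order = np.argsort(-scores[step], kind="stable")
        order = order[scores[step][order] > score_threshold]
        idx = 0
        while order.size > 0 and idx < max_detections:
            i = order[0]
            # Fill slot (step, idx, :) with the best remaining box.
            out_boxes[step, idx, :] = boxes[step, i, :]
            out_scores[step, idx] = scores[step, i]
            idx += 1
            rest = order[1:]
            if rest.size == 0:
                break
            # Keep only the candidates that overlap the chosen box too little.
            iou = iou_fn(boxes[step, i][None, :], boxes[step, rest])
            order = rest[iou < nms_threshold]
    return out_boxes, out_scores


demo_boxes = np.array([[[0, 0, 20, 20, 0],
                        [0, 0, 40, 10, 0.78],
                        [5, 5, 10, 10, 0]]], dtype=np.float32)
demo_scores = np.array([[0.9, 1.0, 0.5]], dtype=np.float32)
kept_boxes, kept_scores = nms_fill_reference(
    demo_boxes, demo_scores, toy_iou, nms_threshold=0.5)
```

A reference like this can be used to check the output of the TensorFlow graph, and later the ONNX model, on small inputs.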

The code runs fine in a TensorFlow evaluation session and produces output without problems. Unfortunately, the error (which ultimately comes from TensorFlow) is cryptic to me, and I do not understand the complaint about invalid shapes.

Does anybody have an idea how I could fix this issue? Judging from this issue, I am not sure whether `tf.tensor_scatter_nd_update` is supported. If not, can you recommend an idiom that achieves the same effect (inserting elements into a fixed-size tensor)?
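For clarity about the effect in question, the sketch below models the scatter update in NumPy. The helper scatter_nd_update is a hypothetical model, not the TensorFlow op. It follows the argument order of tf.tensor_scatter_nd_update (tensor, indices, updates) and shows that five element-wise indices and a single row index (step, idx) produce the same result. TensorFlow also accepts indices with fewer components than the tensor rank, in which case each index addresses a whole slice.

```python
import numpy as np


def scatter_nd_update(tensor, indices, updates):
    """NumPy model of a scatter update; returns a new array, input untouched."""
    indices = np.asarray(indices)
    out = np.array(tensor, copy=True)
    # Each row of `indices` addresses one element (full depth) or one slice.
    out[tuple(indices.T)] = updates
    return out


out_boxes = np.zeros((1, 3, 5), dtype=np.float32)
box = np.array([0, 0, 40, 10, 0.25], dtype=np.float32)
step, idx = 0, 1

# Five scalar updates, one per box coordinate.
elementwise = scatter_nd_update(
    out_boxes, [[step, idx, k] for k in range(5)], box)

# One row update that addresses the slice out_boxes[step, idx, :].
rowwise = scatter_nd_update(out_boxes, [[step, idx]], box[None, :])
```

Both calls leave out_boxes itself unchanged and return the updated copy, which matches the functional style of the TensorFlow op.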

fatcat-z commented 1 year ago

We do support the tf.tensor_scatter_nd_update() function now. Could you please update the example code to include the data preparation and the caller, so that we can debug it locally? Thanks.

FabianSchuetze commented 1 year ago

Thanks a lot for your kind and informative reply, @fatcat-z. Your answer gives me hope that we can find a solution to the problem. I have created a one-layer network that provokes the same error as reported above. The TensorFlow frozen graph is generated with the following script:

import os
import numpy as np
import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import (
    convert_variables_to_constants_v2,
)

def ArIoU(bx1, bx2, angle_modulo):
    cos_ph = tf.math.cos(- 1 * tf.gather(bx1, 4, axis=1))
    sin_ph = tf.math.sin(- 1 * tf.gather(bx1, 4, axis=1))
    rx = cos_ph * (tf.gather(bx2, 0, axis=1) - tf.gather(bx1, 0, axis=1))\
        - sin_ph * (tf.gather(bx2, 1, axis=1) - tf.gather(bx1, 1, axis=1))
    ry = sin_ph * (tf.gather(bx2, 0, axis=1) - tf.gather(bx1, 0, axis=1))\
        + cos_ph * (tf.gather(bx2, 1, axis=1) - tf.gather(bx1, 1, axis=1))
    iw = tf.minimum(rx + tf.gather(bx2, 2, axis=1), tf.gather(bx1, 2, axis=1))\
        - tf.maximum(rx - tf.gather(bx2, 2, axis=1), -
                     1 * tf.gather(bx1, 2, axis=1))
    iw = tf.maximum(iw, 0)
    ih = tf.minimum(ry + tf.gather(bx2, 3, axis=1), tf.gather(bx1, 3, axis=1))\
        - tf.maximum(rx - tf.gather(bx2, 3, axis=1), -
                     1 * tf.gather(bx1, 3, axis=1))
    # for all ih < 0 values, the results should be 0
    area_i = 4 * tf.gather(bx1, 2, axis=1) * tf.gather(bx1, 3, axis=1)
    area_j = 4 * tf.gather(bx2, 2, axis=1) * tf.gather(bx2, 3, axis=1)
    ua = area_i + area_j - iw * ih
    angle = tf.abs(tf.gather(bx1, 4, axis=1) - tf.gather(bx2, 4, axis=1))
    zero = tf.constant(0.0, tf.float32)
    ar = tf.maximum(zero, tf.cos(tf.minimum(angle, angle_modulo - angle)))
    res = tf.where(ih > 0, ar * iw * ih / ua, iw)
    return res

def nms_ar_tf(boxes, scores, nms_threshold, score_threhold=0.05):
    """Sort confidence scores of predicted bounding boxes."""
    threshold = tf.constant(nms_threshold, tf.float32)
    bs = len(boxes)
    out_boxes = tf.zeros((bs, 300, 5), dtype=tf.float32)
    out_scores = tf.zeros((bs, 300), dtype=tf.float32)
    for step in range(bs):
        idx = 0
        scores_sorted = tf.sort(
            tf.gather(scores, step, axis=0),
            axis=-1, direction='DESCENDING')
        B = tf.argsort(
            tf.gather(scores, step, axis=0),
            axis=-1, direction='DESCENDING')
        B = tf.boolean_mask(B, tf.greater(scores_sorted, score_threhold))

        def func(B, out_boxes, out_scores, idx, step):
            i = B[0]
            out_boxes = tf.tensor_scatter_nd_update(
                out_boxes,
                [[step, idx, 0], [step, idx, 1], [step, idx, 2],
                 [step, idx, 3], [step, idx, 4]],
                [boxes[step, i, 0], boxes[step, i, 1], boxes[step, i, 2],
                 boxes[step, i, 3], boxes[step, i, 4]])
            out_scores = tf.tensor_scatter_nd_update(
                out_scores, [[step, idx]], [scores[step, i]])
            idx += 1
            if tf.size(B) == 1:
                B = tf.zeros([0], dtype='int32')
            else:
                iou = ArIoU(tf.reshape(boxes[step, i, :], (-1, 5)),
                            tf.gather(boxes[step], B[1:]), 3.14)
                iou = tf.reshape(iou, (-1,))
                inds = tf.reshape(tf.where(tf.less(iou, threshold)), (-1,))
                B = tf.gather(B, inds + 1)
            return B, out_boxes, out_scores, idx, step

        def cond(a, b, c, d, e): return tf.size(a) > 0
        B, out_boxes, out_scores, idx, step =\
            tf.while_loop(cond, func,
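                          (B, out_boxes, out_scores, idx, step),
                          # B shrinks on every iteration, so its length is left
                          # unknown (the invariants for B and out_boxes are assumed).
                          shape_invariants=(tf.TensorShape([None]), out_boxes.shape,
                                            out_scores.shape, None, None))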
    return out_boxes, out_scores

class SmallNetwork(tf.keras.layers.Layer):
    """Keras layer for filtering detections using score threshold and NMS"""

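    def __init__(self, nms_threshold=0.5, score_threshold=0.05, **kwargs):
        # NOTE: the default values in this signature are assumptions.
        super().__init__(**kwargs)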
        self.nms_threshold = nms_threshold
        self.score_threshold = score_threshold
        self.max_detections = 300

    def call(self, inputs, **kwargs):
        boxes = inputs[0]
        classes = inputs[1]
        boxes2 = tf.reshape(boxes, (tf.shape(boxes)[0], -1, 5))
        scores2 = tf.reshape(classes, (tf.shape(classes)[0], -1))
        outputs = nms_ar_tf(
            boxes2, scores2, self.nms_threshold, self.score_threshold)
        new_outputs = (tf.cast(outputs[0], tf.float32),
                       tf.cast(outputs[1], tf.float32))
        return new_outputs

    def compute_output_shape(self, input_shape):
        return [
            (input_shape[0][0], self.max_detections, 5),
            (input_shape[1][0], self.max_detections),
        ]

    def compute_mask(self, inputs, mask=None):
        """Required in Keras when there is more than 1 output"""
        return (len(inputs) + 1) * [None]

    def get_config(self):
        config = super(SmallNetwork, self).get_config()
        config.update(
            {
                "nms_threshold": self.nms_threshold,
                "score_threshold": self.score_threshold,
                "max_detections": self.max_detections,
            }
        )
        return config

def freeze_graph(model):
    # Convert Keras model to ConcreteFunction
    full_model = tf.function(lambda input_1: model(input_1))
    full_model = full_model.get_concrete_function(
        (tf.TensorSpec(model.inputs[0].shape, model.inputs[0].dtype),
         tf.TensorSpec(model.inputs[1].shape, model.inputs[1].dtype)))
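    out_path = os.path.join('/tmp')
    frozen_func = convert_variables_to_constants_v2(full_model)
    # Serialize the frozen graph to /tmp/frozen_inference_graph.pb
    tf.io.write_graph(graph_or_graph_def=frozen_func.graph,
                      logdir=out_path,
                      name="frozen_inference_graph.pb",
                      as_text=False)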

def test_filter_detections(network):
    # two boxes with skew IoU = 0.41 and ar IoU 0.24
    boxes = tf.constant(
        np.array([[0, 0, 20, 20, 0],
                 [0, 0, 40, 10, 0.25 * np.pi],
                 [0, 0, 38, 10.5, 0.25 * np.pi]
                 ]), dtype=tf.keras.backend.floatx()
    )
    boxes = tf.reshape(boxes, (1, 3, 5))
    scores = tf.constant(
        np.array([[0.9], [1], [0.5]]), dtype=tf.keras.backend.floatx()
    )
    scores = tf.reshape(scores, (1, 3))
    res = network((boxes, scores))
    return res

if __name__ == "__main__":
    i1 = [tf.keras.Input(shape=(None, 5)), tf.keras.Input(shape=(None,))]
    small = SmallNetwork()
    outputs = small(i1)
    model = tf.keras.Model(inputs=i1, outputs=outputs)
    freeze_graph(model)

Running this script generates a frozen graph at /tmp/frozen_inference_graph.pb. I then attempt to convert the network to ONNX with:

python3 -m tf2onnx.convert --input /tmp/frozen_inference_graph.pb --output /tmp/model.onnx --outputs Identity:0,Identity_1:0 --inputs input_1:0,input_1_1:0 --opset 13

The conversion fails with the following output (similar to the output I wrote above):

Do you have any idea what might cause this problem?

Update 1: Sorry, there was a small bug in the implementation of the network that came to light when I ran the network on test data. I corrected this bug, but the conversion error remains the same. You can see the updated code snippet and a test case above.