keras-team / keras-cv

Industry-strength Computer Vision workflows with Keras

Bug: VectorizedBaseImageAugmentationLayer fails to process unbatched bounding_boxes #1512

Closed james77777778 closed 1 year ago

james77777778 commented 1 year ago

I found that the current implementation of VectorizedBaseImageAugmentationLayer fails to process unbatched bounding_boxes correctly.

Here is a standalone script:

import tensorflow as tf
from keras_cv.layers import preprocessing

if __name__ == "__main__":
    # construct unbatched input (images, bounding_boxes)
    images = tf.zeros([8, 8, 3])
    bounding_boxes = {
        "boxes": tf.ragged.constant(
            [[0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]],
            dtype=tf.float32,
        ),
        "classes": tf.RaggedTensor.from_tensor(tf.zeros([2, 1])),
    }
    inputs = {"images": images, "bounding_boxes": bounding_boxes}
    # any vectorized layers, take RandomZoom as example
    layer = preprocessing.RandomZoom(0.5, 0.5)
    outputs = layer(inputs, training=True)  # raises ValueError

Reason: for unbatched inputs, VectorizedBaseImageAugmentationLayer calls tf.expand_dims on every value of inputs (and tf.squeeze on every value of the outputs), including bounding_boxes, which is a dict rather than a tensor:

https://github.com/keras-team/keras-cv/blob/4fd3a84cb84666644ee52bcc0625f0e4416564dd/keras_cv/layers/preprocessing/vectorized_base_image_augmentation_layer.py#L368-L370

https://github.com/keras-team/keras-cv/blob/4fd3a84cb84666644ee52bcc0625f0e4416564dd/keras_cv/layers/preprocessing/vectorized_base_image_augmentation_layer.py#L392-L394

A possible solution:

            # _format_inputs
            for key in list(inputs.keys()):
                if key == BOUNDING_BOXES:
                    inputs[BOUNDING_BOXES]["boxes"] = tf.expand_dims(
                        inputs[BOUNDING_BOXES]["boxes"], axis=0
                    )
                    inputs[BOUNDING_BOXES]["classes"] = tf.expand_dims(
                        inputs[BOUNDING_BOXES]["classes"], axis=0
                    )
                else:
                    inputs[key] = tf.expand_dims(inputs[key], axis=0)
                ...
            # _format_output
            for key in list(output.keys()):
                if key == BOUNDING_BOXES:
                    output[BOUNDING_BOXES]["boxes"] = tf.squeeze(
                        output[BOUNDING_BOXES]["boxes"], axis=0
                    )
                    output[BOUNDING_BOXES]["classes"] = tf.squeeze(
                        output[BOUNDING_BOXES]["classes"], axis=0
                    )
                else:
                    output[key] = tf.squeeze(output[key], axis=0)
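
The same idea can be sketched without branching on each key, by recursing into nested dicts and applying the batching op only to the leaves. This is a minimal, framework-agnostic sketch, not the keras_cv implementation: map_leaves, add_batch_dim and drop_batch_dim are hypothetical names, and plain lists stand in for tensors so that it runs without TensorFlow. In the layer, the two ops would presumably be tf.expand_dims(x, axis=0) and tf.squeeze(x, axis=0).

```python
# Hypothetical helpers for illustration; none of these names exist in keras_cv.
BOUNDING_BOXES = "bounding_boxes"


def map_leaves(fn, value):
    """Apply fn to every non-dict value, recursing into nested dicts."""
    if isinstance(value, dict):
        return {key: map_leaves(fn, item) for key, item in value.items()}
    return fn(value)


# List-based stand-ins for tf.expand_dims(x, axis=0) and tf.squeeze(x, axis=0).
def add_batch_dim(x):
    return [x]


def drop_batch_dim(x):
    if len(x) != 1:
        raise ValueError("leading dimension must have size 1")
    return x[0]


# Unbatched input shaped like the one in the standalone script above.
unbatched = {
    "images": [[0.0, 0.0], [0.0, 0.0]],
    BOUNDING_BOXES: {
        "boxes": [[0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]],
        "classes": [[0.0], [0.0]],
    },
}

batched = map_leaves(add_batch_dim, unbatched)    # as in _format_inputs
restored = map_leaves(drop_batch_dim, batched)    # as in _format_output
```

Each leaf, including "boxes" and "classes" inside the bounding_boxes dict, gains a leading dimension of size 1 and loses it again on the way out, so the dict itself is never passed to the batching op.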

This needs to be fixed before augment_bounding_boxes can be implemented for vectorized layers.

I can open the PR once approved.

james77777778 commented 1 year ago

This bug likely affects #1439 @soma2000-lang

The unit test should fail in the VectorizedBaseImageAugmentationLayer part: https://github.com/keras-team/keras-cv/blob/4fd3a84cb84666644ee52bcc0625f0e4416564dd/keras_cv/layers/preprocessing/random_crop_and_resize_test.py#L191-L205

soma2000-lang commented 1 year ago

@james77777778 thanks for flagging. Yes, some of the tests are failing due to this.

LukeWood commented 1 year ago

Go ahead @james77777778 - thanks for filing!