keras-team / keras-cv

Industry-strength Computer Vision workflows with Keras

Bounding Box Cropping Layer: Feature Request #2417

Closed Michael-Blackwell closed 6 months ago

Michael-Blackwell commented 7 months ago

Short Description

Hello!

I would like to propose a vectorized bounding box cropping layer that accepts an image (or a batch of images) and a tensor of boxes, then returns a ragged tensor of crops taken from the original image, with shape [batch, n_boxes, None, None, 3] (ragged because each crop has a different height and width).
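To make the requested behavior concrete, here is a minimal reference sketch of the semantics in NumPy. It is a plain loop, not the vectorized layer being proposed, and it returns nested lists in place of a ragged tensor. The function name `crop_boxes` and the integer-pixel (y1, x1, y2, x2) box format are assumptions made for illustration; they are not part of any existing KerasCV API.

```python
import numpy as np


def crop_boxes(images, boxes):
    """Crop every box from every image (hypothetical helper).

    images: array of shape [batch, height, width, channels].
    boxes:  array of shape [batch, n_boxes, 4] holding integer pixel
            coordinates in (y1, x1, y2, x2) order (an assumed format).
    Returns a nested list where crops[b][i] has shape
    [y2 - y1, x2 - x1, channels].
    """
    images = np.asarray(images)
    boxes = np.asarray(boxes, dtype=np.int64)
    height, width = images.shape[1:3]

    crops = []
    for image, image_boxes in zip(images, boxes):
        per_image = []
        for y1, x1, y2, x2 in image_boxes:
            # Clip to the image bounds so out-of-range boxes do not wrap around.
            y1, y2 = np.clip([y1, y2], 0, height)
            x1, x2 = np.clip([x1, x2], 0, width)
            per_image.append(image[y1:y2, x1:x2])
        crops.append(per_image)
    return crops


images = np.zeros((2, 64, 64, 3), dtype=np.uint8)
boxes = np.array([[[0, 0, 10, 20], [5, 5, 40, 30]],
                  [[10, 10, 20, 20], [0, 32, 64, 64]]])
crops = crop_boxes(images, boxes)
print([[c.shape for c in per_image] for per_image in crops])
```

Each crop keeps its own height and width, which is why the batched result cannot be a dense tensor and has to be ragged (or padded) in a real layer.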

KerasCV already has cropping functions for preprocessing and augmentation, but it has no layer that efficiently crops multiple bounding boxes from an image. This functionality is required to build two-stage detectors for high-resolution images, where the graph looks something like this:

Step 4 from the outline above depends on a robust bounding box cropping layer. The closest implementation I have found is TensorFlow's tf.image.crop_and_resize. Its only drawback is that the resizing step does not preserve the aspect ratio. However, keras_cv.layers.Resizing seems to have fairly robust resizing options and accepts ragged tensors.
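As an illustration of the aspect-ratio concern: tf.image.crop_and_resize stretches every crop to a fixed crop_size, so a tall, narrow box and a short, wide box end up distorted differently. The sketch below shows the arithmetic an aspect-preserving ("letterbox") resize would need instead. The helper name `letterbox_size` is hypothetical and does not correspond to a KerasCV or TensorFlow function.

```python
def letterbox_size(crop_h, crop_w, target_h, target_w):
    """Compute an aspect-preserving resize (hypothetical helper).

    Returns (new_h, new_w, pad_h, pad_w): the resized crop dimensions and
    the total padding needed to fill the target canvas.
    """
    # Use the smaller scale factor so the crop fits inside the target.
    scale = min(target_h / crop_h, target_w / crop_w)
    new_h = min(target_h, max(1, round(crop_h * scale)))
    new_w = min(target_w, max(1, round(crop_w * scale)))
    return new_h, new_w, target_h - new_h, target_w - new_w


# A tall crop: height fills the canvas, width is padded.
print(letterbox_size(100, 50, 224, 224))
# A wide crop: width fills the canvas, height is padded.
print(letterbox_size(30, 120, 224, 224))
```

In both cases the height-to-width ratio of the crop is unchanged after resizing, and the remaining area is filled with padding.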

Due to limitations in PyTorch, I have to use a for loop to crop the bounding boxes, and in TensorFlow's tf.image.crop_and_resize the resizing options are limited. This is an opportunity for Keras to offer functionality that other frameworks lack but that is needed to build a specific class of models.

Papers

Multi-Stage-CV-Detection

Existing Implementations

The best implementation I could find is tf.image.crop_and_resize, but again, the resizing options are limited.

Other Information

divyashreepathihalli commented 6 months ago

@Michael-Blackwell Thanks for filing the issue. This would be a good custom layer for users. We don't have enough requests for this feature yet, so I will be closing the issue.