keras-team / keras-cv

Industry-strength Computer Vision workflows with Keras

RetinaNet `model.fit` fails if it encounters a training input with zero boxes #2488

Open thomas-coldwell opened 2 weeks ago

thomas-coldwell commented 2 weeks ago

Current Behavior:

We are using the pretrained RetinaNet model and applying transfer learning to a custom dataset. The dataset pipeline uses an adapted jittered resize layer that checks, after performing the crop, whether at least X% of the area of each original bounding box is still inside the image; if not, that bounding box is excluded (we are using #2484 to achieve this). This works well in isolation when we test it with the jittered resize demo in Keras CV. During training, however (specifically at the end of the first epoch, when validation runs), it fails with an out-of-bounds exception, e.g. `indices[1,53275] = 0 is not in [0, 0) [[{{node retina_net_label_encoder_1/GatherV2_1}}]]` (I've attached the full stack trace below).
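For readers without #2484 to hand, the area check described above can be sketched in plain Python. This is only an illustration of the idea, not the code from that PR: the function name `clip_and_filter_boxes`, the `xyxy` pixel box format, and the tuple representation are all assumptions made for the example.

```python
def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(value, high))


def clip_and_filter_boxes(boxes, image_width, image_height, minimum_box_area_ratio):
    """Clip xyxy boxes to the image and keep only those that retain enough area.

    Hypothetical helper for illustration; not the implementation in #2484.
    """
    kept = []
    for x1, y1, x2, y2 in boxes:
        original_area = (x2 - x1) * (y2 - y1)
        if original_area <= 0:
            continue  # skip degenerate input boxes
        cx1, cx2 = clamp(x1, 0, image_width), clamp(x2, 0, image_width)
        cy1, cy2 = clamp(y1, 0, image_height), clamp(y2, 0, image_height)
        clipped_area = (cx2 - cx1) * (cy2 - cy1)
        # Keep the box only if enough of its original area survived the clip.
        if clipped_area / original_area >= minimum_box_area_ratio:
            kept.append((cx1, cy1, cx2, cy2))
    return kept


boxes = [(10, 10, 50, 50), (90, 90, 130, 130)]
print(clip_and_filter_boxes(boxes, 100, 100, 0.5))
```

In this example the second box keeps only a 10x10 corner of its original 40x40 area, so it is dropped. An image whose only box is like that one ends up with an empty list, which is the zero-box case this issue is about.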

stacktrace.txt

From what I can understand of the stack trace it specifically fails here: https://github.com/keras-team/keras-cv/blob/ba2556c98d2e122c6b7c9b2f6a94097548a8ee8a/keras_cv/src/models/object_detection/retinanet/retinanet_label_encoder.py#L138

So I think this gather call may not work when the input has zero elements, as would be the case here. If I set minimum_box_area_ratio to 0% (so that nothing is excluded), training runs normally as before. Setting it to any non-zero value prunes some boxes, and as soon as any training example is left with zero boxes, this exception is raised.
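To make the error message easier to read: `0 is not in [0, 0)` says that index 0 was requested from an axis whose valid index range is empty. The sketch below mimics that bounds check in plain Python. It is not TensorFlow's `GatherV2` implementation, and the idea that the label encoder is gathering from a zero-length boxes axis is still only my reading of the stack trace.

```python
def gather(params, indices):
    """Return [params[i] for i in indices], with an explicit bounds check.

    Plain-Python stand-in used only to illustrate the error; not TensorFlow code.
    """
    size = len(params)
    gathered = []
    for position, index in enumerate(indices):
        if not 0 <= index < size:
            raise IndexError(f"indices[{position}] = {index} is not in [0, {size})")
        gathered.append(params[index])
    return gathered


# With at least one box, index 0 is valid.
print(gather(["box_a", "box_b"], [0, 0, 1]))

# With zero boxes, even index 0 is out of range.
try:
    gather([], [0])
except IndexError as error:
    print(error)
```

If this reading is right, every anchor is being matched to box index 0 (for example as the result of an argmax over an empty axis), and the lookup then fails because there is no box 0 to fetch.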

Expected Behavior:

It should be possible to pass training examples with zero boxes to the RetinaNet model and have training continue regardless. Alternatively, there could be a mechanism to skip training examples that have no labelled boxes.
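As a rough picture of the second option, skipping could look like the following. This is a plain-Python sketch over an in-memory list, assuming each example is a dict with a `"bounding_boxes"` entry holding `"boxes"` and `"classes"`; the helper names `has_boxes` and `drop_empty_examples` are made up for the example and are not part of KerasCV.

```python
def has_boxes(example):
    """True if the example still has at least one labelled box."""
    return len(example["bounding_boxes"]["boxes"]) > 0


def drop_empty_examples(examples):
    """Keep only the examples that have at least one box (hypothetical helper)."""
    return [example for example in examples if has_boxes(example)]


examples = [
    {"images": "img_0", "bounding_boxes": {"boxes": [(10, 10, 50, 50)], "classes": [1]}},
    {"images": "img_1", "bounding_boxes": {"boxes": [], "classes": []}},
]
kept = drop_empty_examples(examples)
print([example["images"] for example in kept])
```

In a real `tf.data` pipeline the same predicate would presumably become a filter step applied after augmentation and before batching, since the pruning happens inside the augmentation layer.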

Steps To Reproduce:

  1. Apply the changes linked for the adapted jittered resize (it's a very minor change that adds to the bounding_box.clip_to_image function)
  2. Create a jittered resize layer and set the minimum_box_area_ratio to, say, 0.5
  3. Then attempt to train the RetinaNet model with this JitteredResize acting on the training dataset
  4. Observe the same exception

Version:

Latest off of master

Anything else:

thomas-coldwell commented 1 week ago

Just to add to this I've tried the following alternative approach too: