TensorFlow Object Detection: Null annotations raise an exception

Andres-San commented 1 year ago

1. The entire URL of the file you are using

https://github.com/tensorflow/models/blob/master/research/object_detection/core/box_list.py

2. Describe the bug

Hi there, nice to meet you all,

Context [aws-sagemaker-object detection]

I've been trying to train an object detection model (using the built-in TensorFlow algorithms) with the SageMaker JumpStart examples as a template. As soon as I provide a null annotation, training fails when calling fit() and throws the following error directly from TensorFlow (the entire traceback is at the end of the post):

  File "/opt/ml/code/object_detection/core/box_list.py", line 55, in __init__
    raise ValueError("Invalid dimensions for box data: {}".format(boxes.shape))
ValueError: Invalid dimensions for box data: (0,)

Great! So what's the issue here? If I provide a dataset with an equal number of images and annotations, I can train my model successfully. But if I have an image with no object in it, the error is raised. For instance:

0001.jpeg has 1 object and the corresponding annotation

0002.jpeg has no object, so it doesn't have an annotation

So the annotation.json file looks like:

{
  "images": [
    { "file_name": "0001.jpeg", "height": 1944, "width": 2592, "id": "0001" },
    { "file_name": "0002.jpeg", "height": 1944, "width": 2592, "id": "0002" }
  ],
  "annotations": [
    { "image_id": "0001", "bbox": [688, 371, 1859, 1581], "category_id": 0 }
  ]
}

As far as I could investigate, this is the standard procedure when an image contains no object. I also downloaded the full COCO 2017 annotation set to confirm this: download "2017 Train/Val annotations [241MB]", open instances_val2017.json, and look for image IDs 25593, 41488, 42888, ... You'll see that the image entries exist, but there are no corresponding annotation entries.
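The check described above can also be done programmatically. The following is a minimal sketch, assuming a COCO-style dictionary with "images" and "annotations" keys like the file shown earlier; the function name `images_without_annotations` is made up for illustration and is not part of any library.

```python
import json


def images_without_annotations(coco):
    """Return the file names of images that no annotation refers to.

    `coco` is assumed to be a dict with "images" and "annotations" lists,
    as in the annotation file from this report.
    """
    annotated_ids = {ann["image_id"] for ann in coco.get("annotations", [])}
    return [
        img["file_name"]
        for img in coco.get("images", [])
        if img["id"] not in annotated_ids
    ]


# Same content as the annotation file in this report.
sample = json.loads("""
{
  "images": [
    {"file_name": "0001.jpeg", "height": 1944, "width": 2592, "id": "0001"},
    {"file_name": "0002.jpeg", "height": 1944, "width": 2592, "id": "0002"}
  ],
  "annotations": [
    {"image_id": "0001", "bbox": [688, 371, 1859, 1581], "category_id": 0}
  ]
}
""")

print(images_without_annotations(sample))  # ['0002.jpeg']
```

To run the same check on instances_val2017.json, load the file with `json.load` and pass the resulting dictionary to the function.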

3. Steps to reproduce

Train any Model Garden object detection model with this annotation file:

{
  "images": [
    { "file_name": "0001.jpeg", "height": 1944, "width": 2592, "id": "0001" },
    { "file_name": "0002.jpeg", "height": 1944, "width": 2592, "id": "0002" }
  ],
  "annotations": [
    { "image_id": "0001", "bbox": [688, 371, 1859, 1581], "category_id": 0 }
  ]
}

4. Expected behavior

Training should not raise an error when no annotations are available for a given image.

5. Additional context

Traceback (most recent call last):
  File "/opt/ml/code/transfer_learning.py", line 246, in <module>
    run_with_args(args)
  File "/opt/ml/code/transfer_learning.py", line 201, in run_with_args
    train_and_save_model(
  File "/opt/ml/code/train.py", line 130, in train_and_save_model
    validation_losses = run_validation(detection_model, validation_data, batch_size, image_size, epoch)
  File "/opt/ml/code/validation.py", line 25, in run_validation
    losses_dict = model.loss(prediction_dict, shapes)
  File "/opt/ml/code/object_detection/meta_architectures/ssd_meta_arch.py", line 824, in loss
    ) = self._assign_targets(
  File "/opt/ml/code/object_detection/meta_architectures/ssd_meta_arch.py", line 1013, in _assign_targets
    groundtruth_boxlists = [box_list.BoxList(boxes) for boxes in groundtruth_boxes_list]
  File "/opt/ml/code/object_detection/meta_architectures/ssd_meta_arch.py", line 1013, in <listcomp>
    groundtruth_boxlists = [box_list.BoxList(boxes) for boxes in groundtruth_boxes_list]
  File "/opt/ml/code/object_detection/core/box_list.py", line 55, in __init__
    raise ValueError("Invalid dimensions for box data: {}".format(boxes.shape))
ValueError: Invalid dimensions for box data: (0,)
2023-03-09 21:59:57,046 sagemaker-training-toolkit INFO Waiting for the process to finish and give a return code.
2023-03-09 21:59:57,047 sagemaker-training-toolkit INFO Done waiting for a return code. Received 1 from exiting process.
2023-03-09 21:59:57,048 sagemaker-training-toolkit ERROR Reporting training FAILURE
2023-03-09 21:59:57,048 sagemaker-training-toolkit ERROR ExecuteUserScriptError:
ExitCode 1
ErrorMessage "raise ValueError("Invalid dimensions for box data: {}".format(boxes.shape))
 ValueError: Invalid dimensions for box data: (0,)"
Command "/usr/local/bin/python3.9 transfer_learning.py --batch_size 5 --beta_1 0.9 --beta_2 0.999 --early_stopping False --early_stopping_min_delta 0.0 --early_stopping_patience 5 --epochs 10 --epsilon 1e-07 --initial_accumulator_value 0.1 --learning_rate 0.001 --momentum 0.9 --optimizer adam --reinitialize_top_layer Auto --rho 0.95 --train_only_top_layer False"
2023-03-09 21:59:57,048 sagemaker-training-toolkit ERROR Encountered exit_code 1
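The traceback shows the failure happening when box data of shape (0,) reaches BoxList, which is presumably what an image without annotations produces in this pipeline. One possible stopgap, assuming the training container itself cannot be changed, is to drop unannotated images from the annotation file before training. The sketch below is only an illustration: `drop_unannotated_images` is a hypothetical helper, not part of SageMaker or TensorFlow, and the approach has a real cost because the model no longer sees any negative (object-free) examples.

```python
def drop_unannotated_images(coco):
    """Return a copy of a COCO-style dict without unannotated images.

    Hypothetical helper: `coco` is assumed to have "images" and
    "annotations" lists. The input dict is not modified.
    """
    annotated_ids = {ann["image_id"] for ann in coco.get("annotations", [])}
    filtered = dict(coco)  # shallow copy; only "images" is replaced
    filtered["images"] = [
        img for img in coco.get("images", []) if img["id"] in annotated_ids
    ]
    return filtered


# Same content as the annotation file in this report.
annotations = {
    "images": [
        {"file_name": "0001.jpeg", "height": 1944, "width": 2592, "id": "0001"},
        {"file_name": "0002.jpeg", "height": 1944, "width": 2592, "id": "0002"},
    ],
    "annotations": [
        {"image_id": "0001", "bbox": [688, 371, 1859, 1581], "category_id": 0}
    ],
}

cleaned = drop_unannotated_images(annotations)
print([img["file_name"] for img in cleaned["images"]])  # ['0001.jpeg']
```

The cleaned dictionary can be written back with `json.dump` and used in place of the original annotation file.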

6. System information

TensorFlow version: 2
Python version: 3.9
GPU model and memory: (using some AWS machines)

laxmareddyp commented 1 year ago

Hi @Andres-San,

We have an official version of the COCO conversion here, which should probably handle null cases, and you can use this tutorial for reference. Please let me know if it solves your problem.

Thanks.

Andres-San commented 1 year ago

Thanks @laxmareddyp

I've come to the conclusion that this is not a TensorFlow issue. It's about SageMaker not being crystal clear on the annotation format, so the parsing from SageMaker to TF is not clear enough. I've posted the details in #3853, if anyone is interested.

Also, not even the SageMaker object detection examples work for me, which is another hint in the same direction; see #3852.

Best!

laxmareddyp commented 1 year ago

Thanks for the quick response. Closing this as completed. If you would like to discuss further, please feel free to reopen the issue.

