microsoft / computervision-recipes

Best Practices, code samples, and documentation for Computer Vision.
MIT License

[ASK] Can't fit DetectionLearner because some bounding boxes do not have positive height and width #646

Open xcsob opened 3 years ago

xcsob commented 3 years ago

Description

I created a dataset using Microsoft's labeling tool VoTT and exported it in Pascal VOC format.

I'm trying to train a Faster RCNN using this notebook (https://github.com/microsoft/computervision-recipes/blob/master/scenarios/detection/01_training_introduction.ipynb).

All the bounding boxes in my dataset are in the format [xmin, ymin, xmax, ymax], and there are no cases where xmin >= xmax or ymin >= ymax.
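A check like the one described here could be scripted roughly as follows. This is only a minimal sketch: `find_degenerate_boxes`, its `truncate` flag, and the sample annotation are hypothetical names made up for illustration, not part of utils_cv or VoTT. The `truncate` option rests on the assumption that a loader might cast coordinates to integers, in which case a box narrower than one pixel could collapse even though it looks valid as floats.

```python
import xml.etree.ElementTree as ET


def find_degenerate_boxes(xml_text, truncate=False):
    """Return (label, [xmin, ymin, xmax, ymax]) for boxes with non-positive width or height.

    truncate=True casts each coordinate to int first (an assumed loader behaviour).
    """
    root = ET.fromstring(xml_text)
    bad = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        # Parse as float in case the exporter wrote fractional coordinates.
        box = [float(bndbox.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        if truncate:
            box = [int(v) for v in box]
        xmin, ymin, xmax, ymax = box
        if xmax <= xmin or ymax <= ymin:
            bad.append((obj.findtext("name"), box))
    return bad


# Hypothetical Pascal VOC annotation used only to exercise the function.
SAMPLE = """
<annotation>
  <object><name>a</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>50</xmax><ymax>60</ymax></bndbox>
  </object>
  <object><name>b</name>
    <bndbox><xmin>30.2</xmin><ymin>40.0</ymin><xmax>30.9</xmax><ymax>45.0</ymax></bndbox>
  </object>
  <object><name>c</name>
    <bndbox><xmin>5</xmin><ymin>5</ymin><xmax>5</xmax><ymax>9</ymax></bndbox>
  </object>
</annotation>
"""

as_floats = find_degenerate_boxes(SAMPLE)
as_ints = find_degenerate_boxes(SAMPLE, truncate=True)
print(as_floats)
print(as_ints)
```

In this sample, box "b" passes the float comparison but is flagged once the coordinates are truncated, so running both variants over every annotation file can show whether such thin boxes exist in the dataset.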

Although I can import the dataset and visualize it, when I try to fit the DetectionLearner on it I get the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-76-0c6531329277> in <module>
----> 1 detector.fit(EPOCHS, lr=LEARNING_RATE, print_freq=30, skip_evaluation=skip_evaluation)

/mnt/batch/tasks/shared/LS_root/mounts/clusters/azure-machine-learning/code/Users/anbosco/computervision-recipes/utils_cv/detection/model.py in fit(self, epochs, lr, momentum, weight_decay, print_freq, step_size, gamma, skip_evaluation)
    534                 self.device,
    535                 epoch,
--> 536                 print_freq=print_freq,
    537             )
    538             self.losses.append(logger.meters["loss"].median)

/mnt/batch/tasks/shared/LS_root/mounts/clusters/azure-machine-learning/code/Users/anbosco/computervision-recipes/utils_cv/detection/references/engine.py in train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq)
     28         targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
     29 
---> 30         loss_dict = model(images, targets)
     31 
     32         losses = sum(loss for loss in loss_dict.values())

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    887             result = self._slow_forward(*input, **kwargs)
    888         else:
--> 889             result = self.forward(*input, **kwargs)
    890         for hook in itertools.chain(
    891                 _global_forward_hooks.values(),

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/torchvision/models/detection/generalized_rcnn.py in forward(self, images, targets)
     90                     raise ValueError("All bounding boxes should have positive height and width."
     91                                      " Found invalid box {} for target at index {}."
---> 92                                      .format(degen_bb, target_idx))
     93 
     94         features = self.backbone(images.tensors)

ValueError: All bounding boxes should have positive height and width. Found invalid box [242.96875, 196.09375, 242.96875, 196.875] for target at index 1.

It seems that there is an invalid box whose xmin is equal to its xmax. However, as I said above, there is no such box in my training set.
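To make the complaint in the error message concrete, here is a small sketch that computes the width and height of the reported box and applies the same "positive height and width" condition. The helper names `box_size` and `drop_degenerate` are hypothetical and are not functions from torchvision or utils_cv. The fractional coordinates suggest the box may have been rescaled before the check ran, but that is an assumption.

```python
def box_size(box):
    """Return (width, height) of a box given as [xmin, ymin, xmax, ymax]."""
    xmin, ymin, xmax, ymax = box
    return xmax - xmin, ymax - ymin


def drop_degenerate(boxes):
    """Keep only boxes whose width and height are both strictly positive."""
    kept = []
    for box in boxes:
        width, height = box_size(box)
        if width > 0 and height > 0:
            kept.append(box)
    return kept


# The box quoted in the ValueError above.
reported = [242.96875, 196.09375, 242.96875, 196.875]
width, height = box_size(reported)
print(width, height)  # 0.0 0.78125

# A made-up valid box next to the reported one, to show the filter.
filtered = drop_degenerate([[10.0, 20.0, 50.0, 60.0], reported])
print(filtered)  # [[10.0, 20.0, 50.0, 60.0]]
```

The reported box has zero width and is less than one pixel tall, so it is worth looking for a very small annotation on the image at index 1 of the failing batch.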

Do you know what this means and why it is happening?

Other Comments