pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License

PyTorch standard Coco dataset (datasets.CocoDetection) not compatible with Faster R-CNN object detection model #8353

Open ranjaniocl opened 8 months ago

ranjaniocl commented 8 months ago

🐛 Describe the bug

Hi, I am trying to train and evaluate a pre-trained Faster R-CNN model on the standard COCO dataset, and I am getting the following error:

TypeError: RandomIoUCrop() requires input sample to contain tensor or PIL images and bounding boxes. Sample can also contain masks.

Here are the high-level steps:

  1. Downloaded the COCO 2017 dataset
  2. Prepared the PyTorch dataset using the standard steps from https://pytorch.org/vision/main/auto_examples/transforms/plot_transforms_e2e.html#sphx-glr-auto-examples-transforms-plot-transforms-e2e-py
  3. Trained and evaluated the Faster R-CNN model using the steps from https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html

Here is the Colab notebook: https://colab.research.google.com/drive/1Tbu2Thf-thn0lLG12dM3bq_BMZDihX2Y?usp=sharing

Any help will be appreciated. Thanks.

NicolasHug commented 8 months ago

Hi @ranjaniocl ,

Try printing the input that gets passed to RandomIoUCrop(). There should be bounding boxes and PIL images/tensors in there. If not, it's likely that the pipeline is incorrect.
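One way to inspect what a transform receives is to wrap it in a small callable that prints a summary of its inputs and then delegates to the original. The sketch below is plain Python and only illustrates the idea: `DebugInputs`, `describe`, and `passthrough` are hypothetical names, and `passthrough` is a stand-in for the real RandomIoUCrop() instance, which is assumed to be the transform you would wrap in your own pipeline.

```python
def describe(value):
    """Return a short, human-readable summary of one transform input."""
    if isinstance(value, dict):
        return f"dict with keys {list(value)}"
    if isinstance(value, (list, tuple)):
        return f"{type(value).__name__} of {len(value)} item(s)"
    return type(value).__name__


class DebugInputs:
    """Hypothetical helper: print what a transform receives, then call it."""

    def __init__(self, transform, name=None):
        self.transform = transform
        self.name = name or type(transform).__name__

    def __call__(self, *inputs):
        summary = [describe(x) for x in inputs]
        print(f"{self.name} received {len(inputs)} input(s): {summary}")
        return self.transform(*inputs)


def passthrough(*inputs):
    """Stand-in for the real transform; returns its inputs unchanged."""
    return inputs if len(inputs) > 1 else inputs[0]


# In a real pipeline, the wrapped object would be the RandomIoUCrop() instance.
wrapped = DebugInputs(passthrough, name="RandomIoUCrop")
out = wrapped("image", {"boxes": [[0, 0, 10, 10]]})
```

If the printed summary shows only a single image and no target containing boxes, the transform is being called without the annotations, which would match the error message above.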

ranjaniocl commented 8 months ago

Hi @NicolasHug,

Thanks for looking into my issue. I do not know how to print the input that gets passed to RandomIoUCrop(). Can you please guide me with a sample script or steps? Also, what do you mean by 'pipeline'? Is it the dataloader?

I tried printing a sample from the dataset in the notebook, and it has tensors for the image and the bounding boxes.

NicolasHug commented 8 months ago

@ranjaniocl sorry, it looks like your issue might be more in scope for https://discuss.pytorch.org/

ranjaniocl commented 8 months ago

@NicolasHug OK, sure. I will try my luck there. Since I am using a standard dataset and the standard code provided by PyTorch, I thought someone here could look into it and provide a resolution.

ranjaniocl commented 8 months ago

@NicolasHug Just for reference, a similar issue was reported in the past: https://github.com/pytorch/vision/issues/2720

ranjaniocl commented 8 months ago

@NicolasHug I just posted it on the PyTorch forum. While I was creating the post, similar issues from the past popped up (please see the links below). I do not see any responses to them, so I do not have much hope.

https://discuss.pytorch.org/t/training-faster-r-cnn-model-with-coco-dataset-has-been-consistently-unsuccessful/178023
https://discuss.pytorch.org/t/evaluate-pre-trained-faster-r-cnn-on-coco-dataset/157770

WortJohn commented 6 months ago

Help on class CocoDetection in module torchvision.datasets.coco:

class CocoDetection(torchvision.datasets.vision.VisionDataset)
 |  CocoDetection(root: Union[str, pathlib.Path], annFile: str, transform: Optional[Callable] = None, target_transform: Optional[Callable] = None, transforms: Optional[Callable] = None) -> None

Note: the class has transform, target_transform, and transforms arguments. Passing the value to transforms (not transform) solved the issue for me.
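To illustrate why the choice of argument matters: as far as I understand the dataset classes, transform is applied to the image only and target_transform to the target only, while transforms receives the image and the target together. A transform that needs bounding boxes therefore only sees them when passed as transforms. The sketch below is a simplified plain-Python stand-in, not torchvision's actual code; `ToyDetectionDataset` and `needs_boxes` are hypothetical names used only for illustration.

```python
class ToyDetectionDataset:
    """Simplified stand-in that mimics how the three arguments are dispatched."""

    def __init__(self, samples, transform=None, target_transform=None,
                 transforms=None):
        self.samples = samples
        self.transform = transform                # sees the image only
        self.target_transform = target_transform  # sees the target only
        self.transforms = transforms              # sees image and target

    def __getitem__(self, index):
        image, target = self.samples[index]
        if self.transforms is not None:
            image, target = self.transforms(image, target)
        else:
            if self.transform is not None:
                image = self.transform(image)
            if self.target_transform is not None:
                target = self.target_transform(target)
        return image, target


def needs_boxes(*inputs):
    """Stand-in for a box-aware transform such as RandomIoUCrop()."""
    if not any(isinstance(x, dict) and "boxes" in x for x in inputs):
        raise TypeError("sample has no bounding boxes")
    return inputs if len(inputs) > 1 else inputs[0]


samples = [("image", {"boxes": [[0, 0, 10, 10]]})]

# Passed as `transforms`: the transform receives the image and the boxes.
joint = ToyDetectionDataset(samples, transforms=needs_boxes)
joint_sample = joint[0]

# Passed as `transform`: the transform receives the image alone and fails.
image_only = ToyDetectionDataset(samples, transform=needs_boxes)
try:
    image_only[0]
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

With the real class, the corresponding change would be to pass the detection pipeline as CocoDetection(..., transforms=...) rather than transform=..., as the note above describes.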