pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License

Bug in mAP/mAR calculation for reference object detection tutorial #3841

Open adamjstewart opened 3 years ago

adamjstewart commented 3 years ago

🐛 Bug

I believe I've found a bug in the mAP/mAR evaluation code for the TorchVision Object Detection Finetuning Tutorial.

To Reproduce

If you run the Jupyter Notebook version of the tutorial on Colab and make any of the following modifications, you can reproduce this bug.

  1. In the training loop, add evaluate(model, data_loader, device=device) to evaluate performance on the training set, or
  2. When defining dataset_test, use get_transform(train=True) instead of train=False, or
  3. Always add RandomHorizontalFlip in get_transform (see the sketch after this list)
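
For context, the tutorial's helper looks roughly like this (my paraphrase, not a verbatim copy of the tutorial; T refers to the references/detection/transforms.py module the tutorial imports). Modification 3 amounts to appending the flip unconditionally:

# Sketch of the tutorial's get_transform helper (paraphrased, not verbatim)
import transforms as T  # references/detection/transforms.py

def get_transform(train):
    transforms = []
    # convert the PIL image into a tensor
    transforms.append(T.ToTensor())
    if train:
        # randomly flip the image and its boxes during training;
        # modification 3 above is equivalent to doing this unconditionally
        transforms.append(T.RandomHorizontalFlip(0.5))
    return T.Compose(transforms)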

Expected behavior

Whether or not RandomHorizontalFlip is added shouldn't have a substantial impact on mAP/mAR performance. However, I've noticed that adding RandomHorizontalFlip results in significantly lower mAP/mAR scores (~0.2 instead of ~0.7).

Environment

This is the default environment on Google Colab:

PyTorch version: 1.8.1+cu101
Is debug build: False
CUDA used to build PyTorch: 10.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.5 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
CMake version: version 3.12.0

Python version: 3.7 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: 11.0.221
GPU models and configuration: GPU 0: Tesla T4
Nvidia driver version: 460.32.03
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.0.4
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.0.4
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.0.4
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.0.4
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.0.4
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.0.4
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.0.4
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.5
[pip3] torch==1.8.1+cu101
[pip3] torchsummary==1.5.1
[pip3] torchtext==0.9.1
[pip3] torchvision==0.9.1+cu101
[conda] Could not collect

adamjstewart commented 3 years ago

For anyone else who runs into this problem, my current workaround is to use the following:

dataset.transforms = get_transform(train=False)
evaluate(model, data_loader, device=device)
evaluate(model, data_loader_test, device=device)
dataset.transforms = get_transform(train=True)

This allows me to disable RandomHorizontalFlip during evaluation and re-enable it afterwards.
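
If it helps, the same idea can be written with a try/finally so the flip is re-enabled even if evaluation raises (just a sketch of my workaround, not tutorial code):

# Same workaround, but restore the training transforms even if evaluation fails
eval_transform = get_transform(train=False)
train_transform = get_transform(train=True)

dataset.transforms = eval_transform
try:
    evaluate(model, data_loader, device=device)
    evaluate(model, data_loader_test, device=device)
finally:
    # re-enable RandomHorizontalFlip for the next training epoch
    dataset.transforms = train_transform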

fmassa commented 3 years ago

Hi,

Sorry for missing this.

The underlying issue is that, if you have random transformations in your test dataset, the conversion function https://github.com/pytorch/vision/blob/cadb1681799bab9aa1b53685ccb16ba6fb1cd96c/references/detection/coco_utils.py#L146-L195 will pick one instantiation of the random transforms during the creation of the COCO structure https://github.com/pytorch/vision/blob/cadb1681799bab9aa1b53685ccb16ba6fb1cd96c/references/detection/engine.py#L80 , and will select a different one during the model evaluation itself https://github.com/pytorch/vision/blob/cadb1681799bab9aa1b53685ccb16ba6fb1cd96c/references/detection/engine.py#L84

I would say that unless we modify the evaluation scripts even further so that they stop relying on the cocoapi entirely, we might not be able to do much better than that.
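
To illustrate the mismatch with a toy sketch (not the reference code): when the dataset applies a random transform, the ground truth captured while building the COCO structure and the targets seen later in the evaluation loop come from two different random draws, so the boxes no longer line up:

import random

class ToyFlipDataset:
    # stand-in for a detection dataset that randomly flips boxes on every access
    def __getitem__(self, idx):
        box = [10, 10, 50, 50]
        if random.random() < 0.5:  # analogue of RandomHorizontalFlip
            box = [100 - box[2], box[1], 100 - box[0], box[3]]
        return None, {"boxes": [box]}

ds = ToyFlipDataset()
coco_gt = ds[0][1]      # draw taken when convert_to_coco_api builds the GT structure
eval_target = ds[0][1]  # a different draw taken during the evaluation pass
print(coco_gt, eval_target)  # the two "ground truths" frequently disagree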

9527-csroad commented 1 year ago

I'm learning PyTorch from the official tutorial and I've found a problem. It seems like a small question, so I don't think there is a need to open a new issue. Following the steps of https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html, I find that the value for area=small is incorrect (screenshot attached). I don't know why some values are incorrect while others look right. (I have checked my code, and it is the same as the tutorial code.)

9527-csroad commented 1 year ago

I have rerun the code several times and the results are the same. My leader wanted me to shut down the server because it was too noisy and disturbed his conversation. After a while, I started it up again and checked the code once more. I didn't find any problem, so I reran the code. To my surprise, the results now look fine (screenshots attached). It's puzzling!!!