pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License

Investigate decrease in mAP in retinanet resnet50 model #4437

Open prabhat00155 opened 3 years ago

prabhat00155 commented 3 years ago

🐛 Describe the bug

Reference: https://github.com/pytorch/vision/pull/4409#issuecomment-920723588

cc @datumbox

prabhat00155 commented 3 years ago

@datumbox This is the result I got for the retinanet_resnet50_fpn model (which seems to match the published result):

IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.364
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.558
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.385
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.194
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.400
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.483
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.312
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.501
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.540
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.336
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.587
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.691
Training time 4:17:43

Training logs: retinanet_run1.txt retinanet_run2.txt

datumbox commented 3 years ago

@prabhat00155 I understand from your logs that you did not validate the pre-trained model using the existing weights; instead you trained a new model. To test whether the already published weights still work, pass the --pretrained and --test-only flags.
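For reference, evaluating the published weights (rather than training) is done through the detection reference script. A sketch of the invocation, assuming the usual references/detection setup; the data path and process count are placeholders to adjust for your environment:

```shell
# Evaluate the published retinanet_resnet50_fpn checkpoint on COCO val2017
# without any training (paths and GPU count are illustrative).
python -m torch.distributed.launch --nproc_per_node=8 --use_env train.py \
    --dataset coco --data-path /path/to/coco \
    --model retinanet_resnet50_fpn \
    --pretrained --test-only
```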

prabhat00155 commented 3 years ago

I see. Yeah, now I get this:

IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.363
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.557
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.382
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.193
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.400
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.490
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.314
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.500
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.540
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.340
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.581
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.696

fmassa commented 3 years ago

Note: this could potentially be due to the change of defaults in FrozenBatchNorm https://github.com/pytorch/vision/pull/2933

The results reported in that PR match the ones we get here, which could be a reasonable explanation for the difference.
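To see why the eps default matters: FrozenBatchNorm folds its frozen statistics into a fixed affine transform whose per-channel scale is weight / sqrt(var + eps), so changing eps perturbs every channel of every frozen BN layer. A minimal numeric sketch in plain Python, assuming (per the PRs discussed here) the default moved from 0 to 1e-5; the weight and variance values are made up:

```python
import math

def frozen_bn_scale(weight, running_var, eps):
    """Per-channel scale a frozen BN applies: weight / sqrt(var + eps)."""
    return weight / math.sqrt(running_var + eps)

w, var = 1.0, 0.01  # illustrative channel weight and running variance

s_old = frozen_bn_scale(w, var, eps=0.0)   # pre-change default (assumed)
s_new = frozen_bn_scale(w, var, eps=1e-5)  # post-change default (assumed)

print(s_old, s_new)  # the two scales differ slightly
```

With small running variances the relative shift is largest, which is why a model exported under one eps can lose a fraction of a point of mAP when evaluated under another.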

datumbox commented 3 years ago

@fmassa Unfortunately that's not it. PR #2933 did indeed drop the mAP by 0.1, but #2940 then ensured the issue was addressed for pre-trained models.

In the original issue, I reference PR #3032, which is later than both of the aforementioned PRs and reports an mAP of 36.4. As you can see, that PR contains the eps patch: https://github.com/pytorch/vision/blob/4ab46e5f7585b86fb2befdc32d22d13635868c4e/torchvision/ops/misc.py#L54

And the eps overwrite applied when loading pre-trained models: https://github.com/pytorch/vision/blob/4ab46e5f7585b86fb2befdc32d22d13635868c4e/torchvision/models/detection/_utils.py#L355
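The overwrite linked above follows a simple pattern: when published weights are loaded, walk the model's modules and reset eps on every FrozenBatchNorm2d so the frozen statistics keep producing the outputs they were exported with. A torch-free sketch of that pattern; the two classes below are stand-ins for illustration, not torchvision's actual implementation:

```python
class FrozenBatchNorm2d:
    """Stand-in for torchvision.ops.FrozenBatchNorm2d (illustrative only)."""
    def __init__(self, eps=1e-5):
        self.eps = eps

class Model:
    """Stand-in container; real models expose .modules() via nn.Module."""
    def __init__(self, modules):
        self._modules = list(modules)
    def modules(self):
        return iter(self._modules)

def overwrite_eps(model, eps):
    # Reset eps on every frozen BN so pre-trained weights evaluate
    # with the same normalization they were trained/exported with.
    for module in model.modules():
        if isinstance(module, FrozenBatchNorm2d):
            module.eps = eps

model = Model([FrozenBatchNorm2d(), FrozenBatchNorm2d()])
overwrite_eps(model, 0.0)
print(all(m.eps == 0.0 for m in model.modules()))  # True
```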