Pretrained weights transfer to d2 evaluation result not same

luohao123 commented 2 years ago

Hi, I found that the pretrained weights can be loaded into the d2 (detectron2) version, but the evaluation result is not the same:

Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.391
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.579
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.419
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.204
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.438
Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.555
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.326
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.534
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.601
Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.343
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.665
Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.833
[11/22 15:46:16 d2.evaluation.coco_evaluation]: Evaluation results for bbox: 
|   AP   |  AP50  |  AP75  |  APs   |  APm   |  APl   |
|:------:|:------:|:------:|:------:|:------:|:------:|
| 39.102 | 57.949 | 41.949 | 20.385 | 43.796 | 55.472 |
[11/22 15:46:16 d2.evaluation.coco_evaluation]: Per-category bbox AP: 
| category      | AP     | category     | AP     | category       | AP     |
|:--------------|:-------|:-------------|:-------|:---------------|:-------|
| person        | 51.545 | bicycle      | 30.120 | car            | 37.343 |
| motorcycle    | 41.354 | airplane     | 64.191 | bus            | 64.393 |
| train         | 66.267 | truck        | 30.056 | boat           | 24.152 |
| traffic light | 21.103 | fire hydrant | 64.068 | stop sign      | 59.821 |
| parking meter | 44.773 | bench        | 23.255 | bird           | 30.513 |
| cat           | 72.177 | dog          | 66.586 | horse          | 57.632 |
| sheep         | 51.321 | cow          | 55.309 | elephant       | 63.624 |
| bear          | 72.234 | zebra        | 66.314 | giraffe        | 68.215 |
| backpack      | 9.502  | umbrella     | 38.094 | handbag        | 10.717 |
| tie           | 28.595 | suitcase     | 37.492 | frisbee        | 62.511 |
| skis          | 22.858 | snowboard    | 32.119 | sports ball    | 35.770 |
| kite          | 36.131 | baseball bat | 30.962 | baseball glove | 32.334 |
| skateboard    | 50.383 | surfboard    | 34.870 | tennis racket  | 46.793 |
| bottle        | 29.972 | wine glass   | 31.864 | cup            | 38.972 |
| fork          | 35.053 | knife        | 14.528 | spoon          | 13.116 |
| bowl          | 36.399 | banana       | 19.322 | apple          | 19.359 |
| sandwich      | 34.859 | orange       | 27.864 | broccoli       | 19.962 |
| carrot        | 16.043 | hot dog      | 37.709 | pizza          | 49.721 |
| donut         | 45.727 | cake         | 37.040 | chair          | 25.170 |
| couch         | 42.012 | potted plant | 24.735 | bed            | 44.549 |
| dining table  | 28.283 | toilet       | 59.902 | tv             | 55.940 |
| laptop        | 59.224 | mouse        | 53.997 | remote         | 22.803 |
| keyboard      | 47.817 | cell phone   | 30.583 | microwave      | 54.825 |
| oven          | 33.776 | toaster      | 27.688 | sink           | 31.062 |
| refrigerator  | 56.144 | book         | 7.453  | clock          | 45.249 |
| vase          | 30.723 | scissors     | 26.979 | teddy bear     | 46.190 |
| hair drier    | 11.617 | toothbrush   | 18.464 |                |        |
[11/22 15:46:17 d2.engine.defaults]: Evaluation results for coco_2017_val in csv format:
[11/22 15:46:17 d2.evaluation.testing]: copypaste: Task: bbox
[11/22 15:46:17 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[11/22 15:46:17 d2.evaluation.testing]: copypaste: 39.1024,57.9486,41.9489,20.3847,43.7956,55.4719

I wonder why there is a gap between them?

Also, I found that in the single-scale setting the transformer should take the res5 output as input, but when I forced it to use res2 as the transformer input, I got almost the same AP...

Do you know why? This is very weird.

tangjiuqi097 commented 2 years ago

@luohao123 Hi, there may be three possibilities you should check.

1. Does the weight match the model? R-50 or R-101? C5 or DC5?
2. The PostProcess follows Deformable DETR, not DETR.
3. You should evaluate with 1 image per card (GPU), as the models are trained without image padding. If you want to evaluate with multiple images per card, you can randomly pad the images by a few pixels during training.
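The first check (does the weight match the model?) can be made concrete by comparing parameter names and shapes before evaluating. The sketch below is only an illustration: it works on plain `{name: shape}` dictionaries, and the parameter names in the toy example are hypothetical, not taken from AnchorDETR. With PyTorch, such dictionaries could be built with something like `{k: tuple(v.shape) for k, v in model.state_dict().items()}`; where the weights sit inside the checkpoint file depends on how it was saved, so that part is left out.

```python
def compare_shapes(model_shapes, ckpt_shapes):
    """Compare two {parameter_name: shape} mappings.

    Returns (missing, unexpected, mismatched):
      missing    - names the model expects but the checkpoint lacks
      unexpected - names in the checkpoint that the model does not have
      mismatched - names present in both, but with different shapes
    """
    model_keys, ckpt_keys = set(model_shapes), set(ckpt_shapes)
    missing = sorted(model_keys - ckpt_keys)
    unexpected = sorted(ckpt_keys - model_keys)
    mismatched = {
        name: (tuple(model_shapes[name]), tuple(ckpt_shapes[name]))
        for name in sorted(model_keys & ckpt_keys)
        if tuple(model_shapes[name]) != tuple(ckpt_shapes[name])
    }
    return missing, unexpected, mismatched


# Toy example with hypothetical names: the model has a deeper backbone
# (an extra res4 block) than the checkpoint, and one shape differs.
model_shapes = {
    "backbone.res2.0.conv1.weight": (64, 64, 1, 1),
    "backbone.res4.22.conv1.weight": (256, 1024, 1, 1),
    "transformer.pattern.weight": (3, 256),
}
ckpt_shapes = {
    "backbone.res2.0.conv1.weight": (64, 64, 1, 1),
    "transformer.pattern.weight": (1, 256),
    "extra.head.weight": (91, 256),
}

missing, unexpected, mismatched = compare_shapes(model_shapes, ckpt_shapes)
print("missing in checkpoint:", missing)
print("unexpected in checkpoint:", unexpected)
print("shape mismatches:", mismatched)
```

If any of the three results is non-empty, part of the model is probably running with randomly initialized or wrongly mapped weights, which alone could explain an AP gap.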

> Also, I found that in the single-scale setting the transformer should take the res5 output as input, but when I forced it to use res2 as the transformer input, I got almost the same AP...

I do not know what happened in your code. You can print the shape of the feature in the transformer; I guess it may still be the previous feature.
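To interpret the printed shape, it helps to know what each backbone stage should produce. The sketch below assumes the standard ResNet-50/101 layout (nominal strides 4, 8, 16, 32 and 256, 512, 1024, 2048 channels for res2 to res5, with DC5 reducing the res5 stride to 16) and approximates the spatial size as the input size divided by the stride, rounded up. The helper name and the 800x1216 input are illustrative choices, not taken from the repository.

```python
import math

# Nominal (stride, channels) of ResNet-50/101 stages with bottleneck blocks.
STAGES = {
    "res2": (4, 256),
    "res3": (8, 512),
    "res4": (16, 1024),
    "res5": (32, 2048),
}


def expected_feature_shape(stage, height, width, dilated_res5=False):
    """Return the approximate (channels, H, W) of a backbone stage output."""
    stride, channels = STAGES[stage]
    if stage == "res5" and dilated_res5:
        stride = 16  # DC5: dilation replaces the last stride-2 downsampling
    return channels, math.ceil(height / stride), math.ceil(width / stride)


for stage in ("res2", "res5"):
    c, h, w = expected_feature_shape(stage, 800, 1216)
    print(f"{stage}: channels={c}, size={h}x{w}, tokens={h * w}")
```

For an 800x1216 input, res2 would give a 200x304 map (60800 tokens) against 25x38 (950 tokens) for res5. If the shape printed inside the transformer still shows about 950 positions, or the raw backbone feature still has 2048 channels, the config change did not take effect and the model is still consuming res5.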

luohao123 commented 2 years ago

@tangjiuqi097 Now I get 41.7 mAP, almost the same as C5.

github-actions[bot] commented 2 years ago

This issue is not active for a long time and it will be closed in 5 days. Feel free to re-open it if you have further concerns.

megvii-research / AnchorDETR

Pretrained weights transfer to d2 evaluation result not same #23