airsplay / py-bottom-up-attention

PyTorch bottom-up attention with Detectron2
Apache License 2.0

Results different from the original bottom up attention #15

Closed — zhangchenghua123 closed this issue 4 years ago

zhangchenghua123 commented 4 years ago

Since the original Bottom-Up Attention release only provides GT boxes and features, I ran the model provided here to get the object and attribute categories for those GT boxes. Here is an example:

[image]

When I used your pre-trained model to extract the object and attribute categories of this image for the given GT boxes, I found that some categories differ from those produced by the original Bottom-Up Attention:

[image]

The picture is MSCOCO2014/VAL2014/COCO_val2014_00000039185.jpg. If you have time, could you verify this? I would like to know why the two results differ.
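For reference, this is roughly the given-box extraction I ran (a minimal sketch: the config path, weight file, and box values are placeholders, and the exact attribute-head call may differ in this fork; the repo's demo notebooks are the authoritative reference):

```python
import cv2
import torch
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
from detectron2.structures import Boxes

cfg = get_cfg()
cfg.merge_from_file("configs/VG-Detection/faster_rcnn_R_101_C4_attr_caffemaxpool.yaml")  # assumed config path
cfg.MODEL.WEIGHTS = "faster_rcnn_from_caffe_attr.pkl"  # assumed local checkpoint name
predictor = DefaultPredictor(cfg)
model = predictor.model  # already in eval mode

im = cv2.imread("COCO_val2014_00000039185.jpg")
height, width = im.shape[:2]
image = torch.as_tensor(im.astype("float32").transpose(2, 0, 1))
images = model.preprocess_image([{"image": image, "height": height, "width": width}])
# (No resizing here for brevity; the demo notebooks resize the image and rescale the boxes.)

# GT boxes from the original bottom-up attention release, in (x0, y0, x1, y1) image coordinates.
gt_boxes = Boxes(torch.tensor([[0.0, 0.0, 100.0, 100.0]], device=images.tensor.device))  # placeholder box

with torch.no_grad():
    features = model.backbone(images.tensor)
    # Pool ROI features directly on the given boxes (C4/Res5 head; method names
    # follow stock Detectron2 and may differ slightly in this fork).
    box_features = model.roi_heads._shared_roi_transform(
        [features[f] for f in model.roi_heads.in_features], [gt_boxes]
    )
    pooled = box_features.mean(dim=[2, 3])           # one pooled feature vector per box
    outputs = model.roi_heads.box_predictor(pooled)  # object class scores (and, in this fork, attribute scores)
```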

airsplay commented 4 years ago

Yes. It is caused by the difference between given-GT-box extraction and direct extraction: the two modes do not use the same bounding boxes to compute the features. Given-GT-box extraction pools features on the boxes after box regression, while direct extraction pools features on the boxes before box regression (the RPN proposals).
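Schematically, the difference looks like this (a sketch using stock Detectron2's box-regression utility, not the exact code path in this repo; the box coordinates and deltas are made up for illustration):

```python
import torch
from detectron2.modeling.box_regression import Box2BoxTransform

# Standard Faster R-CNN ROI-head regression weights (cfg.MODEL.ROI_BOX_HEAD.BBOX_REG_WEIGHTS).
box2box = Box2BoxTransform(weights=(10.0, 10.0, 5.0, 5.0))

proposal = torch.tensor([[50.0, 60.0, 180.0, 220.0]])  # an RPN proposal (pre-regression)
deltas = torch.tensor([[0.05, -0.02, 0.10, 0.03]])     # regression deltas predicted for that proposal
refined = box2box.apply_deltas(deltas, proposal)       # the final detection box (post-regression)

# Direct extraction: ROI features are pooled on `proposal`, while `refined` is what
# gets reported as the detection box.
# Given-GT-box extraction: the supplied boxes (the original repo's final, post-regression
# boxes) are pooled directly, so the pooled features and the resulting class/attribute
# predictions can differ slightly.
print(proposal)
print(refined)
```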

I illustrate the reason in the README file: https://github.com/airsplay/py-bottom-up-attention#proof-of-correctness