Does the model always output 100 detection results on one image?

ZM-Zhou commented 2 years ago

Hi, thanks for the great work! When I tried AnchorDETR on my own dataset, I found that PostProcess() in models/anchor_detr.py always selects the top-100 predictions as the final results for an input image. Because the model uses focal loss (which seems to follow Deformable-DETR), there is no non-object class, so all 100 results are reported as objects. Did I misunderstand something? My dataset has only a few objects per image, so the 100 predicted objects lead to a very low precision. In practice, how can I select the 'real' objects from all the outputs? Maybe by setting a confidence threshold on the score?

tangjiuqi097 commented 2 years ago

Hi, this is not related to the focal loss. The model outputs 100 detection results because the COCO metric (mAP) is evaluated on up to 100 detections per image. If you use other metrics, e.g. precision and recall, you can keep only the predictions whose confidence score is higher than a threshold.
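
For illustration, thresholding could look roughly like the sketch below. It assumes the post-processed result for one image is a dict with `scores`, `labels`, and `boxes` entries of equal length, as is common in DETR-style code; that layout, the helper name `filter_by_score`, and the 0.5 default are assumptions made for this example, not something confirmed from the repository. Plain Python lists stand in for tensors so the snippet runs on its own.

```python
def filter_by_score(result, threshold=0.5):
    """Keep only the predictions whose score is strictly above `threshold`.

    `result` is assumed (hypothetically) to map "scores", "labels" and
    "boxes" to parallel sequences for a single image.
    """
    # Indices of the predictions that pass the confidence threshold.
    keep = [i for i, score in enumerate(result["scores"]) if score > threshold]
    # Apply the same selection to every field so they stay aligned.
    return {key: [values[i] for i in keep] for key, values in result.items()}


# Toy stand-in for one image's top-k output (made-up values).
result = {
    "scores": [0.92, 0.40, 0.07],
    "labels": [1, 3, 1],
    "boxes": [[10, 20, 50, 80], [0, 0, 5, 5], [30, 30, 60, 90]],
}
filtered = filter_by_score(result, threshold=0.5)
```

The threshold itself is dataset dependent; one common approach is to sweep it on a validation set and pick the value that gives the precision/recall trade-off you need.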

ZM-Zhou commented 2 years ago

Got it! Thanks for your reply.

megvii-research/AnchorDETR, issue #41