Question about the number of detected boxes

NingYuanLin commented 2 years ago

Hi, author. Thanks for your work! I trained a model and validated it on my own dataset, which has about 30 objects per image on average, and got a very good mAP (~0.90). However, the validation output file (projects/crowd-e2e-sparse-rcnn/output/inference/coco_instances_results.json) contains about 500 bounding boxes per image. I don't understand how the mAP can be this high when the number of predicted boxes is so much larger than the number of ground-truth boxes. Thanks!

yexiguafuqihao commented 2 years ago

Please refer to DETR and Anchor DETR for an analysis of queries. The detector uses 500 queries per image, and each query outputs one box, even when the image contains far fewer objects. Many of those queries therefore land on background and are predicted with a low confidence score, so they can easily be filtered out with a confidence score threshold.
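As a rough illustration of that filtering step, here is a minimal sketch that assumes the results file follows the standard COCO detection results layout (a list of entries with `image_id`, `category_id`, `bbox`, and `score`). The helper names and the 0.5 threshold are hypothetical and not part of this repository; pick the threshold on your own validation data. Since AP ranks detections by score, the low-score boxes sit at the tail of the ranking, which is likely why they barely affect mAP.

```python
import json
from collections import Counter


def load_detections(path):
    """Load a COCO-style results file: a JSON list of detection dicts."""
    with open(path) as f:
        return json.load(f)


def filter_by_score(detections, score_thresh=0.5):
    """Keep only detections whose confidence is at least score_thresh.

    The default of 0.5 is an arbitrary assumption, not a recommended value.
    """
    return [d for d in detections if d["score"] >= score_thresh]


def boxes_per_image(detections):
    """Count how many boxes each image_id has."""
    return Counter(d["image_id"] for d in detections)


# Tiny in-memory example standing in for coco_instances_results.json.
example = [
    {"image_id": 1, "category_id": 1, "bbox": [10.0, 20.0, 30.0, 40.0], "score": 0.92},
    {"image_id": 1, "category_id": 1, "bbox": [0.0, 0.0, 5.0, 5.0], "score": 0.03},
    {"image_id": 2, "category_id": 1, "bbox": [50.0, 60.0, 20.0, 20.0], "score": 0.71},
    {"image_id": 2, "category_id": 1, "bbox": [1.0, 2.0, 3.0, 4.0], "score": 0.01},
]

kept = filter_by_score(example, score_thresh=0.5)
counts_before = boxes_per_image(example)
counts_after = boxes_per_image(kept)
```

To run it on the real output, replace `example` with `load_detections("projects/crowd-e2e-sparse-rcnn/output/inference/coco_instances_results.json")` and compare `counts_before` with `counts_after`.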

NingYuanLin commented 2 years ago

Thanks for your detailed explanation.
I will check out these two articles.
Thanks.

megvii-research / Iter-E2EDET

Question about the number of detected boxes #5