andres-fr / realtime-pose-estimation

Real-time, multi-person human keypoint estimation

Map between body keypoints of persons from annotations and predictions of body keypoints from code, when there are more than two persons #2

Open sana15 opened 3 years ago

sana15 commented 3 years ago

In the val2017 images, a single image often contains many people. For example, the image with id "304404" shows about 30 people, but annotations are present for only 13 of them, and my code predicted 23. How can I map the keypoints of the people in the annotation file to the predictions from my code? Any help would be appreciated, thanks in advance.

andres-fr commented 3 years ago

Hi Sana,

Thank you very much for the issue. Unfortunately I won't have time to get back to the system for a few weeks, but I hope I can help you anyway.

Are you using the Python COCO API for evaluation? https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocotools/cocoeval.py Did you check the COCO paper? https://arxiv.org/pdf/1405.0312.pdf

In section 4.3, it says: "For the purpose of evaluation, areas marked as crowds will be ignored and not affect a detector’s score".

They are talking about segmentation there, but my guess, without having looked at the code in much detail, is that the keypoint metrics also ignore the crowds, and that anything above ~15 people is simply masked out for practical reasons. This means it is not a problem that your system finds the extra people: only the annotated people count towards the evaluation. If you aren't familiar with the masks, I would encourage you to go through the COCO webpage or the papers to see how they work.
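On the mapping question itself: as far as I understand cocoeval.py, predictions are paired with annotations per image by object keypoint similarity (OKS), going through the predictions from highest to lowest score and giving each one the best still-free annotation above a threshold. Below is a minimal sketch of that idea, not the actual pycocotools code. The names `oks` and `match_people` and the input dict layout are my own assumptions for illustration, and the sketch leaves out crowd regions, ignored annotations and the fallback used when an annotation has no labeled keypoints.

```python
import math

# Per-keypoint constants used by the COCO keypoint evaluation (17 body joints,
# in COCO order: nose, eyes, ears, shoulders, elbows, wrists, hips, knees, ankles).
COCO_SIGMAS = [0.026, 0.025, 0.025, 0.035, 0.035, 0.079, 0.079, 0.072, 0.072,
               0.062, 0.062, 0.107, 0.107, 0.087, 0.087, 0.089, 0.089]


def oks(gt_kpts, dt_kpts, gt_area, sigmas=COCO_SIGMAS):
    """Object keypoint similarity between one annotated and one predicted person.

    Both keypoint lists are flat COCO-style lists [x1, y1, v1, x2, y2, v2, ...].
    Only keypoints labeled in the annotation (v > 0) contribute.
    """
    total, labeled = 0.0, 0
    for i, sigma in enumerate(sigmas):
        gx, gy, gv = gt_kpts[3 * i:3 * i + 3]
        px, py = dt_kpts[3 * i], dt_kpts[3 * i + 1]
        if gv > 0:
            d2 = (gx - px) ** 2 + (gy - py) ** 2
            # Small epsilon guards against a zero annotation area.
            total += math.exp(-d2 / (2.0 * (gt_area + 1e-12) * (2.0 * sigma) ** 2))
            labeled += 1
    return total / labeled if labeled else 0.0


def match_people(gts, dts, threshold=0.5):
    """Greedily pair predictions with annotations (simplified, single threshold).

    gts: list of dicts with 'keypoints' and 'area' (hypothetical layout).
    dts: list of dicts with 'keypoints' and 'score' (hypothetical layout).
    Returns {prediction index: annotation index}; unmatched predictions are absent.
    """
    # Highest-scoring predictions get to pick their annotation first.
    order = sorted(range(len(dts)), key=lambda j: -dts[j]["score"])
    matches, taken = {}, set()
    for j in order:
        best_oks, best_g = threshold, None
        for g, gt in enumerate(gts):
            if g in taken:
                continue
            s = oks(gt["keypoints"], dts[j]["keypoints"], gt["area"])
            if s >= best_oks:
                best_oks, best_g = s, g
        if best_g is not None:
            matches[j] = best_g
            taken.add(best_g)
    return matches
```

With 13 annotations and 23 predictions, at most 13 predictions end up in the returned mapping. The remaining ones are either false positives or, if my guess about the crowd regions holds, fall into an ignored area and would not be penalized by the real evaluation.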

Could you send a screenshot of the corresponding mask, something like the first image here: https://aferro.dynu.net/work/human_pose_estimation/ ? That way we could confirm this guess. Alternative explanations would require a more careful look at the code and the COCO paper.
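If a screenshot is inconvenient, the annotation file can also be checked directly. The following is a rough sketch, assuming the standard COCO keypoint annotation fields `image_id`, `iscrowd` and `num_keypoints`. The function names and the file path are my own assumptions, so please adjust them to your setup.

```python
import json
from collections import Counter


def load_annotations(path):
    """Parse a COCO annotation file into a dict."""
    with open(path) as f:
        return json.load(f)


def summarize_image_annotations(coco, image_id):
    """Count how the person annotations of one image are labeled.

    coco: the parsed annotation JSON (a dict with an "annotations" list).
    """
    counts = Counter()
    for ann in coco["annotations"]:
        if ann["image_id"] != image_id:
            continue
        counts["total"] += 1
        if ann.get("iscrowd", 0):
            # Crowd regions cover groups of people that were not labeled individually.
            counts["crowd"] += 1
        elif ann.get("num_keypoints", 0) > 0:
            counts["with_keypoints"] += 1
        else:
            counts["without_keypoints"] += 1
    return dict(counts)


if __name__ == "__main__":
    # Assumed location of the val2017 keypoint annotations; change as needed.
    coco = load_annotations("annotations/person_keypoints_val2017.json")
    print(summarize_image_annotations(coco, 304404))
```

A non-zero "crowd" count for the image would support the guess above. As far as I can tell from cocoeval.py, annotations without any labeled keypoints are also ignored in the keypoint evaluation, which is why they are counted separately here.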