michalfaber / keras_Realtime_Multi-Person_Pose_Estimation

Keras version of Realtime Multi-Person Pose Estimation project

Why do you take only one person? #127

Open · rafikg opened 5 years ago

rafikg commented 5 years ago

Hi @michalfaber https://github.com/michalfaber/keras_Realtime_Multi-Person_Pose_Estimation/blob/b595cfbb35dffe5abe2cb4cf6a1bde4d0986d125/training/dataflow.py#L245-L250 I am not sure I understand what you are doing here. You process all the persons in each image, and after some filters (area, number of keypoints, ...) you can end up with more than one person in the persons list. However, you take only the first person (persons[0]) together with all the keypoints (keypoints) and throw away the rest of the persons. In the end, this means we have only one bbox, one scale, and one center, but more than one person represented by all_joints. It is really confusing! Could you explain, please?
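
To make the asymmetry concrete, here is a minimal, self-contained sketch of the pattern as described above. The `Person` class and its field names are hypothetical stand-ins for illustration, not the repository's actual types:

```python
from dataclasses import dataclass

@dataclass
class Person:
    bbox: tuple    # (x, y, w, h)
    scale: float
    center: tuple  # (x, y)
    joints: list   # list of (x, y, visibility) keypoints

# Suppose two persons survive the area / keypoint-count filters.
persons = [
    Person((10, 10, 50, 120), 1.2, (35, 70), [(20, 30, 2), (25, 40, 2)]),
    Person((80, 15, 40, 100), 1.0, (100, 65), [(90, 35, 2), (95, 45, 2)]),
]

# The pattern being questioned: the metadata comes from persons[0] only,
# while the joints of every filtered person are kept.
main = persons[0]
bbox, scale, center = main.bbox, main.scale, main.center
all_joints = [p.joints for p in persons]

print(bbox, scale, center)  # one bbox, one scale, one center
print(len(all_joints))      # 2 -> joints of ALL persons remain
```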

michalfaber commented 5 years ago

Hi @DeeperDeeper The idea is that we sort the visible persons by area, which we get from the annotations. We skip a person if the distance to the previous one (in this sorted list) is too small. The COCO dataset contains keypoints of persons with quite large scale differences, which the model does not handle well. Taking only the most significant person in the image is the easiest simplification. Of course, there is the option of upscaling the other persons to reduce the scale-difference problem. I haven't explored this path yet, but the code is ready for additional per-person preprocessing.
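
A minimal sketch of that selection heuristic, assuming "distance" means the spatial distance between person centers; that reading, the function name `select_main_person`, and the threshold value are assumptions, not the repository's code:

```python
import math

def select_main_person(persons, min_center_dist=80.0):
    """Sort persons by annotated area (largest first), drop any person
    whose center is too close to an already-kept one, and treat the
    first survivor as the main person."""
    ranked = sorted(persons, key=lambda p: p["area"], reverse=True)
    kept = []
    for p in ranked:
        too_close = any(
            math.dist(p["center"], q["center"]) < min_center_dist
            for q in kept
        )
        if not too_close:
            kept.append(p)
    return kept[0], kept  # main person + all kept persons

persons = [
    {"area": 9000.0, "center": (120.0, 200.0)},
    {"area": 8800.0, "center": (125.0, 205.0)},  # too close, skipped
    {"area": 3000.0, "center": (400.0, 180.0)},
]
main, kept = select_main_person(persons)
print(main["area"], len(kept))  # 9000.0 2
```

Under this reading, "taking only the most significant person" means only `kept[0]` drives the bbox/scale/center used for cropping, which matches the behavior questioned above.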