ZheC / Realtime_Multi-Person_Pose_Estimation

Code repo for realtime multi-person pose estimation in CVPR'17 (Oral)

Label generation from keypoints: process flow and shape #26

Closed priyapaul closed 7 years ago

priyapaul commented 7 years ago

@ZheC label_vec and label_heat should have the same shape to calculate the loss. In this file (https://github.com/CMU-Perceptual-Computing-Lab/caffe_train/blob/master/src/caffe/cpm_data_transformer.cp), putVecMaps and putGaussianMap create the PAFs and confidence maps respectively. What are their shapes? I think these are stored in the transformed label; what is its shape? How are they connected?

Is the following process flow correct? keypoints from dataset --> Gaussian maps, PAFs --> transform_label --> sliced into label_vec, label_heat

Is there any way to understand this process better? I have the paper, but creating labels from mere keypoints confuses me, particularly the PAF generation. Thanks in advance.
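As a rough illustration of the keypoints-to-labels step asked about here, the sketch below builds one confidence map and one two-channel PAF on a small grid. It follows the formulation in the paper (a Gaussian peak per keypoint, a unit vector along each limb), not the C++ in cpm_data_transformer; the function names, the 46x46 grid, sigma and limb_width are all assumptions chosen for the example.

```python
import math

def gaussian_map(height, width, center, sigma):
    """Confidence map for one keypoint: exp(-d^2 / (2 * sigma^2)) at each cell."""
    cx, cy = center
    return [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
             for x in range(width)] for y in range(height)]

def paf_map(height, width, joint_a, joint_b, limb_width):
    """Two-channel PAF for one limb: unit vector a->b on the limb, zero elsewhere."""
    ax, ay = joint_a
    bx, by = joint_b
    length = math.hypot(bx - ax, by - ay)
    paf_x = [[0.0] * width for _ in range(height)]
    paf_y = [[0.0] * width for _ in range(height)]
    if length == 0:
        return paf_x, paf_y  # degenerate limb: no direction to encode
    ux, uy = (bx - ax) / length, (by - ay) / length
    for y in range(height):
        for x in range(width):
            dx, dy = x - ax, y - ay
            along = dx * ux + dy * uy          # projection onto the limb axis
            across = abs(dx * uy - dy * ux)    # perpendicular distance to the axis
            if 0.0 <= along <= length and across <= limb_width:
                paf_x[y][x] = ux
                paf_y[y][x] = uy
    return paf_x, paf_y

# Hypothetical example: a horizontal limb from (10, 20) to (30, 20) on a 46x46 grid.
heat = gaussian_map(46, 46, center=(10, 20), sigma=2.0)
paf_x, paf_y = paf_map(46, 46, joint_a=(10, 20), joint_b=(30, 20), limb_width=1.0)
```

Every map here has the same height and width, so the heatmap and PAF labels share spatial dimensions and differ only in channel count (one channel per keypoint, two per limb). In the paper, confidence maps from several people are combined with a per-pixel maximum and overlapping PAFs are averaged; this sketch handles a single person only.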

ZheC commented 7 years ago

The masks are stored in the LMDB. The MSCOCO dataset has precisely annotated segmentation for individual people in the image and rough segmentation for crowds. We generate the binary mask using the segmentation for people who do not have keypoint annotations. The code for generating masks and LMDB is here: https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation/tree/master/training
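A minimal sketch of the mask idea described above, assuming each person's segmentation has already been decoded into a binary grid. The helper names, the dictionary layout and the polarity (0 over people without keypoints, 1 elsewhere) are assumptions for illustration; the repository's own mask-generation code in the linked training folder is the reference.

```python
def rect_segmentation(height, width, x0, y0, x1, y1):
    """Hypothetical stand-in for a decoded segmentation: 1 inside [x0, x1) x [y0, y1)."""
    return [[1 if (x0 <= x < x1 and y0 <= y < y1) else 0 for x in range(width)]
            for y in range(height)]

def build_miss_mask(height, width, people):
    """Return a mask that is 0 over people lacking keypoints and 1 elsewhere."""
    mask = [[1] * width for _ in range(height)]
    for person in people:
        if person["num_keypoints"] > 0:
            continue  # annotated people provide labels, so their pixels stay 1
        seg = person["segmentation"]
        for y in range(height):
            for x in range(width):
                if seg[y][x]:
                    mask[y][x] = 0
    return mask

# Hypothetical 4x6 image with one annotated and one unannotated person.
people = [
    {"num_keypoints": 17, "segmentation": rect_segmentation(4, 6, 0, 0, 2, 4)},
    {"num_keypoints": 0, "segmentation": rect_segmentation(4, 6, 4, 1, 6, 3)},
]
miss_mask = build_miss_mask(4, 6, people)
```

Multiplying a per-pixel loss by such a mask would keep the network from being penalized for detecting people whose keypoints were never labeled.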