The masks are stored in the LMDB. The MSCOCO dataset has precise segmentation annotations for individual people and rough segmentations for crowds. We generate the binary mask from the segmentations of people who do not have keypoint annotations. The code for generating the masks and the LMDB is here: https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation/tree/master/training
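For reference, here is a minimal sketch of how such a mask could be assembled with pycocotools; the annotation path and the aggregation into a single mask are assumptions for illustration, not the repo's exact script:

```python
# Hypothetical sketch: build a binary "ignore" mask for one COCO image,
# covering crowd regions and people without keypoint annotations.
import numpy as np
from pycocotools.coco import COCO

coco = COCO('annotations/person_keypoints_train2014.json')  # placeholder path
img_id = coco.getImgIds(catIds=coco.getCatIds(catNms=['person']))[0]
img_info = coco.loadImgs(img_id)[0]
anns = coco.loadAnns(coco.getAnnIds(imgIds=img_id))

mask_out = np.zeros((img_info['height'], img_info['width']), dtype=np.uint8)
for ann in anns:
    # Crowds only have rough segmentation; solo people without keypoints
    # cannot be used as positives, so both go into the mask.
    if ann['iscrowd'] or ann['num_keypoints'] == 0:
        mask_out |= coco.annToMask(ann).astype(np.uint8)
```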
@ZheC label_vec and label_heat must match the corresponding network outputs in shape to compute the loss. In this file (https://github.com/CMU-Perceptual-Computing-Lab/caffe_train/blob/master/src/caffe/cpm_data_transformer.cpp), putVecMaps and putGaussianMaps create the PAFs and confidence maps respectively; what are their shapes? I believe both are written into the transformed label; what is its shape, and how are the pieces connected?
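To make the question concrete, here is how I currently picture those two functions, as a rough Python sketch of the paper's definitions (sigma, the limb half-width `thresh`, and all names are my guesses, not the repo's code):

```python
import numpy as np

def gaussian_map(heatmap, center, sigma=7.0):
    """Write a 2D Gaussian around one keypoint (x, y), keeping the per-pixel
    max so peaks from different people do not blur together."""
    h, w = heatmap.shape
    ys, xs = np.mgrid[0:h, 0:w]
    d2 = (xs - center[0]) ** 2 + (ys - center[1]) ** 2
    np.maximum(heatmap, np.exp(-d2 / (2.0 * sigma ** 2)), out=heatmap)

def vec_map(paf_x, paf_y, count, p1, p2, thresh=1.0):
    """Write the unit vector p1->p2 into every pixel within `thresh` of the
    limb segment; `count` tracks overlaps so they can be averaged later."""
    h, w = paf_x.shape
    v = np.asarray(p2, float) - np.asarray(p1, float)
    norm = np.linalg.norm(v)
    if norm == 0:
        return
    v /= norm
    ys, xs = np.mgrid[0:h, 0:w]
    dx, dy = xs - p1[0], ys - p1[1]
    along = dx * v[0] + dy * v[1]          # projection onto the limb axis
    perp = np.abs(dx * v[1] - dy * v[0])   # distance from the limb line
    on_limb = (along >= 0) & (along <= norm) & (perp <= thresh)
    paf_x[on_limb] += v[0]
    paf_y[on_limb] += v[1]
    count[on_limb] += 1
```

If that is right, each keypoint type gets one heatmap channel (max over people) and each limb gets two PAF channels (the accumulated vectors divided by `count` where it is nonzero).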
Is the following processing flow correct? keypoints from the dataset --> Gaussian maps, PAFs --> transformed label --> sliced into label_vec and label_heat (see the sketch below).
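My mental model of that layout as a numpy sketch; the channel counts assume COCO (18 keypoints + background = 19 heatmap channels, 19 limbs x 2 = 38 PAF channels), and the 114-channel layout with its slice points is my reading of the train prototxt, so please correct me if wrong:

```python
import numpy as np

H, W = 46, 46                  # label resolution, assuming 368x368 input at stride 8
label = np.zeros((114, H, W))  # what cpm_data_transformer emits per sample (assumed)

vec_weight  = label[0:38]      # mask replicated over the PAF channels
heat_weight = label[38:57]     # mask replicated over the heatmap channels
vec_temp    = label[57:95]     # PAFs written by putVecMaps
heat_temp   = label[95:114]    # confidence maps written by putGaussianMaps

label_vec  = vec_weight  * heat_or(vec_temp)  if False else vec_weight * vec_temp
label_heat = heat_weight * heat_temp
# label_vec / label_heat then match the two network output blobs shape-for-shape,
# so the L2 loss can be computed channel by channel.
```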
Is there any way to understand this process better? I have read the paper, but the creation of the labels from mere keypoints confuses me, particularly the PAF generation. Thanks in advance!