ZheC / Realtime_Multi-Person_Pose_Estimation

Code repo for realtime multi-person pose estimation in CVPR'17 (Oral)

Data structure in lmdb #96

Open michalfaber opened 7 years ago

michalfaber commented 7 years ago

I am trying to understand what is going on in the cpm_data_transformer. I've found that each image is stored in lmdb as a 6 x width x height data structure: 3 image channels, metadata, miss mask and all mask. Why is the metadata encoded in such a big buffer? Is it for performance reasons? Did you try generating an augmented dataset with just images and labels (vec and heat) before training?

anatolix commented 6 years ago

My wild guess is that they were writing a custom Caffe layer and were limited in the types of data the layer could receive, so they encoded everything in an additional image channel.
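
To make the idea concrete, here is a minimal sketch of how metadata could be smuggled through an image-shaped buffer. It is not the repo's actual layout: the function names `pack_sample` / `unpack_sample`, the 4-byte length prefix, and the (6, H, W) channel-first ordering are all assumptions made for illustration.

```python
import struct

import numpy as np


def pack_sample(image, miss_mask, all_mask, meta_bytes):
    """Pack one sample into a (6, H, W) uint8 array.

    image: (H, W, 3) uint8; miss_mask, all_mask: (H, W) uint8;
    meta_bytes: arbitrary bytes (hypothetical metadata payload).
    """
    h, w, _ = image.shape
    if len(meta_bytes) + 4 > h * w:
        raise ValueError("metadata does not fit in one channel")

    # Hypothetical layout: 4-byte little-endian length, then the payload,
    # zero-padded to fill a whole H x W plane.
    meta_plane = np.zeros(h * w, dtype=np.uint8)
    header = struct.pack("<I", len(meta_bytes))
    blob = np.frombuffer(header + meta_bytes, dtype=np.uint8)
    meta_plane[: blob.size] = blob

    return np.concatenate(
        [
            image.transpose(2, 0, 1),      # channels 0-2: image
            meta_plane.reshape(1, h, w),   # channel 3: metadata bytes
            miss_mask[None],               # channel 4: miss mask
            all_mask[None],                # channel 5: all mask
        ],
        axis=0,
    )


def unpack_sample(packed):
    """Inverse of pack_sample: returns (image, miss_mask, all_mask, meta_bytes)."""
    image = packed[:3].transpose(1, 2, 0)
    flat = packed[3].reshape(-1)
    (n,) = struct.unpack("<I", flat[:4].tobytes())
    meta_bytes = flat[4 : 4 + n].tobytes()
    return image, packed[4], packed[5], meta_bytes
```

In this sketch the metadata plane always costs H x W bytes no matter how small the payload is, which is the overhead the question is about; the upside is that the whole sample travels as a single image-like blob.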