ZheC / Realtime_Multi-Person_Pose_Estimation

Code repo for realtime multi-person pose estimation in CVPR'17 (Oral)

Is it possible to remove the PAF branch? #71

Open zhengthomastang opened 7 years ago

zhengthomastang commented 7 years ago

Hi, I am a PhD student at the University of Washington. I am currently working on a project that requires realtime 2D pose estimation (for a single person) on a mobile platform, so I need to reduce computation as much as possible. Since there is always only one person in our scenes, we think there is no need for the PAF branch to distinguish body parts across multiple people.

How can we modify "setLayers.py" in training to achieve this (remove the PAF branch)? I tried removing all '$' and '@' layers and changing all 'C2' and 'L2' layers into 'C' and 'L' respectively, but it still fails (the bottom shapes do not match). Do I need to modify anything other than "setLayers.py"?
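For what it's worth, the kind of surgery being attempted can also be sketched as a plain text filter over the generated prototxt rather than over "setLayers.py" itself. This is only an illustrative sketch, not the repo's actual script; the 'L1'/'vec' name markers are an assumption about how the PAF-branch layers are named:

```python
# Illustrative sketch: drop every "layer { ... }" block from a Caffe
# prototxt whose name marks it as part of the PAF branch. The markers
# 'L1' and 'vec' are assumed naming conventions, not verified ones.
import re

def drop_paf_layers(prototxt: str) -> str:
    """Remove layer blocks whose name contains a PAF marker."""
    # Split at the start of each top-level "layer {" block.
    blocks = re.split(r'(?=^layer \{)', prototxt, flags=re.M)
    kept = [b for b in blocks
            if not re.search(r'name:\s*"[^"]*(L1|vec)[^"]*"', b)]
    return ''.join(kept)
```

Note this only removes layers; any remaining layer whose `bottom` referenced a deleted blob would still have to be rewired by hand, which is likely the source of the "bottom shapes do not match" error.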

P.S. I have tested "Convolutional Pose Machines" as well, but the performance and computational efficiency are not as good, so I want to try modifying OpenPose if possible.

RuWang15 commented 7 years ago

I am also trying to do this. I deleted all the 'L1' layers and 'vec' layers in the prototxt, and I want to train it on single-person data. But I failed when using the MPI lmdb generated in the 'Convolutional Pose Machines' repo. Have you made any progress? Please let me know, thank you!

ouceduxzk commented 6 years ago

To make it run on mobile, you need to think about model compression, quantization, or using another backbone like MobileNet instead of VGG-19. The PAF branch is necessary in this bottom-up approach.
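To put a rough number on the MobileNet suggestion: MobileNet's building block, the depthwise-separable convolution, replaces one standard k×k conv with a k×k depthwise conv plus a 1×1 pointwise conv. A small sketch with hypothetical channel counts (not taken from either network):

```python
# Parameter counts (weights only, bias ignored) for a standard conv
# versus a depthwise-separable conv. Shapes below are hypothetical.

def standard_conv_params(k, c_in, c_out):
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # k*k depthwise filter per input channel, then a 1x1 pointwise conv.
    return k * k * c_in + c_in * c_out

k, c_in, c_out = 3, 256, 256
std = standard_conv_params(k, c_in, c_out)        # 589824
sep = depthwise_separable_params(k, c_in, c_out)  # 67840
print(std, sep, round(std / sep, 1))              # roughly 8.7x fewer params
```

The same factor applies per-position to multiply-accumulates, which is why swapping the backbone tends to matter more on mobile than trimming one output branch.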

anatolix commented 6 years ago

If you just want to remove layers, you can do it in this project (in the new-generation branch): https://github.com/anatolix/keras_Realtime_Multi-Person_Pose_Estimation/blob/new-generation/config.py Just set limb_from and limb_to to [].

But: 1) I am unsure what the result will be; maybe the PAFs influence quality. 2) I am currently training a net with only 2 layers ["HeadCenter", "Background"], and training takes almost the same time as the full network; I think that is because 90% of the work is in the VGG backbone.
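The "most of the work is VGG" intuition can be sanity-checked with a back-of-the-envelope multiply-accumulate count. The helper below is generic; the two example layers (an early VGG-style 3×3 conv at input resolution versus a stage-style 7×7 conv on the downsampled feature map) use assumed shapes, not the exact OpenPose configuration:

```python
def conv_macs(h, w, k, c_in, c_out):
    """Multiply-accumulates for a k x k conv producing an h x w output map
    (stride 1, 'same' padding, bias ignored)."""
    return h * w * k * k * c_in * c_out

# Assumed shapes for illustration only.
early_vgg = conv_macs(368, 368, 3, 64, 64)   # high-resolution backbone conv
stage_conv = conv_macs(46, 46, 7, 128, 128)  # low-resolution stage conv
print(early_vgg, stage_conv)
```

Because the backbone convs run at full input resolution (h*w is ~64x larger before the poolings), removing a few stage layers barely moves the total, which matches the observation that training time stays almost the same.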