ghost opened this issue 3 years ago
Hi @tareq992403 ,
Thanks for reaching out!
You’re correct! This project is largely a combination of the ideas in each paper, with some other additional architecture tricks applied on the head of the model.
Please let me know if this helps or if you have any questions.
Best, John
Hi @jaybdub,
Thanks for the quick reply. Since the code does not have detailed comments, could you please share any documentation or blog post on this implementation? That would help us understand the code better.
Thanks, Tareq
After the pretrained ResNet18 backbone, do you feed the features into the two-branch multi-stage CNN for heatmap and PAF generation, as shown in Fig. 3 of "Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields" (https://arxiv.org/pdf/1611.08050.pdf)?
I am confused. Can anyone help? I have seen several requests for an explanation of this architecture, so this would help many of us.
Thanks, Tareq
@tareq992403 you should read this article https://www.geeksforgeeks.org/openpose-human-pose-estimation-method/
@tucachmo2202 Thanks for the link. The tutorial you shared explains the paper "Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields" (https://arxiv.org/pdf/1611.08050.pdf).
However, this trt_pose GitHub implementation is DIFFERENT from that paper. As the author of this GitHub repo, John, mentioned in this thread: "This project is largely a combination of the ideas in each paper, with some other additional architecture tricks applied on the head of the model."
This work is a mixture of "Simple Baselines for Human Pose Estimation and Tracking" (https://arxiv.org/pdf/1804.06208.pdf) and "Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields" (https://arxiv.org/pdf/1611.08050.pdf).
I am having a hard time understanding this implementation. After the pretrained ResNet18 backbone, I don't see the two-branch multi-stage CNN for heatmap and PAF generation in the code. Can anyone help?
@tareq992403 you should look at the common.py and resnet.py files. The model returns two branches.
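Roughly, the head sits on top of the backbone features and produces two tensors, one for the part confidence maps (heatmaps) and one for the PAFs. Here is a minimal sketch of that idea; the class names and channel counts are my own guesses, the real modules live in common.py / resnet.py:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

# Simplified sketch only -- names and channel counts are my own guesses,
# not the actual trt_pose modules from common.py / resnet.py.
class TwoBranchHead(nn.Module):
    def __init__(self, in_channels, cmap_channels, paf_channels):
        super().__init__()
        # one branch predicts part confidence maps (heatmaps),
        # the other predicts part affinity fields (PAFs)
        self.cmap_conv = nn.Conv2d(in_channels, cmap_channels, kernel_size=1)
        self.paf_conv = nn.Conv2d(in_channels, paf_channels, kernel_size=1)

    def forward(self, features):
        return self.cmap_conv(features), self.paf_conv(features)

class PoseSketch(nn.Module):
    def __init__(self, num_parts=18, num_links=21):
        super().__init__()
        backbone = resnet18(pretrained=True)
        # keep only the conv stages, drop the avgpool / fc classifier
        self.backbone = nn.Sequential(*list(backbone.children())[:-2])
        self.head = TwoBranchHead(512, num_parts, 2 * num_links)

    def forward(self, x):
        features = self.backbone(x)   # (N, 512, H/32, W/32)
        return self.head(features)    # (cmap, paf) -- the two branches

model = PoseSketch()
cmap, paf = model(torch.randn(1, 3, 224, 224))
print(cmap.shape, paf.shape)
```

The exact number of parts and links comes from the topology JSON in the repo, so treat the defaults above as placeholders.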
@tucachmo2202 Thanks, it is making more sense.
Here is my understanding: we load the pre-trained weights into the resnet18 backbone, and on top of the backbone there are two CNN heads that generate the heatmaps and the PAFs.
My question is: where are we TRAINING these two CNN heads that sit on top of the resnet18 backbone?
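To make my question concrete, this is what I imagine the training step would look like (just my own sketch with made-up names, not the repo's actual training script): both heads are optimized jointly with the backbone by regressing the ground-truth heatmaps and PAFs. Is this roughly what happens?

```python
import torch
import torch.nn.functional as F

# My own sketch of a joint training step (not the repo's train code):
# the two heads and the backbone are updated together from one loss.
model = PoseSketch()  # the illustrative model from the sketch above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(images, cmap_target, paf_target):
    cmap_pred, paf_pred = model(images)
    # regress both branches against their ground-truth maps
    loss = F.mse_loss(cmap_pred, cmap_target) + F.mse_loss(paf_pred, paf_target)
    optimizer.zero_grad()
    loss.backward()   # gradients reach the heads AND the pretrained backbone
    optimizer.step()
    return loss.item()
```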
Thanks.
Hi @tareq992403. OpenPose released a newer paper, https://arxiv.org/abs/1812.08008. Are there any advantages of the two-branch approach (https://arxiv.org/pdf/1611.08050.pdf) over the one-branch approach (https://arxiv.org/abs/1812.08008)?
Hi,
I read the paper "Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields" (https://arxiv.org/pdf/1611.08050.pdf) and it does not mention resnet18 or densenet121. I was wondering why this implementation needs a pre-trained resnet18 or densenet121 model.
Or is this implementation based on the paper "Simple Baselines for Human Pose Estimation and Tracking" (https://arxiv.org/pdf/1804.06208.pdf)? The code also seems to use the idea of Part Affinity Fields from the first paper.
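To show where my confusion comes from, this is how I picture the two papers being combined (purely my own illustration with made-up names, not the repo's code): the pretrained backbone plus transposed-convolution upsampling from Simple Baselines, with the two PAF-style output branches on top.

```python
import torch.nn as nn
from torchvision.models import resnet18

# Purely illustrative -- not trt_pose code. "Simple Baselines" contributes a
# pretrained backbone followed by deconvolutions that upsample the features;
# the PAF paper contributes the second output branch for limb affinity fields.
class SimpleBaselinePlusPAF(nn.Module):
    def __init__(self, num_parts=18, num_links=21):
        super().__init__()
        backbone = resnet18(pretrained=True)
        self.backbone = nn.Sequential(*list(backbone.children())[:-2])
        # deconv / upsampling stage, as in Simple Baselines
        self.upsample = nn.Sequential(
            nn.ConvTranspose2d(512, 256, kernel_size=4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(256, 256, kernel_size=4, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        # two outputs, as in the PAF paper
        self.cmap = nn.Conv2d(256, num_parts, kernel_size=1)
        self.paf = nn.Conv2d(256, 2 * num_links, kernel_size=1)

    def forward(self, x):
        f = self.upsample(self.backbone(x))
        return self.cmap(f), self.paf(f)
```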
Can you please clarify?
Thanks for the great work! Tareq