hellojialee / Improved-Body-Parts

Simple Pose: Rethinking and Improving a Bottom-up Approach for Multi-Person Pose Estimation
https://arxiv.org/abs/1911.10529
258 stars, 42 forks

Joints heatmaps #18

Closed: bhack closed this issue 4 years ago

bhack commented 4 years ago

Hi, I want to ask what the multi-person joint heatmaps generated by the heatmapper contain. Is it just a Gaussian around each joint location, so that the same semantic joint (e.g. left shoulder) lies on the same heatmap channel for all human targets in the scene, but at different (x, y) locations?

If so, could you vectorize the joint heatmapper, e.g. with a vectorized Gaussian render? I see many loops in the NumPy code there, so I am wondering whether it could be vectorized with some PyTorch ops.
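As an illustration of the idea, here is a minimal NumPy sketch of a loop-free renderer that puts every person's instance of a joint type on the same channel. It is not the repository's heatmapper: the function name `render_joint_heatmaps`, the `(x, y, visible)` joint layout, the `sigma` value, and the choice to merge overlapping people with a per-pixel maximum are all assumptions made for the example.

```python
import numpy as np


def render_joint_heatmaps(joints, height, width, sigma=2.0):
    """Render one Gaussian per visible joint, one channel per joint type.

    joints: array-like of shape (num_people, num_joints, 3) holding
            (x, y, visible) per joint (hypothetical layout).
    Returns an array of shape (num_joints, height, width).
    """
    joints = np.asarray(joints, dtype=np.float64)
    if joints.shape[0] == 0:
        # No people in the scene: return empty channels.
        return np.zeros((joints.shape[1], height, width))

    xs = np.arange(width, dtype=np.float64)
    ys = np.arange(height, dtype=np.float64)

    # Broadcast pixel coordinates against joint coordinates.
    # dx2 has shape (P, J, 1, W), dy2 has shape (P, J, H, 1).
    dx2 = (xs[None, None, None, :] - joints[:, :, 0, None, None]) ** 2
    dy2 = (ys[None, None, :, None] - joints[:, :, 1, None, None]) ** 2

    gauss = np.exp(-(dx2 + dy2) / (2.0 * sigma ** 2))  # (P, J, H, W)
    gauss *= joints[:, :, 2, None, None]               # zero out invisible joints

    # Assumption: overlapping people are merged with a per-pixel maximum.
    return gauss.max(axis=0)


people = [
    [[10, 12, 1], [30, 5, 1]],   # person 0: joint 0, joint 1
    [[40, 20, 1], [0, 0, 0]],    # person 1: joint 1 is not visible
]
heatmaps = render_joint_heatmaps(people, height=32, width=64)
print(heatmaps.shape)
```

Note that the intermediate array has shape (people, joints, height, width), so this trades memory for the removal of Python loops.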

hellojialee commented 4 years ago

Yes, you are right.

The code already uses Python slicing and the np.outer trick, so it is fast enough for preparing training data. I will consider your advice if I have time. Thank you for sharing!
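For readers unfamiliar with the slice and `np.outer` trick, the following is a rough sketch of how it typically works: because a 2D Gaussian is separable, a small patch can be built as the outer product of two 1D Gaussians and written into a slice of the heatmap. The helper name `put_gaussian`, the `3 * sigma` window radius, and the use of a per-pixel maximum are assumptions for illustration and are not taken from the repository.

```python
import numpy as np


def put_gaussian(heatmap, x, y, sigma=2.0, radius=None):
    """Write one Gaussian peak into `heatmap` in place, touching only a window.

    x, y are integer pixel coordinates of the joint (hypothetical interface).
    """
    h, w = heatmap.shape
    if radius is None:
        radius = int(3 * sigma)  # assumption: truncate the Gaussian at 3 sigma

    # Clip the window to the image borders.
    x0, x1 = max(0, x - radius), min(w, x + radius + 1)
    y0, y1 = max(0, y - radius), min(h, y + radius + 1)
    if x0 >= x1 or y0 >= y1:
        return heatmap  # joint lies entirely outside the image

    # Separable Gaussian: two 1D profiles combined with an outer product.
    gx = np.exp(-((np.arange(x0, x1) - x) ** 2) / (2.0 * sigma ** 2))
    gy = np.exp(-((np.arange(y0, y1) - y) ** 2) / (2.0 * sigma ** 2))
    patch = np.outer(gy, gx)

    # The slice is a view, so writing into it updates `heatmap`.
    window = heatmap[y0:y1, x0:x1]
    np.maximum(window, patch, out=window)
    return heatmap


channel = np.zeros((32, 64))
for (px, py) in [(10, 12), (40, 20)]:  # same joint type for two people
    put_gaussian(channel, px, py)
```

The loop here runs once per joint, but each iteration only touches a small window, which is why this approach is usually cheap enough for data preparation.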

bhack commented 4 years ago

Thanks for your reply. I have another question: have you experimented with spatial soft-argmax for the joints? I am asking because I know this is a reference implementation of your paper, but I recently found an interesting trick in https://arxiv.org/abs/2003.07543 that could also be interesting for joints.

hellojialee commented 4 years ago

No, I have not. Thank you for sharing. It looks like they use a different encoding method to get a more accurate response and to introduce scale awareness. The idea of Gaussian response refinement is revisited frequently. The encoding, heatmap upsampling and direct argmax can already restore the joint locations to within a 1-pixel offset. However, the main bottleneck in our project is object scale variance.
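To make the "direct argmax" step concrete for a multi-person channel, here is a minimal sketch that reads every local maximum above a threshold from one heatmap. It is a generic peak finder, not the repository's decoder: the name `find_peaks`, the 8-neighbour comparison, and the `threshold` value are assumptions, and the upsampling step mentioned above is left out.

```python
import numpy as np


def find_peaks(heatmap, threshold=0.1):
    """Return (x, y, score) for each strict local maximum above `threshold`."""
    h, w = heatmap.shape
    # Pad with -inf so border pixels can be compared with their neighbours.
    padded = np.pad(heatmap, 1, mode="constant", constant_values=-np.inf)

    # Collect the 8 shifted views that correspond to the neighbours.
    neighbours = [
        padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    ]
    neighbour_max = np.max(neighbours, axis=0)

    # Strict comparison: flat plateaus are not reported as peaks.
    is_peak = (heatmap > neighbour_max) & (heatmap > threshold)
    ys, xs = np.nonzero(is_peak)
    return [(int(x), int(y), float(heatmap[y, x])) for y, x in zip(ys, xs)]


# Toy channel with the same joint type for two people.
yy, xx = np.mgrid[0:32, 0:64]
channel = np.maximum(
    np.exp(-((xx - 10) ** 2 + (yy - 12) ** 2) / 8.0),
    np.exp(-((xx - 40) ** 2 + (yy - 20) ** 2) / 8.0),
)
print(find_peaks(channel))
```

Each returned peak is one candidate joint, so several people can share a channel without their locations being mixed together.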

bhack commented 4 years ago

> However, the main bottleneck in our project is object scale variance.

Yes, exactly. That is why I was wondering whether the scale-adaptive soft-argmax operator trick could also cover your scale variance, because the number of (discrete) scale channels could be adapted.

hellojialee commented 4 years ago

Thank you for your advice. And please let me know if you discover more :)

bhack commented 4 years ago

Yes. In their case it is still not clear to me how they handle multi-modal peaks inside the same heatmap channel with a soft-argmax.
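The concern can be illustrated with a plain spatial soft-argmax, which returns the expected coordinate under a softmax over the whole channel. The sketch below is a generic formulation, not the scale-adaptive operator from the linked paper; the name `soft_argmax_2d` and the temperature `beta` are assumptions for the example.

```python
import numpy as np


def soft_argmax_2d(heatmap, beta=50.0):
    """Expected (x, y) under a softmax over all pixels of one channel."""
    h, w = heatmap.shape
    # Subtract the maximum before exponentiating for numerical stability.
    weights = np.exp(beta * (heatmap - heatmap.max()))
    weights /= weights.sum()
    x = (weights.sum(axis=0) * np.arange(w)).sum()
    y = (weights.sum(axis=1) * np.arange(h)).sum()
    return float(x), float(y)


yy, xx = np.mgrid[0:32, 0:64]


def peak(cx, cy, sigma=2.0):
    """Gaussian response centred on (cx, cy) on the toy 32x64 grid."""
    return np.exp(-((xx - cx) ** 2 + (yy - cy) ** 2) / (2.0 * sigma ** 2))


one_person = peak(10, 16)
two_people = np.maximum(peak(10, 16), peak(50, 16))

print(soft_argmax_2d(one_person))   # close to the single peak
print(soft_argmax_2d(two_people))   # lands between the two peaks
```

With one peak the estimate sits on the joint, but with two equally strong peaks the expectation falls roughly halfway between them, at a location where the heatmap response is almost zero. A single soft-argmax per channel therefore cannot separate several people without some extra mechanism, such as applying it per detected region.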
