HuangJunJie2017 / UDP-Pose

Official code of The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation
Apache License 2.0
307 stars, 54 forks

The final result is produced by "gaussian" or "offset"? #15

Closed: fourierer closed this issue 3 years ago

fourierer commented 3 years ago

Sorry to disturb you. The form of the heatmap and the loss depends on the parameter "config.MODEL.TARGET_TYPE". Which target type produced the results in your paper (such as the SOTA numbers): "gaussian" or "offset"?

HuangJunJie2017 commented 3 years ago

@fourierer 'offset', or more exactly 'combined', which means the combination of classification (one heatmap/response map) and regression (two offset maps).
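As a rough illustration of what decoding a 'combined' prediction could look like, here is a minimal NumPy sketch. It is not the repository's code: the function name `decode_combined`, the convention that each offset map stores "true position minus pixel position" in pixels, and the absence of any scaling or blurring are all assumptions made for the example. The actual implementation may normalise the offsets and smooth the maps first.

```python
import numpy as np

def decode_combined(heatmap, offset_x, offset_y):
    """Decode one keypoint from a response map plus two offset maps.

    Hypothetical helper: all three inputs are 2-D arrays of shape (H, W),
    and the offsets are assumed to be in pixels (true position - pixel).
    """
    # Classification branch: pick the pixel with the strongest response.
    py, px = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    # Regression branch: read the sub-pixel correction at that pixel.
    x = px + offset_x[py, px]
    y = py + offset_y[py, px]
    return float(x), float(y)

# Toy example: keypoint at (10.3, 7.6) on a 20 x 24 map.
true_x, true_y = 10.3, 7.6
ys, xs = np.mgrid[0:20, 0:24]
# 0/1 response: pixels within 2 px of the keypoint are positive.
toy_heatmap = ((xs - true_x) ** 2 + (ys - true_y) ** 2 <= 4.0).astype(float)
toy_offset_x = true_x - xs
toy_offset_y = true_y - ys

decoded = decode_combined(toy_heatmap, toy_offset_x, toy_offset_y)
```

In this toy setup any positive pixel decodes to approximately (10.3, 7.6), because the offset at that pixel carries the remaining sub-pixel shift.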

fourierer commented 3 years ago

@HuangJunJie2017 Thank you for your reply. I have two more questions:

(1) In /lib/core/inference.py there is a function called get_final_preds, in which you multiply an inverse Hessian matrix by a gradient vector to estimate the shift between the predicted result and the ground truth. What is the mathematical principle behind this? (In HRNet and SimpleBaseline, the authors use a fixed shift of 0.25.)

(2) In get_final_preds, the GaussianBlur step is done after get_max_preds. What if there is a large outlier in the predicted heatmap? get_max_preds would then pick the wrong point, so I think the GaussianBlur should be applied inside get_max_preds. This is just my own opinion :)
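For context, the fixed 0.25 shift mentioned in question (1) can be sketched as follows. This is a simplified, single-heatmap NumPy version written for illustration, not the HRNet or SimpleBaseline code itself; the function name `quarter_pixel_refine` and the exact border check are assumptions.

```python
import numpy as np

def quarter_pixel_refine(heatmap):
    """Move the argmax a quarter pixel toward the higher neighbour.

    Hypothetical helper operating on one 2-D heatmap of shape (H, W).
    """
    h, w = heatmap.shape
    py, px = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    x, y = float(px), float(py)
    # Only refine when both neighbours exist on each axis.
    if 0 < px < w - 1 and 0 < py < h - 1:
        dx = heatmap[py, px + 1] - heatmap[py, px - 1]
        dy = heatmap[py + 1, px] - heatmap[py - 1, px]
        # The sign of the finite difference gives the direction;
        # the magnitude of the shift is always 0.25 px.
        x += 0.25 * float(np.sign(dx))
        y += 0.25 * float(np.sign(dy))
    return x, y

# Toy Gaussian heatmap centred at (10.3, 7.6), sigma = 2.
ys, xs = np.mgrid[0:20, 0:24]
toy = np.exp(-((xs - 10.3) ** 2 + (ys - 7.6) ** 2) / (2 * 2.0 ** 2))
refined = quarter_pixel_refine(toy)  # (10.25, 7.75)
```

The integer argmax here is (10, 8), and the quarter-pixel shift moves it to (10.25, 7.75), closer to the true centre but with a step size that ignores how strongly the neighbours differ.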

HuangJunJie2017 commented 3 years ago

@fourierer

1. See https://github.com/HuangJunJie2017/UDP-Pose/blob/master/deep-high-resolution-net.pytorch/lib/core/inference.py. The 'inverse Hessian matrix multiplied by a gradient vector' is used in the function 'post', which is used for the gaussian heatmap target rather than the combined target. You can refer to https://github.com/ilovepose/DarkPose for the mathematical principles of this operation.

2. As explained in our paper, for the combined target GaussianBlur is applied first to convert the 0/1-distribution heatmap into a Gaussian-distribution heatmap. The motivation is to make the maximum response lie near the ground-truth position. Applying Gaussian blur after get_max_preds is meaningless.

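The inverse-Hessian step in point 1 can be read as a second-order Taylor expansion of the log-heatmap f around the integer maximum m: f(mu) ≈ f(m) + g^T (mu - m) + 1/2 (mu - m)^T H (mu - m). Setting the derivative to zero gives mu = m - H^(-1) g. The sketch below illustrates that idea with plain finite differences in NumPy. It is not the code of the 'post' function or of DarkPose: the function name `taylor_refine`, the stencils, and the fallback behaviour are assumptions, and the prior Gaussian smoothing is omitted.

```python
import numpy as np

def taylor_refine(heatmap, eps=1e-10):
    """Refine the argmax of one 2-D heatmap to sub-pixel precision.

    Hypothetical helper: returns np.array([x, y]).
    """
    h, w = heatmap.shape
    py, px = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    coord = np.array([px, py], dtype=float)
    # Finite differences need a one-pixel border around the maximum.
    if not (0 < px < w - 1 and 0 < py < h - 1):
        return coord
    # Work in the log domain, where a Gaussian becomes a quadratic.
    f = np.log(np.maximum(heatmap, eps))
    dx = 0.5 * (f[py, px + 1] - f[py, px - 1])
    dy = 0.5 * (f[py + 1, px] - f[py - 1, px])
    dxx = f[py, px + 1] - 2 * f[py, px] + f[py, px - 1]
    dyy = f[py + 1, px] - 2 * f[py, px] + f[py - 1, px]
    dxy = 0.25 * (f[py + 1, px + 1] - f[py + 1, px - 1]
                  - f[py - 1, px + 1] + f[py - 1, px - 1])
    grad = np.array([dx, dy])
    hess = np.array([[dxx, dxy], [dxy, dyy]])
    # Fall back to the integer location if the Hessian is singular.
    if abs(np.linalg.det(hess)) < 1e-12:
        return coord
    # mu = m - H^(-1) g
    return coord - np.linalg.solve(hess, grad)

# Toy Gaussian heatmap centred at (10.3, 7.6), sigma = 2.
ys, xs = np.mgrid[0:20, 0:24]
toy = np.exp(-((xs - 10.3) ** 2 + (ys - 7.6) ** 2) / (2 * 2.0 ** 2))
mu = taylor_refine(toy)
```

On an ideal, noise-free Gaussian the log-heatmap is exactly quadratic, so this recovers the centre (10.3, 7.6) up to floating-point error. On real predictions the result is only an approximation, which is one reason smoothing the heatmap beforehand helps.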
fourierer commented 3 years ago

So if we choose "gaussian" as the TARGET_TYPE, the post function runs after get_max_preds, which is to say the GaussianBlur is done after get_max_preds. By that reasoning, it would be meaningless.

HuangJunJie2017 commented 3 years ago

@fourierer
If we choose "gaussian" as the TARGET_TYPE, GaussianBlur is meaningful and effective for boosting performance, but it does not matter where it is done: before or after get_max_preds is fine either way, because the final precision is determined by the DarkPose post-processing. For the 'combined' target type, blurring the response map after get_max_preds is meaningless because the post-processing after that point uses only the offset maps. Gaussian blur on the offset maps is still useful, as it smooths them.
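The point made earlier in this thread, that for the 'combined' target blurring a 0/1-style response map before taking the maximum pulls the maximum toward the true position, can be illustrated with a toy example. This is a sketch under stated assumptions, not the repository's code: the blur is a hand-written separable Gaussian in NumPy (the helper names `gaussian_blur` and `argmax_xy` are hypothetical), and the response map with a single spurious spike is made up for the demonstration.

```python
import numpy as np

def gaussian_blur(image, sigma=2.0):
    """Separable Gaussian blur with zero padding (hypothetical helper)."""
    radius = int(3 * sigma)
    t = np.arange(-radius, radius + 1, dtype=float)
    kernel = np.exp(-t ** 2 / (2 * sigma ** 2))
    kernel /= kernel.sum()
    # Blur along rows, then along columns.
    out = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="same"), 1, image)
    out = np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="same"), 0, out)
    return out

def argmax_xy(heatmap):
    """Return the (x, y) location of the maximum."""
    py, px = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return int(px), int(py)

# Made-up prediction: a 0/1 disk of radius 3 around the keypoint (12, 9)
# plus one isolated outlier pixel with a higher response.
ys, xs = np.mgrid[0:20, 0:24]
response = ((xs - 12) ** 2 + (ys - 9) ** 2 <= 9).astype(float)
response[2, 2] = 1.5

raw_peak = argmax_xy(response)                     # (2, 2)
blurred_peak = argmax_xy(gaussian_blur(response))  # (12, 9)
```

Without blurring, the maximum lands on the isolated spike. After blurring, the broad disk dominates and the maximum sits at its centre, so the offset maps are read at a sensible location.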

fourierer commented 3 years ago

OK, I got it. Thank you~~