How to understand the criterion?

michalfaber / keras_Realtime_Multi-Person_Pose_Estimation

Keras version of Realtime Multi-Person Pose Estimation project

How to understand the criterion? #48

Open hellojialee opened 6 years ago

hellojialee commented 6 years ago

In demo.ipynb:

```python
score_midpts = np.multiply(vec_x, vec[0]) + np.multiply(vec_y, vec[1])

score_with_dist_prior = sum(score_midpts)/len(score_midpts) \
                        + min(0.5*oriImg.shape[0]/norm-1, 0)

criterion1 = len(np.nonzero(score_midpts > param['thre2'])[0]) > 0.8 * len(score_midpts)
# param['thre2'] = 0.05
criterion2 = score_with_dist_prior > 0
if criterion1 and criterion2:
    connection_candidate.append([i, j, score_with_dist_prior,
                                 score_with_dist_prior+candA[i][2]+candB[j][2]])
```

How should I understand the value of `min(0.5*oriImg.shape[0]/norm-1, 0)`? The two terms (the mean PAF score and this distance term) are added directly. And how should I understand the two criteria?

michalfaber commented 6 years ago

Hi @USTClj Good question. My understanding of the term `min(0.5*oriImg.shape[0]/norm-1, 0)` is that it penalizes candidate connections with a larger distance between the relevant body parts. It makes sense in some scenarios. Let's imagine that there are multiple arms belonging to different people quite close to each other (sorry for my bad photoshop skills). The black and blue candidate connections may have the same score (we are sampling along the connection, which is an approximation of Eq. 10), but obviously the blue one is wrong. If we subtract a small value from the score of the blue candidate, the black ones become stronger candidates.

How to determine this value? The authors of the original implementation decided that connections longer than half of the image height will be penalized. The term is 0 when the length is at most half the height and reaches -0.5 when the length equals the height (it keeps decreasing for even longer connections).
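To make the penalty concrete, here is a minimal sketch that evaluates the same expression for a few connection lengths. The helper name `distance_prior`, the image height of 368, and the mean PAF score of 0.3 are assumptions chosen only for illustration; they are not taken from demo.ipynb.

```python
def distance_prior(length, image_height):
    """Same expression as in the snippet: min(0.5 * H / length - 1, 0)."""
    return min(0.5 * image_height / length - 1, 0)


image_height = 368  # assumed image height, for illustration only

# The term stays at 0 up to half the height, then becomes negative.
for length in (92, 184, 276, 368):
    print(length, distance_prior(length, image_height))

# Two hypothetical candidates with the same mean PAF score (assumed 0.3).
mean_paf_score = 0.3
short_score = mean_paf_score + distance_prior(150, image_height)
long_score = mean_paf_score + distance_prior(300, image_height)

# criterion2 in the snippet is simply "score_with_dist_prior > 0".
print("short passes criterion2:", short_score > 0)
print("long passes criterion2:", long_score > 0)
```

With these assumed numbers the shorter candidate keeps its full score, while the longer one is pushed below zero and would be rejected by criterion2.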

anatolix commented 6 years ago

> The authors of original implementation decided that connections longer than half of the image height will be penalized.

Btw, during training they scale the main person to approximately the image size (368 px in our case), and after that they apply random scaling of 0.6-1.1. So this is quite logical: the network never learned that limbs (and PAFs) could be longer than half of the image.

hellojialee commented 6 years ago

@michalfaber @anatolix Thank you for your great help. Much clearer now. Btw, does criterion 1 judge the consistency of the PAF, i.e. whether the directions of the pixels along the limb agree? Another puzzle is that `score_with_dist_prior` and the heat map values of candA and candB are added directly, without considering their magnitudes. I think they should be weighted by different factors.
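For reference, a small sketch of what criterion 1 computes, using synthetic PAF samples rather than real network output. The function name `paf_alignment`, the sample values, and the 10-point sampling are hypothetical; the threshold 0.05 and the 0.8 fraction are the values from the snippet.

```python
import numpy as np


def paf_alignment(vec_x, vec_y, limb_vec, thre2=0.05, min_fraction=0.8):
    """Return per-sample scores and whether criterion 1 holds.

    vec_x, vec_y: PAF components sampled along the candidate limb.
    limb_vec: vector from candidate A to candidate B (not yet normalized).
    """
    limb_vec = np.asarray(limb_vec, dtype=float)
    unit = limb_vec / np.linalg.norm(limb_vec)
    # Dot product of the sampled PAF with the unit limb direction.
    score_midpts = np.asarray(vec_x) * unit[0] + np.asarray(vec_y) * unit[1]
    n_good = np.count_nonzero(score_midpts > thre2)
    return score_midpts, n_good > min_fraction * len(score_midpts)


limb = (10.0, 0.0)  # hypothetical horizontal limb

# Case 1: the field points along the limb at every sample.
aligned_scores, aligned_ok = paf_alignment(np.full(10, 0.9), np.zeros(10), limb)

# Case 2: strong response at only 3 of 10 samples, zero elsewhere.
patchy_x = np.array([1.0, 1.0, 1.0, 0, 0, 0, 0, 0, 0, 0])
patchy_scores, patchy_ok = paf_alignment(patchy_x, np.zeros(10), limb)

print("aligned:", aligned_scores.mean(), aligned_ok)
print("patchy:", patchy_scores.mean(), patchy_ok)
```

In the second case the mean score is still positive, so the mean alone would not reject it; criterion 1 does, because fewer than 80% of the samples point along the limb.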