Julietty closed this issue 4 years ago
The following implementation is helpful. It doesn't seem to be necessary to use all the key points. https://github.com/terryky/tfjs_webgl_app/tree/master/blazepose_fullbody
It is closed due to lack of progress.
In case this is still of interest: the model card might help a little.
Output(s) 33x3 array corresponding to (x, y, visibility). X, Y coordinates are local to the region of interest and range from [0.0, 255.0]. Visibility is in the range of [min_float, max_float] and after user-applied sigmoid denotes the probability that a keypoint is located within the frame. It does not indicate whether the keypoint is occluded by another body part.
However, various sources state that (a) there are in fact 39 keypoints (33 plus 6 "technical" ones describing center, rotation and ROI), and (b) each keypoint in fact has 4 values (x, y, z, visibility), although the z estimate is currently not well implemented and should not be used in production.
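As a rough illustration of the model card description quoted above, here is a minimal sketch that applies a sigmoid to the raw visibility values and rescales x and y from the [0.0, 255.0] range to [0, 1]. The function name `decode_landmarks`, the (N, 3) layout, and the divisor 255.0 are assumptions taken from the quoted text, not a confirmed API of any released model.

```python
import numpy as np

def sigmoid(v):
    # Logistic function; maps raw visibility values into (0, 1).
    return 1.0 / (1.0 + np.exp(-v))

def decode_landmarks(raw, coord_max=255.0):
    """raw: array of shape (N, 3) holding (x, y, raw visibility) per keypoint.

    coord_max is an assumption based on the [0.0, 255.0] range in the model card.
    """
    raw = np.asarray(raw, dtype=np.float64)
    xy = raw[:, :2] / coord_max   # ROI-local coordinates scaled to [0, 1]
    vis = sigmoid(raw[:, 2])      # probability that the keypoint is within the frame
    return xy, vis

xy, vis = decode_landmarks([[127.5, 255.0, 0.0], [0.0, 51.0, 4.0]])
```

Note that the resulting coordinates are still relative to the region of interest, not to the full image.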
Hi @Laubeee @PINTO0309, I am facing the same problem visualizing those 156 keypoints. Can you please help me out?
I actually get even more: I get 195 outputs, which corresponds to 5 values for each of 39 keypoints. I have not yet found out what the 5th value is for. I have been looking into other things since, but initial display results were more or less OK when interpreting the 195 values as a 39x5 array and taking the first two columns as X/Y values. If you have 156, just use 39x4; it should be fine.
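The reshaping described above could be sketched as follows. This is only an illustration: the helper name `split_keypoints` is hypothetical, and it assumes the output is a flat vector whose length is a multiple of 39.

```python
import numpy as np

def split_keypoints(flat, num_keypoints=39):
    """Reshape a flat landmark output into (num_keypoints, values_per_keypoint)."""
    flat = np.asarray(flat, dtype=np.float32).ravel()
    if flat.size % num_keypoints != 0:
        raise ValueError(
            f"{flat.size} values do not divide into {num_keypoints} keypoints"
        )
    table = flat.reshape(num_keypoints, -1)  # 156 -> (39, 4), 195 -> (39, 5)
    xy = table[:, :2]                        # first two columns taken as X/Y
    return table, xy

# Dummy data standing in for a real 195-value model output.
table, xy = split_keypoints(np.arange(195))
```

With a 156-value output the same call yields a (39, 4) table, so the X/Y extraction does not need to change.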
Cool @Laubeee ! Let me try it then :).
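I have tried plotting the points as suggested by @Laubeee, but it seems the points are not accurate.
def convert_preds_to_xy(preds):
    # preds[2][0] is assumed to be the flat landmark output, 4 values per keypoint.
    kpts = []
    temp = preds[2][0]
    # Take values 0 and 1 of every group of 4 as the (x, y) pair.
    for x, y in zip(temp[::4], temp[1::4]):
        kpts.append((int(x), int(y)))
    return kpts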
When I run the script 11_pose_landmark_full_body_tflite2h5_weight_int_fullint_float16_quant.py, it generates 2 .h5 files:
full_pose_landmark_39kp.h5
full_pose_detection.h5
I am using full_pose_landmark_39kp.h5, and many of the predicted points are negative. I believe I am not using the right weights file. @PINTO0309, can you please share or suggest the right weights file?
Looks like you figured it out in that other thread. Might I ask what the problem was?
Hi @Laubeee! I had 2 problems. You solved the first one: I had 156 values as the predicted output, so I took the X/Y pairs as you suggested and it worked. The second one was visualization. I wrote a code snippet for it, which you can access here: https://github.com/PINTO0309/PINTO_model_zoo/issues/76#issuecomment-788895829
Hi @Laubeee! Perhaps you can help me with my issue. I am training BlazePose on my own dataset (30k labeled images), but after some epochs with early stopping, the resulting model always outputs the same coordinate values on the test set, whatever the input image. What do you think about this? Could you suggest some ways to solve this issue?
Hi! Thank you for your great work! I'm confused about the BlazePose output, which has shape (156,) and contains numbers like:
[ 2.52126999e+02, 6.02675476e+01, 0.00000000e+00, 7.85891479e+02, 1.81462738e+02, 1.63871140e+02, 0.00000000e+00, 7.20563904e+02, 1.45734039e+02, 1.29187439e+02, 0.00000000e+00, 6.98395630e+02, 2.13491409e+02, 1.06495354e+02, 0.00000000e+00, 6.22234802e+02, 2.00276703e+02, -8.05518913e+00, 0.00000000e+00, 8.24156921e+02, 2.26737579e+02, -6.08142509e+01, 0.00000000e+00, 8.02759888e+02, 3.44873230e+02, 1.73129330e+01, 0.00000000e+00, 7.70141296e+02, 1.80556412e+02, 2.63250000e+02, 0.00000000e+00, 5.80166565e+02, 3.85322113e+02, -1.70522858e+02, 0.00000000e+00, 7.14230835e+02, 6.90424728e+01, 1.44556747e+02, 0.00000000e+00, 6.49112549e+02, 1.49639694e+02, 1.06226685e+02, 0.00000000e+00, 7.19676697e+02, 5.28476562e+02, 3.08210297e+02, -1.69180872e+03, 5.81274231e+02, 1.55742020e+02, 3.07312897e+02, -1.85214661e+03, 5.41536011e+02, 9.28919907e+01, 8.45592224e+02, 1.27794141e+03, 2.13438950e+02, 1.48164871e+02, 5.72311401e+02, 1.61038025e+03, 3.95598450e+02, 3.03308502e+02, 3.77423492e+02, -7.68191895e+02, 2.65322083e+02, 4.57906586e+02, 2.28914776e+01, -1.81853455e+03, 4.70178528e+02, 6.31962402e+02, 2.88600311e+02, 0.00000000e+00, 2.67264465e+02, 5.21761108e+02, -1.34646225e+02, 0.00000000e+00, 4.36134796e+02, 5.85655136e+01, -6.74091949e+01, 0.00000000e+00, 2.82592468e+02, 6.13506226e+02, -1.69087173e+02, 0.00000000e+00, 5.03378204e+02, -7.62357254e+01, -9.83965836e+01, 0.00000000e+00, 3.04275391e+02, 6.24550049e+02, -3.54933510e+01, 0.00000000e+00, 4.93076843e+02, 5.50927551e+02, 3.69620392e+02, 7.18008652e+01, 4.75950956e+00, 2.44607971e+02, 4.69858917e+02, -8.90234451e+01, -1.11216354e+02, 2.67619415e+02, 5.19302246e+02, 2.35148270e+02, 2.89389679e+02, 3.65628471e+01, 3.67190094e+02, 4.64473145e+02, 8.65795364e+01, -1.73825119e+02, 6.05172424e+02, 1.30989319e+03, 2.43793793e+02, 1.52740601e+02, 8.13551758e+02, 1.99037500e+03, 1.52300385e+02, -1.58977951e+02, 6.85409485e+02, 0.00000000e+00, 2.29356506e+02, 5.41917419e+02, 8.15468506e+02, 0.00000000e+00, 
1.49431351e+02, 1.45869461e+02, 8.23752747e+02, 0.00000000e+00, 2.91660767e+02, 3.70319244e+02, 1.30325195e+03, 0.00000000e+00, 2.07150436e+02, 4.17732330e+02, 4.39149231e+02, 1.05031455e+00, 1.83442783e+01, 8.89772263e+01, -2.31506516e+02, -7.93924236e+00, 2.01322327e+01, 3.84161530e+02, -4.48814926e+01, 9.11639392e-01, 2.05853485e+02, 2.75295074e+02, 2.07653046e+02, -2.28759232e+01, 2.15975464e+02, 5.05869934e+02, -4.85139046e+01, -1.24414492e+01, 5.11680725e+02, 3.77195312e+02, 9.83366470e+01, -2.68878841e+01, 5.35371887e+02]
I understand that they somehow describe each of the 39 keypoints, but they are not normalized and I don't understand their meaning. Could you please help with this? How can I convert them to ordinary coordinates, or what do they mean? Thanks!
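Going by the model card quoted earlier in the thread, X and Y are local to the region of interest, so one possible conversion to image pixels is to rescale them by the ROI size and add the ROI offset. The sketch below assumes an axis-aligned ROI and ignores any rotation; `roi_to_image`, the `roi_box` tuple, and `coord_max=255.0` are hypothetical names and values, not part of the model or the conversion script. It would not by itself explain raw values far outside [0.0, 255.0].

```python
import numpy as np

def roi_to_image(xy_roi, roi_box, coord_max=255.0):
    """Map ROI-local (x, y) points to image pixel coordinates.

    roi_box is a hypothetical (left, top, width, height) tuple in image pixels.
    coord_max is an assumption based on the [0.0, 255.0] range in the model card.
    """
    xy_roi = np.asarray(xy_roi, dtype=np.float64)
    left, top, width, height = roi_box
    normalized = xy_roi / coord_max  # ROI-local coordinates in [0, 1]
    return normalized * np.array([width, height]) + np.array([left, top])

pts = roi_to_image([[0.0, 0.0], [127.5, 63.75]], roi_box=(100, 50, 512, 256))
```

Here the first two columns of the reshaped 39x4 array would be passed as `xy_roi`, with the ROI box coming from the detection stage.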