last-one / Pytorch_Realtime_Multi-Person_Pose_Estimation

Pytorch version of Realtime Multi-Person Pose Estimation project
MIT License

not sure about correctness of the result #21

Open KaiWU17TUM opened 6 years ago

KaiWU17TUM commented 6 years ago

[result image]

Above is the result I got after running test_pose.py with the converted PyTorch model. There are many blue points, and I'm wondering whether the algorithm is working correctly. What do these blue dots mean here?

My Python version is 3.5, and I've adapted the original code according to another issue ("Sorry, please Python3 version"). I'm running on CPU and it takes about 40 s to finish processing. Do you know how long it takes when using CUDA?

Thank you in advance.

last-one commented 6 years ago

I don't know why there are so many blue (purple?) dots. The purple dot means the right ear. You could try not drawing the eyes and ears; I don't think the model performs well on them.
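A minimal sketch of how one might skip drawing eye/ear detections, as suggested above. The part ids and the `(x, y, score, part_id)` point format here are illustrative assumptions (1-indexed COCO-style ordering, with 15/16 as eyes and 17/18 as ears), not the repo's exact data structures:

```python
# Hypothetical helper: drop eye/ear keypoints before drawing.
# Assumes 1-indexed part ids: 15/16 = eyes, 17/18 = ears (illustrative).
EYE_EAR_PARTS = {15, 16, 17, 18}

def filter_points(points):
    """points: list of (x, y, score, part_id) tuples."""
    return [p for p in points if p[3] not in EYE_EAR_PARTS]

# A nose-like detection is kept; a right-ear detection is dropped.
detections = [(100, 50, 0.9, 1), (120, 40, 0.3, 17)]
print(filter_points(detections))  # [(100, 50, 0.9, 1)]
```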

Test time depends on the test size and the number of candidate dots. In general, with a GPU and the default test size, it takes around 2–5 s per image.

KaiWU17TUM commented 6 years ago

How can this network work in real time when it takes several seconds to process a single frame?

KaiWU17TUM commented 6 years ago

And can I delete something in limbSeq and mapIdx to avoid computing the keypoints for eyes and ears? Which body part does each element in the limb sequence stand for?

lzj322 commented 6 years ago

@KaiWU17TUM, I ran into the same situation. On some pictures, the model I trained with this code predicts a lot of random blue points. I wonder where the bug is. How did you address this issue?

ybai62868 commented 6 years ago

@KaiWU17TUM @lzj322 I ran into the same situation... Do you know how to solve this problem?

guist commented 6 years ago

I also get a similar result with the original image:

[result image]

last-one commented 6 years ago

Make sure the test image has had the mean subtracted and has been divided by the std. Also, pay attention to the padValue.
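For reference, padding the image to a multiple of the network stride with a chosen padValue is a common pattern in these test scripts. The sketch below is a hedged illustration of that idea (the function name and exact behavior are assumptions, not this repo's code):

```python
import numpy as np

def pad_right_down(img, stride, pad_value):
    """Pad the bottom/right of img so H and W become multiples of stride,
    filling the new pixels with pad_value."""
    h, w = img.shape[:2]
    pad_h = (stride - h % stride) % stride
    pad_w = (stride - w % stride) % stride
    return np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)),
                  mode='constant', constant_values=pad_value)

img = np.ones((10, 13, 3), dtype=np.float32)
out = pad_right_down(img, stride=8, pad_value=0.)
print(out.shape)  # (16, 16, 3)
```

A wrong pad_value (e.g. padding with 0 when the preprocessed image is centered around 0.5) can produce spurious responses near the borders, which is why it deserves attention.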

last-one commented 6 years ago

And please make sure limbSeq corresponds to mapIdx.

guist commented 6 years ago

I tried the following normalize functions:

def normalize(origin_img):
    origin_img = np.array(origin_img, dtype=np.float32)
    origin_img -= np.mean(origin_img)
    origin_img /= np.std(origin_img, ddof=1)
    return origin_img

and

def normalize(origin_img):
    origin_img = np.array(origin_img, dtype=np.float32)
    origin_img -= 128.0
    origin_img /= 256.0
    return origin_img

but both give the same result.

I did not change any other variable:

padValue = 0.
limbSeq = [[3,4], [4,5], [6,7], [7,8], [9,10], [10,11], [12,13], [13,14], [1,2], [2,9], [2,12], [2,3], [2,6], \
           [3,17],[6,18],[1,16],[1,15],[16,18],[15,17]]

mapIdx = [[19,20],[21,22],[23,24],[25,26],[27,28],[29,30],[31,32],[33,34],[35,36],[37,38],[39,40], \
          [41,42],[43,44],[45,46],[47,48],[49,50],[51,52],[53,54],[55,56]]

Should I somehow modify them?

last-one commented 6 years ago

The values of limbSeq and mapIdx depend on your own task. limbSeq gives the point pairs for each PAF, and mapIdx gives each PAF's channel indices in the heatmap.

leonlulu commented 6 years ago

When I run the example "ski" image, I get exactly the same result as @guist. How should I modify the code to get the right result? Many thanks.

ybai62868 commented 6 years ago

There are two reasons for this problem:

  1. the value you use for padding: 128 or 0
  2. the order of limbSeq should be consistent with last-one's version rather than the original OpenPose version. @leonlulu

leonlulu commented 6 years ago

Thanks for the quick reply @ybai62868. I didn't change anything in 'test_pose.py' before I ran the test shell script. So let me see... The padding value I use is 0, and limbSeq is the same as last-one's, I guess.

xind commented 6 years ago

Hi @ybai62868 and @leonlulu Do you have any solutions about this issue?

xind commented 6 years ago

I just made it work as expected by replacing some lines with OpenPose's code. Revised parts:

# find connection in the specified sequence, center 29 is in the position 15
limbSeq = [[2,3], [2,6], [3,4], [4,5], [6,7], [7,8], [2,9], [9,10], \
           [10,11], [2,12], [12,13], [13,14], [2,1], [1,15], [15,17], [1,16], \
           [16,18], [3,17], [6,18]]
# the middle joints heatmap correspondence
mapIdx = [[31,32], [39,40], [33,34], [35,36], [41,42], [43,44], [19,20], [21,22], \
          [23,24], [25,26], [27,28], [29,30], [47,48], [49,50], [53,54], [51,52], \
          [55,56], [37,38], [45,46]]

for part in range(19-1):

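As a quick sanity check (my addition, not part of the original comment), one can verify that the two lists above line up one-to-one and that the mapIdx pairs tile the PAF channels 19..56 exactly once:

```python
limbSeq = [[2,3], [2,6], [3,4], [4,5], [6,7], [7,8], [2,9], [9,10],
           [10,11], [2,12], [12,13], [13,14], [2,1], [1,15], [15,17], [1,16],
           [16,18], [3,17], [6,18]]
mapIdx = [[31,32], [39,40], [33,34], [35,36], [41,42], [43,44], [19,20], [21,22],
          [23,24], [25,26], [27,28], [29,30], [47,48], [49,50], [53,54], [51,52],
          [55,56], [37,38], [45,46]]

# One PAF (an x/y channel pair) per limb, and the pairs cover 19..56 exactly.
assert len(limbSeq) == len(mapIdx) == 19
flat = sorted(i for pair in mapIdx for i in pair)
assert flat == list(range(19, 57))
print("limbSeq and mapIdx are consistent")
```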
puppywst commented 6 years ago

@guist @KaiWU17TUM I guess you have taken the official OpenPose model, or a model converted from Caffe. In the official OpenPose, the background is the 18th channel and the nose the 0th, but in this repo the background is the 0th channel. Thus you get many "noses" (actually background points surviving NMS).
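If the only mismatch is this channel ordering, one hedged fix is to move the background channel from the end (official layout) to the front (this repo's layout) before post-processing. This sketch assumes a heatmap of shape (H, W, 19) and that only the background channel needs to move; verify the full ordering against your model before relying on it:

```python
import numpy as np

def reorder_heatmap(heatmap):
    """Move the background channel from index 18 (official OpenPose
    layout) to index 0 (this repo's layout). heatmap: (H, W, 19)."""
    return np.concatenate([heatmap[..., -1:], heatmap[..., :-1]], axis=-1)

hm = np.zeros((4, 4, 19), dtype=np.float32)
hm[..., 18] = 1.0          # background channel in the official layout
hm2 = reorder_heatmap(hm)
print(hm2[..., 0].max())   # background is now channel 0 -> 1.0
```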