garyzhao / SemGCN

The PyTorch implementation for "Semantic Graph Convolutional Networks for 3D Human Pose Regression" (CVPR 2019).
https://arxiv.org/abs/1904.03345
Apache License 2.0

How to convert 2D poses of my own image into the network input? #2

Closed qizhen816 closed 4 years ago

qizhen816 commented 4 years ago

Hi, recently I've been comparing different 3D human pose estimation methods on my own dataset. While applying SemGCN, it seems the method needs 3D ground-truth points and camera parameters to generate the 2D pose input. I tried scaling my 2D human poses according to the image resolution and normalizing them to [-1, 1], but the 3D results are still really bad. Is there any method I can follow to preprocess the 2D poses of my own images? (Replying in Chinese is fine too.)

garyzhao commented 4 years ago

@qizhen816 Thanks for your interest in our work.

I think the main problem is probably the camera settings of Human3.6M. If they are largely different from those of your own dataset, you have to re-train the model on your data, provided you have the 3D ground truth.

Otherwise, you may use a different way to normalize the 2D human poses in Human3.6M to make them compatible with your own data, and then re-train the model on Human3.6M. For example, you can follow the approach in our original paper: crop the pose according to the bounding box and then center the 2D pose w.r.t. the root joint. However, you may lose the scale of the skeleton this way.
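The crop-and-center preprocessing described above could be sketched as follows. This is a minimal illustration, not the repo's actual preprocessing code; the `root_index=0` default (pelvis in the Human3.6M joint ordering) and the choice to scale by half the bounding-box size are assumptions you should adapt to your own skeleton layout.

```python
import numpy as np

def normalize_pose_2d(pose_2d, root_index=0):
    """Center a 2D pose on its root joint and scale by its bounding box.

    pose_2d: (J, 2) array of pixel coordinates.
    root_index: index of the root joint (0 = pelvis in Human3.6M ordering;
    this is an assumption, adjust to your joint layout).
    """
    pose = np.asarray(pose_2d, dtype=np.float32)
    # Center the pose at the root joint (absolute position is discarded).
    pose = pose - pose[root_index]
    # Scale by half the larger side of the tight bounding box, so
    # coordinates land roughly in [-1, 1]. As noted above, the absolute
    # scale of the skeleton is lost in this step.
    bbox_size = (pose.max(axis=0) - pose.min(axis=0)).max()
    return pose / (bbox_size / 2.0)
```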

Please note that if your 2D poses are produced by Stacked Hourglass, you should use the model trained with SH detections. Also, in this repo, the center of the image is [0, 0] when doing normalization.
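The center-at-origin convention mentioned above could look like the sketch below (a common screen-coordinate normalization in Human3.6M pipelines; the function name and exact scaling here are assumptions, so check against the repo's own preprocessing before reusing it):

```python
import numpy as np

def normalize_screen_coordinates(X, w, h):
    """Map pixel coordinates to roughly [-1, 1] with the image center at (0, 0).

    X: (..., 2) array of pixel coordinates; w, h: image width and height.
    The x axis spans exactly [-1, 1]; y is divided by the same factor so
    the aspect ratio is preserved.
    """
    assert X.shape[-1] == 2
    return X / w * 2.0 - np.array([1.0, h / w], dtype=np.float32)
```

With this convention, a 2D joint at the exact image center maps to (0, 0), matching the note above.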

Best, Long

qizhen816 commented 4 years ago

Thanks @garyzhao for answering in such a short time! It turns out the bad 3D predictions were caused by my self-written model-loading code. After I added strict=True to load_state_dict, it turned out that all the network weights had been missing, so the model was always giving random results. :sweat_smile::sweat_smile::sweat_smile: I'm such a silly coder!
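For anyone hitting the same pitfall: with `strict=False`, `load_state_dict` silently skips any keys that don't match and leaves the model's random initialization in place, while `strict=True` (the default) raises immediately. A small sketch with a deliberately mismatched checkpoint (the key names here are hypothetical):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
# A checkpoint whose keys don't match the model (hypothetical example:
# the weights were saved under a "net." prefix).
bad_state = {"net.weight": torch.zeros(2, 4), "net.bias": torch.zeros(2)}

# strict=False loads without complaint, leaving the model's random
# initialization untouched; the mismatch is only visible in the return value.
result = model.load_state_dict(bad_state, strict=False)
print(result.missing_keys)     # ['weight', 'bias']
print(result.unexpected_keys)  # ['net.weight', 'net.bias']

# strict=True raises a RuntimeError instead, which makes the key
# mismatch impossible to miss.
try:
    model.load_state_dict(bad_state, strict=True)
except RuntimeError:
    print("load failed: keys do not match")
```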

garyzhao commented 4 years ago

Never mind. Glad to know that you solved it. :)