Nicholasli1995 / EvoSkeleton

Official project website for the CVPR 2020 paper (Oral Presentation) "Cascaded deep monocular 3D human pose estimation with evolutionary training data"
https://arxiv.org/abs/2006.07778
MIT License

What is S15678 #14

Closed: ben-xD closed this issue 4 years ago

ben-xD commented 4 years ago

Thanks for your nice work; the paper is a great read.

I do not understand this: "We use S15678 as our initial population". What is S15678?

ben-xD commented 4 years ago

Could you also explain the camera intrinsics term, $Kp_j$? How does multiplying the 3D pose ($p_j$) by K give you the 2D coordinates? I know you mean a 3D→2D projection, but I don't see how this multiplication achieves that.
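
In case it helps pin down what I'm asking, here is how I currently picture the projection, as a minimal NumPy sketch with made-up intrinsic values (nothing here is taken from this repo). My understanding is that $Kp_j$ gives homogeneous pixel coordinates, and the 2D point only appears after dividing by the depth component, which the shorthand leaves implicit:

    import numpy as np

    # Hypothetical pinhole intrinsics (the numbers are made up, not H36M values):
    # fx, fy are focal lengths in pixels; (cx, cy) is the principal point.
    fx, fy, cx, cy = 1145.0, 1144.0, 512.0, 515.0
    K = np.array([[fx, 0.0, cx],
                  [0.0, fy, cy],
                  [0.0, 0.0, 1.0]])

    # p_j: one 3D joint position in the camera coordinate frame (also made up).
    p_j = np.array([0.3, -0.2, 4.5])

    # K @ p_j yields homogeneous image coordinates; the 2D pixel location
    # comes from dividing by the depth (last) component.
    uvw = K @ p_j
    u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]
    print(u, v)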

ben-xD commented 4 years ago

From the h36m data fetcher code:

    Params:
        subject: The subject number in 1, 5, 6, 7, 8, 9 and 11.
        action: The action number in the range between 1 and 16.
        sub_action: The sub-action number 1 or 2.
        camera: The camera number in the range between 1 and 4.

So I don't see how S15678 actually works here.

ben-xD commented 4 years ago

In the paper that you said you followed (Generating Multiple Hypotheses for 3D Human Pose Estimation with Mixture Density Network), they mention:

> For the Human3.6M dataset, we follow the standard protocol of using S1, S5, S6, S7 and S8 for training, and S9 and S11 for testing

Therefore, now I know you just meant the combination of S1, S5, S6, S7 and S8. This was a little challenging to decipher, as not everyone has access to Human3.6M; the authors have made it very difficult to obtain the dataset. If anyone has it, please let me know.
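
In code terms, I now read "S15678" as nothing more than the list of training-subject IDs under that protocol. A small illustrative snippet (not code from this repository; the names are made up):

    # Illustrative only: the standard Human3.6M protocol that "S15678" abbreviates.
    TRAIN_SUBJECTS = [1, 5, 6, 7, 8]   # "S15678" = S1, S5, S6, S7, S8
    TEST_SUBJECTS = [9, 11]            # S9 and S11 are held out for testing

    def split(subject):
        """Return the split a Human3.6M subject falls into under this protocol."""
        if subject in TRAIN_SUBJECTS:
            return "train"
        if subject in TEST_SUBJECTS:
            return "test"
        raise ValueError("S%d is not used in the standard protocol" % subject)

    # The `subject` argument of the fetcher above takes values 1, 5, 6, 7, 8, 9, 11:
    print([(s, split(s)) for s in [1, 5, 6, 7, 8, 9, 11]])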