Closed ben-xD closed 4 years ago
Could you also explain the camera intrinsics term, $Kp_j$? How does multiplying the 3D pose ($p_j$) by $K$ give you 2D coordinates? I know you mean you're doing a 3D→2D projection, but I don't see how this multiplication achieves it.
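For context, here is a minimal sketch of the standard pinhole projection I assume the paper means. The intrinsic values below are made up for illustration (they are not Human3.6M's actual calibration): multiplying by $K$ gives homogeneous image coordinates, and dividing by the depth component yields pixels.

```python
import numpy as np

# Hypothetical intrinsics for illustration only:
# focal lengths fx, fy and principal point cx, cy.
K = np.array([[1145.0,    0.0, 512.0],
              [   0.0, 1143.0, 515.0],
              [   0.0,    0.0,   1.0]])

# A 3D joint position p_j in camera coordinates (metres).
p_j = np.array([0.3, -0.2, 4.0])

# K @ p_j gives homogeneous image coordinates [u*z, v*z, z];
# dividing by the third component (the depth z) gives pixels.
uv_h = K @ p_j
uv = uv_h[:2] / uv_h[2]
print(uv)  # 2D pixel coordinates of the joint
```

So $Kp_j$ alone is not yet the 2D point; the perspective divide by the depth is the missing step that makes the projection work.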
From an h36m data fetcher's code:
```
Params:
    subject: The subject number, one of 1, 5, 6, 7, 8, 9 and 11.
    action: The action number, in the range 1 to 16.
    sub_action: The sub-action number, 1 or 2.
    camera: The camera number, in the range 1 to 4.
```
So I don't see how S15678 actually works here.
In the paper that you said you followed (Generating Multiple Hypotheses for 3D Human Pose Estimation with Mixture Density Network), they mention:

> For the Human3.6M dataset, we follow the standard protocol of using S1, S5, S6, S7 and S8 for training, and S9 and S11 for testing
Therefore, I now know you just meant the combination of subjects S1, S5, S6, S7 and S8. This was a little challenging to decipher, as not everyone has access to Human3.6M; the dataset's owners have made it very difficult to obtain. If anyone has this dataset, please let me know.
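In other words, I read "S15678" as nothing more than the standard training split by subject ID. A sketch of how such a split might be applied to the fetcher's parameters above (the tuple layout is my assumption, not the repo's actual data structure):

```python
# "S15678" = the standard Human3.6M training subjects; "S911" = test subjects.
TRAIN_SUBJECTS = [1, 5, 6, 7, 8]   # S15678
TEST_SUBJECTS = [9, 11]            # S911

def split(sequences):
    # sequences: hypothetical list of (subject, action, sub_action, camera)
    # tuples, mirroring the fetcher's parameters.
    train = [s for s in sequences if s[0] in TRAIN_SUBJECTS]
    test = [s for s in sequences if s[0] in TEST_SUBJECTS]
    return train, test

train, test = split([(1, 2, 1, 1), (9, 2, 1, 1), (5, 3, 2, 4), (11, 1, 1, 2)])
```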
Thanks for your nice work; the paper is a nice read.
I do not understand this: "We use S15678 as our initial population". What is S15678?