anibali / h36m-fetch

Human 3.6M 3D human pose dataset fetcher
Apache License 2.0

What's the difference between Poses_D3_Positions_mono_universal and Poses_D3_Positions_mono_universal? #16

Closed by bucktoothsir 4 years ago

bucktoothsir commented 4 years ago

It seems that they store the same values.

anibali commented 4 years ago

You just wrote the same thing twice.

manurare commented 4 years ago

I guess he meant the difference between Poses_D3_Positions_mono and Poses_D3_Positions_mono_universal. Also, what is the difference between those two and Poses_D3_Positions?

anibali commented 4 years ago

From their README:

The parametrizations we provide are 3D positions in the original coordinate space (D3_Positions) and transformed for monocular prediction using the camera parameters (D3_Positions_mono). We also provide 3D Angles for monocular prediction (D3_Angles_mono) and projections of the skeleton onto the image plane (D2_Positions). Lastly we provide 3D positions using the same limb lengths for all subjects (D3_Positions_mono_universal) as a 3D position parametrization that is invariant to subject size. The skeleton information is provided in the metadata.xml file that is delivered with our code.

So D3_Positions is in world space, D3_Positions_mono is in camera space (origin == camera position), D3_Positions_mono_universal has poses scaled to a universal size.
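As a rough illustration of the world space versus camera space distinction, here is a minimal sketch of the transform. It assumes the convention `X_cam = R (X_world - C)`, where `C` is the camera position in world space; the function name, the identity rotation, and the toy coordinates are all hypothetical, so check them against the actual camera parameters before relying on this.

```python
def world_to_camera(points_world, rotation, camera_position):
    """Map 3D points from world space to camera space.

    Assumed convention (not confirmed by the dataset README):
    X_cam = R (X_world - C), with C the camera position in world space.
    """
    points_cam = []
    for p in points_world:
        # Shift so that the camera position becomes the origin.
        d = [p[i] - camera_position[i] for i in range(3)]
        # Rotate into the camera's axes (3x3 matrix times vector).
        points_cam.append(
            [sum(rotation[r][c] * d[c] for c in range(3)) for r in range(3)]
        )
    return points_cam


# Toy example: identity rotation, made-up camera position and joints.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
camera_position = [0.0, 0.0, -1000.0]
joints_world = [[0.0, 0.0, 0.0], [100.0, 200.0, 300.0]]
joints_cam = world_to_camera(joints_world, identity, camera_position)
```

With this convention, a point located exactly at the camera position maps to the origin, which matches "origin == camera position".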

EveningLin commented 1 year ago

@anibali I guess they may be in different coordinate systems. Is D3_Positions in the world coordinate system, and D3_Positions_mono in the camera coordinate system?

anibali commented 1 year ago

@EveningLin Yes

poincarelee commented 1 year ago

@anibali I still do not understand D3_Positions_mono_universal. Can you kindly explain "using the same limb lengths for all subjects"? And in what situation would it be used?

anibali commented 1 year ago

It means that the skeletons are "normalised" to a common scale (e.g., a tall person and short person will have similar limb lengths). It's up to you whether this is something you want to use.
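To make the idea concrete, here is a minimal sketch of one way to give every skeleton the same limb lengths: keep each bone's direction and replace its length with a shared target value. This is only an illustration of the concept, not necessarily how the dataset computes its universal poses; the function name, the three-joint chain, and the target lengths are hypothetical.

```python
import math


def retarget_limb_lengths(joints, parents, target_lengths):
    """Rebuild a pose so each bone has a fixed length but keeps its direction.

    joints: list of [x, y, z] positions.
    parents: parents[i] is the parent joint index of joint i, or -1 for the
        root. Parents are assumed to be listed before their children.
    target_lengths: target_lengths[i] is the desired length of the bone from
        parents[i] to joint i (ignored for the root).
    """
    out = [None] * len(joints)
    for i, parent in enumerate(parents):
        if parent < 0:
            out[i] = list(joints[i])  # The root joint stays where it is.
            continue
        bone = [joints[i][k] - joints[parent][k] for k in range(3)]
        length = math.sqrt(sum(b * b for b in bone))
        scale = target_lengths[i] / length if length > 0 else 0.0
        # Attach the rescaled bone to the already-retargeted parent.
        out[i] = [out[parent][k] + bone[k] * scale for k in range(3)]
    return out


# Toy chain (hip -> knee -> ankle) for a taller and a shorter subject.
parents = [-1, 0, 1]
tall = [[0.0, 0.0, 0.0], [0.0, 500.0, 0.0], [0.0, 1000.0, 0.0]]
short = [[0.0, 0.0, 0.0], [0.0, 400.0, 0.0], [0.0, 800.0, 0.0]]
common = [0.0, 450.0, 450.0]

tall_universal = retarget_limb_lengths(tall, parents, common)
short_universal = retarget_limb_lengths(short, parents, common)
```

Both subjects hold the same pose here, so after retargeting they end up with matching joint positions even though their original limb lengths differ.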

poincarelee commented 1 year ago

Thanks a lot for your quick reply.
"a tall person and short person will have similar limb lengths": wouldn't this cause problems? Does this mean the tall person is normalized to the common scale, with the other body parts also normalized proportionally? Can I think of it as "clipping" a person into a common range? I still can't figure out a situation in which it would be used. I am confused because I have seen some pose estimation models use Poses_D3_Positions_mono_universal instead of Poses_D3_Positions or Poses_D3_Positions_mono.

anibali commented 1 year ago

It depends on what you are trying to achieve. If you are just trying to recover the configuration of joints to e.g. recognise an action, then you might not care about scale differences and could therefore use Poses_D3_Positions_mono_universal to simplify the problem. The scale/depth ambiguity problem is tricky, and not everyone is interested in solving it when doing pose estimation.
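The scale/depth ambiguity can be seen with a simple pinhole projection: a skeleton that is twice as large and twice as far from the camera lands on the same image coordinates. The sketch below assumes an idealised pinhole camera with no distortion and the principal point at the origin; the function name, focal length, and coordinates are made up for illustration.

```python
def project(points_cam, focal_length):
    """Project camera-space 3D points onto the image plane (ideal pinhole)."""
    return [[focal_length * x / z, focal_length * y / z] for x, y, z in points_cam]


# A toy two-joint "pose" in camera space, and the same pose scaled by 2
# about the camera origin (twice the size, twice the distance).
pose = [[0.0, 0.0, 4000.0], [200.0, -500.0, 4100.0]]
scaled_pose = [[2.0 * c for c in p] for p in pose]

focal_length = 1000.0
image_points = project(pose, focal_length)
image_points_scaled = project(scaled_pose, focal_length)
```

The two projections coincide, so a single image cannot distinguish a small, near person from a large, far one without extra information. Using size-normalised poses sidesteps that part of the problem.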

poincarelee commented 1 year ago

You are right, it does make sense to ignore scale when considering pose estimation. I think I understand it better now. Thanks. One more question: is Poses_D3_Positions_mono_universal also in the camera coordinate system, like Poses_D3_Positions_mono, or is it in the world coordinate system?

anibali commented 1 year ago

"mono" in the name means camera coordinate system (i.e., monocular pose estimation).