Walter0807 / MotionBERT

[ICCV 2023] PyTorch Implementation of "MotionBERT: A Unified Perspective on Learning Human Motion Representations"
Apache License 2.0

What is 2.5d_factor in preprocessed H3.6M data? #51

Closed LareinaM closed 1 year ago

LareinaM commented 1 year ago

Hi, I looked through LCN and all the data processing scripts, but none of them mention any 2.5d-related fields. How is 2.5d_factor in the pkl file calculated?

Walter0807 commented 1 year ago

I think it's in Section 6.2.2 of their T-PAMI version. Basically, it's just a normalizing factor between the pixel coordinates and the world coordinates.

LareinaM commented 1 year ago

Hi @Walter0807, I read through that section, but it actually explains your DataReaderH36M. I wonder how to obtain that factor, since it does not appear in any of the code scripts. Thanks!

Walter0807 commented 1 year ago

Sorry, it should be in Section 6.2.1 (LCN, T-PAMI 2020).

[image: excerpt from Section 6.2.1 of the LCN paper]

The 2.5d factor is the $\lambda$ here.

LareinaM commented 1 year ago

Thanks for your reply! Is there any code related to this part? It's very hard to infer on one's own.

Walter0807 commented 1 year ago

Sorry, I did not implement this part; the processed data came from LCN. If you have the original H36M data, you can estimate the corresponding $\lambda$ as described.

LareinaM commented 1 year ago

Thanks for replying! As I understand it, in LCN's data processing script joint_3d_image is P_p and joint_3d_camera is P_c; both should be root-centered and then used to calculate λ. When I tried this approach, the result did not match the 2.5d_factor.
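
For reference, one way to read the approach described here is as a one-parameter least-squares fit. The sketch below is not taken from LCN or MotionBERT: `estimate_lambda` is a hypothetical helper, and fitting only the x and y coordinates of the root-centered poses is an assumption. It returns the λ that minimizes the squared difference between λ·P_c and P_p.

```python
def estimate_lambda(p_c, p_p, root=0):
    """Least-squares scale lam such that lam * P_c ~ P_p.

    p_c, p_p: sequences of (x, y, z) joints in camera and pixel coordinates.
    Assumption: both poses are root-centered and only x/y enter the fit.
    """
    rx_c, ry_c = p_c[root][0], p_c[root][1]
    rx_p, ry_p = p_p[root][0], p_p[root][1]
    num = 0.0  # accumulates <P_c, P_p> over root-centered x/y
    den = 0.0  # accumulates <P_c, P_c> over root-centered x/y
    for joint_c, joint_p in zip(p_c, p_p):
        cx, cy = joint_c[0] - rx_c, joint_c[1] - ry_c
        px, py = joint_p[0] - rx_p, joint_p[1] - ry_p
        num += cx * px + cy * py
        den += cx * cx + cy * cy
    # Closed-form minimizer of sum ||lam * c - p||^2.
    return num / den


# Toy pose (made-up numbers, not H36M data): pixel x/y are camera x/y
# scaled by 0.2 and shifted by an arbitrary image offset.
toy_p_c = [(0.0, 0.0, 5000.0), (100.0, -200.0, 5100.0), (-150.0, 300.0, 4900.0)]
toy_p_p = [(500.0 + 0.2 * x, 400.0 + 0.2 * y, 0.0) for x, y, _ in toy_p_c]
toy_lambda = estimate_lambda(toy_p_c, toy_p_p)
```

Swapping the two arguments fits the scale in the opposite direction (pixel to camera units), so the order of the inputs determines which of the two scales comes out.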

Walter0807 commented 1 year ago

Is the result close or not? I did not go into the code details of LCN; perhaps you can open an issue there.

LareinaM commented 1 year ago

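I'm very sorry to bother you like this. I'm asking here only because I found that their repo never answers issues. If I minimize according to the understanding above, λ comes out at around 0.2 (it should actually be around 4.6), so I assumed my understanding was wrong.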

Walter0807 commented 1 year ago

Hi, sorry for the confusion. I consulted the LCN authors and here is the update:

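I think the way you compute the minimization may be slightly different; you could check the released implementation to see whether it helps. Also, could the ratio be taken in the opposite order? I noticed that the two scales may be reciprocals of each other.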

Walter0807 commented 1 year ago

My fault. λ (world to pixel) should be the reciprocal of 2.5d_factor (pixel to world).
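
A minimal sketch of that reciprocal relation, using the approximate values reported earlier in this thread (λ around 0.2, 2.5d_factor around 4.6). The helper names `lambda_to_factor` and `pixels_to_world` are hypothetical, and scaling a root-relative pixel-space pose by the factor is an assumption about how such a factor would be used, not something confirmed here.

```python
def lambda_to_factor(lam):
    """Convert lambda (world to pixel) into a pixel-to-world factor."""
    return 1.0 / lam


def pixels_to_world(pose_px, factor, root=0):
    """Scale a root-relative pixel-space pose into world units.

    Assumption: the factor is applied uniformly to all three coordinates
    after subtracting the root joint.
    """
    root_joint = pose_px[root]
    return [
        tuple((value - origin) * factor for value, origin in zip(joint, root_joint))
        for joint in pose_px
    ]


# The thread's numbers are consistent with each other: 1 / 4.6 is about 0.217,
# which matches the "around 0.2" obtained from the minimization.
approx_lambda = 1.0 / 4.6
```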

LareinaM commented 1 year ago

Thank you so much for your help! I figured it out now.