TaatiTeam / MotionAGFormer

Official implementation of the paper "MotionAGFormer: Enhancing 3D Pose Estimation with a Transformer-GCNFormer Network" (WACV 2024).
Apache License 2.0

Questions on Training #12

Closed · liuxing007 closed this issue 9 months ago

liuxing007 commented 9 months ago

I appreciate your great work! I have some questions on training and would appreciate your help. Thanks!

  1. Could you share guidance (or code) on how to obtain and preprocess the 2D ground truth data for Human3.6M?
  2. How do you calculate MACs/frame and the parameter count? My email: liuxing@sz.pku.edu.cn
SoroushMehraban commented 9 months ago

Thanks @liuxing007

  1. The proper way to get it is to use the camera intrinsic parameters to project the 3D camera coordinates to 2D pixel coordinates, as done here (see the projection sketch after this list).
    • For this project, though, we followed what MotionBERT did and simply took the (X, Y) from the (X, Y, Z) as the 2D ground truth (you can see it here).
    • The reason this is fine is that MotionAGFormer takes a normalized 2D pose sequence, in the range [-1, 1], as input. The 3D pose sequence is already normalized (and rescaled to the same scale as the 2D pose, as I explained here), so dropping Z is equivalent to projecting to a 2D pose sequence in the standard way and then normalizing it.
  2. For the parameter count I used the function implemented here. For MACs/frame you can use a library such as torchprofile, which computes the MACs for the whole model; then simply divide that number by the number of output frames to get MACs/frame (see the sketch after this list).
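
For reference on point 1, here is a minimal sketch of the standard intrinsics-based projection, assuming joints of shape (T, J, 3) in camera coordinates and a pinhole camera with focal lengths (fx, fy) and principal point (cx, cy). The function name and shapes are illustrative, and Human3.6M's official projection additionally applies lens distortion, which is omitted here:

```python
import numpy as np

def project_to_2d(pose_3d, f, c):
    """Pinhole projection of camera-space joints to pixel coordinates.

    pose_3d: (T, J, 3) joints in camera coordinates.
    f:       (2,) focal lengths (fx, fy) in pixels.
    c:       (2,) principal point (cx, cy) in pixels.
    Returns: (T, J, 2) pixel coordinates.
    """
    # Perspective division (x/z, y/z), then scale by the focal lengths
    # and shift by the principal point. Lens distortion is ignored here.
    xy = pose_3d[..., :2] / pose_3d[..., 2:3]
    return xy * f + c

# The MotionBERT-style shortcut used in this repo instead keeps (X, Y)
# of the already-normalized 3D pose directly:
# pose_2d = pose_3d[..., :2]
```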
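
And for point 2, a rough sketch of the measurement itself. The import path and constructor arguments below are assumptions (check the repo's training/demo scripts for the exact ones), and the input shape should match your actual config:

```python
import torch
from torchprofile import profile_macs

from model.MotionAGFormer import MotionAGFormer  # import path is an assumption

model = MotionAGFormer(n_layers=16, dim_in=3, dim_feat=128)  # hypothetical config
model.eval()

# (batch, frames, joints, channels) -- shape is an assumption
dummy = torch.randn(1, 243, 17, 3)
n_frames = dummy.shape[1]

# Total trainable parameters
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

# MACs for the whole clip, divided by the number of output frames
with torch.no_grad():
    macs = profile_macs(model, dummy)

print(f"Params: {n_params / 1e6:.2f}M | MACs/frame: {macs / n_frames / 1e6:.2f}M")
```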