Closed by joonjeon 1 year ago
Hi, we (as well as many other works) use 2D pose estimation results for 3D pose estimation. If you want to estimate 3D pose from images directly, you can check the following works:
Hi! @Walter0807 Thanks for your great work and your detailed explanation above! But I'm still confused about the evaluation results of the hybrid approaches in Table 3, such as "MotionBERT" + "SPIN", "MAED", or "HybrIK". These settings take videos as input to obtain shape information, but the ground-truth 2D keypoints of the test set are still needed for pose estimation, aren't they?
I was able to successfully measure quality-related metrics after setting up as shown here: https://github.com/Walter0807/MotionBERT/blob/main/docs/pose3d.md#running
However, this raises one question: if MotionBERT requires off-the-shelf 2D pose estimation results before DSTFormer can be applied for 2D-to-3D lifting, how can the evaluation procedure compute MotionBERT's quality-related metrics when it does not explicitly take 2D pose inputs?
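For context on what "quality-related metrics" usually means here: 3D pose lifting is commonly scored with MPJPE (mean per-joint position error), the average Euclidean distance between predicted and ground-truth 3D joints. Below is a minimal sketch of that metric, assuming NumPy arrays of shape (frames, joints, 3) in the same units and coordinate frame. The function name `mpjpe` and the array layout are illustrative assumptions, not MotionBERT's actual evaluation code.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error between two 3D pose sequences.

    Hypothetical helper for illustration only.
    pred, gt: array-likes of shape (frames, joints, 3), same units.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if pred.shape != gt.shape:
        raise ValueError(f"shape mismatch: {pred.shape} vs {gt.shape}")
    # Euclidean distance per joint, then the mean over all joints and frames.
    per_joint_error = np.linalg.norm(pred - gt, axis=-1)
    return float(per_joint_error.mean())


if __name__ == "__main__":
    # Toy data: 2 frames, 3 joints, every joint displaced by (3, 4, 0).
    gt = np.zeros((2, 3, 3))
    pred = gt + np.array([3.0, 4.0, 0.0])
    print(mpjpe(pred, gt))
```

Note that this metric only compares 3D outputs with 3D ground truth. Whatever 2D poses feed the lifting model still have to come from somewhere upstream of this step.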