Evaluating H3.6M using AlphaPose 2D keypoints

Hi. I'm trying to evaluate GAST-Net (whose H3.6M evaluation is based on the VideoPose3D code) on H3.6M using AlphaPose 2D keypoints instead of the ones projected from the 3D points.

I obtain the following really bad results:

----SittingDown----
Test time augmentation: True
Protocol #1 Error (MPJPE): 109.81539394582035 mm
Protocol #2 Error (P-MPJPE): 81.5790513476444 mm
----------
----WalkDog----
Test time augmentation: True
Protocol #1 Error (MPJPE): 85.18789114739596 mm
Protocol #2 Error (P-MPJPE): 63.32294484642188 mm
----------
----Phoning----
Test time augmentation: True
Protocol #1 Error (MPJPE): 94.32172714396947 mm
Protocol #2 Error (P-MPJPE): 75.2252612896446 mm
----------
----Eating----
Test time augmentation: True
Protocol #1 Error (MPJPE): 79.03888174284852 mm
Protocol #2 Error (P-MPJPE): 62.012080773606854 mm
----------
----Posing----
Test time augmentation: True
Protocol #1 Error (MPJPE): 85.71774713704517 mm
Protocol #2 Error (P-MPJPE): 59.543910476567255 mm
----------
----Waiting----
Test time augmentation: True
Protocol #1 Error (MPJPE): 79.52275181767241 mm
Protocol #2 Error (P-MPJPE): 58.82060956531044 mm
----------
----Photo----
Test time augmentation: True
Protocol #1 Error (MPJPE): 112.12336735211704 mm
Protocol #2 Error (P-MPJPE): 68.1798894350466 mm
----------
----Purchases----
Test time augmentation: True
Protocol #1 Error (MPJPE): 98.82641296185878 mm
Protocol #2 Error (P-MPJPE): 61.21304923789981 mm
----------
----Discussion----
Test time augmentation: True
Protocol #1 Error (MPJPE): 86.91998937275055 mm
Protocol #2 Error (P-MPJPE): 64.81118649652308 mm
----------
----Greeting----
Test time augmentation: True
Protocol #1 Error (MPJPE): 90.65123006594138 mm
Protocol #2 Error (P-MPJPE): 65.29128145077685 mm
----------
----Directions----
Test time augmentation: True
Protocol #1 Error (MPJPE): 81.82941271146767 mm
Protocol #2 Error (P-MPJPE): 61.55316470919744 mm
----------
----Sitting----
Test time augmentation: True
Protocol #1 Error (MPJPE): 109.51219947965483 mm
Protocol #2 Error (P-MPJPE): 85.26228932241311 mm
----------
----WalkTogether----
Test time augmentation: True
Protocol #1 Error (MPJPE): 75.70131148890233 mm
Protocol #2 Error (P-MPJPE): 56.5659265036648 mm
----------
----Smoking----
Test time augmentation: True
Protocol #1 Error (MPJPE): 87.77516446494707 mm
Protocol #2 Error (P-MPJPE): 66.87202602592835 mm
----------
----Walking----
Test time augmentation: True
Protocol #1 Error (MPJPE): 71.56626386441941 mm
Protocol #2 Error (P-MPJPE): 55.35812360586239 mm
----------
Protocol #1   (MPJPE) action-wise average: 89.9 mm
Protocol #2 (P-MPJPE) action-wise average: 65.7 mm

I took the AlphaPose 2D keypoints, arranged them in the required structure (subjects-actions-cameras), and converted them from COCO to H3.6M format. Finally, I fed them into the evaluation script. When I compare the AlphaPose 2D keypoints with the ones obtained by projecting the 3D points, I get a mean difference of 5-6, so I think the problem lies in the prediction of the depth dimension. Am I doing something wrong? What should I try in order to get better results?
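A single mean difference can hide a few joints that are far off, for example joints that have no direct COCO counterpart and have to be synthesized during the COCO to H3.6M conversion. One way to check this is to break the 2D difference down per joint. The following is only a minimal sketch, assuming both keypoint sets are NumPy arrays of shape (frames, joints, 2) with the same joint order and the same units; the function name `per_joint_2d_error` and the synthetic data are hypothetical and not part of VideoPose3D, GAST-Net, or AlphaPose.

```python
import numpy as np


def per_joint_2d_error(detected, projected):
    """Mean Euclidean distance per joint between two 2D keypoint sets.

    Assumption: both inputs have shape (frames, joints, 2), use the same
    joint order, and are expressed in the same units.
    """
    detected = np.asarray(detected, dtype=np.float64)
    projected = np.asarray(projected, dtype=np.float64)
    if detected.shape != projected.shape:
        raise ValueError(
            f"shape mismatch: {detected.shape} vs {projected.shape}"
        )
    # Distance for every (frame, joint) pair, then average over frames.
    dist = np.linalg.norm(detected - projected, axis=-1)
    return dist.mean(axis=0)


if __name__ == "__main__":
    # Synthetic stand-in data; replace with the real keypoint arrays.
    rng = np.random.default_rng(0)
    projected = rng.uniform(0.0, 1000.0, size=(50, 17, 2))
    detected = projected.copy()
    detected[:, 0] += [30.0, 40.0]  # simulate one badly offset joint

    errors = per_joint_2d_error(detected, projected)
    worst = int(np.argmax(errors))
    print(f"overall mean: {errors.mean():.2f}")
    print(f"worst joint index: {worst}, error: {errors[worst]:.2f}")
```

If one or two joints dominate the error, the conversion step is a more likely culprit than the depth prediction; if the error is spread evenly, the detector noise itself may be what the 3D model is sensitive to.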

facebookresearch / VideoPose3D, issue #189