Vegetebird / StridedTransformer-Pose3D

[TMM 2022] Exploiting Temporal Contexts with Strided Transformer for 3D Human Pose Estimation

How to use the refine network in inference? #3

Closed: tempstudio closed this issue 2 years ago

tempstudio commented 2 years ago

The testing code uses the ground-truth 3D poses as well as the camera parameters in the refine step. I got the normal (non-refine) step working, but for an in-the-wild input, what should I feed into the refine network?

Vegetebird commented 2 years ago

Hi~ With an in-the-wild input, you can predict results without the refine network, since camera parameters are not available.
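For an in-the-wild clip, inference then reduces to running the lifting model on the estimated 2D keypoints and skipping the refine module. A minimal sketch is below; `StridedTransformer`, the checkpoint path, and the tensor shapes are illustrative placeholders, not the repo's exact API.

```python
import torch

# Illustrative sketch only: StridedTransformer and the checkpoint path are
# placeholders standing in for the repo's lifting model and pretrained weights.
keypoints_2d = torch.randn(1, 351, 17, 2)   # (batch, frames, joints, xy) from a 2D detector

model = StridedTransformer()                                    # hypothetical constructor
model.load_state_dict(torch.load('checkpoint/pretrained.pth'))  # hypothetical path
model.eval()

with torch.no_grad():
    # The lifting network only needs the 2D keypoint sequence. The refine module is
    # skipped entirely because it reprojects the 3D prediction to the image plane,
    # which requires camera parameters that a wild video does not provide.
    pred_3d = model(keypoints_2d)            # e.g. (batch, 1, 17, 3): center-frame 3D pose
```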

Vegetebird commented 2 years ago

Hi~ @tempstudio We have released the demo and in-the-wild inference code. Please download the latest repo and refer to the demo.

speed8928 commented 2 years ago

The results reported in the paper (P1: 43.7, P2: 36.8) are only achieved by loading the refine model, which takes camera parameters as input, whereas the other state-of-the-art methods do not use this additional information. Can you please provide code that works without camera parameter inputs, so that the comparison to others is fairer?

Vegetebird commented 2 years ago

The repo and training strategy follow [1, 2] (ST-GCN, MGCN). You can use `python main.py` to train a model without camera parameters; this does not use the refine model.

[1] Cai et al. Exploiting Spatial-Temporal Relationships for 3D Pose Estimation via Graph Convolutional Networks. ICCV 2019.
[2] Zou et al. Modulated Graph Convolutional Network for 3D Human Pose Estimation. ICCV 2021.
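For reference, training without the refine model supervises the two Transformer outputs directly, as described in the paper: a full-sequence loss on the first encoder's output and a single-frame loss on the strided encoder's output. A rough sketch of that objective, with made-up tensor names and shapes, might look like:

```python
import torch

def mpjpe(pred, gt):
    # Mean per-joint position error: Euclidean distance between predicted and
    # ground-truth joints, averaged over joints (and frames).
    return torch.mean(torch.norm(pred - gt, dim=-1))

# Made-up tensors standing in for one training batch; no camera parameters
# and no refine module appear anywhere in this objective.
F = 351                                      # receptive field (number of input frames)
pred_full   = torch.randn(8, F, 17, 3)       # full-sequence output of the first encoder
pred_single = torch.randn(8, 1, 17, 3)       # center-frame output of the strided encoder
gt_seq      = torch.randn(8, F, 17, 3)       # ground-truth 3D poses for the sequence

center = F // 2
loss = mpjpe(pred_full, gt_seq) + mpjpe(pred_single, gt_seq[:, center:center + 1])
```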