google-deepmind / Temporal-3D-Pose-Kinetics

Exploiting temporal context for 3D human pose estimation in the wild: 3D poses for the Kinetics dataset
http://arxiv.org/abs/1905.04266
Apache License 2.0

Exploiting temporal context for 3D human pose estimation in the wild

Exploiting temporal context for 3D human pose estimation in the wild uses temporal information from videos to correct errors in single-image 3D pose estimation. In this repository, we provide the results of applying this algorithm to the Kinetics-400 dataset. Note that this is not an exhaustive labeling: at most one person is labeled per frame, and frames that the algorithm has identified as outliers are not labeled.

The archive contains a single .pkl file for each video where bundle adjustment succeeded. Let N be the number of frames that the algorithm considers inliers. Then the .pkl file contains a map with the following keys:

The dataset can be downloaded here (325 GB). A significantly smaller archive, which does not contain vertices but is otherwise identical, is available here (2.7 GB).
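Because each per-video result is a pickled map, a quick way to get oriented is to load one file and print what it holds. The following is a minimal sketch, not part of the released code: the helper names `load_pose_file` and `describe` are hypothetical, and the `latin1` fallback assumes the files may have been written under Python 2, which this README does not state.

```python
import pickle


def load_pose_file(path):
    """Load one per-video .pkl file and return the map it contains."""
    with open(path, "rb") as f:
        try:
            return pickle.load(f)
        except UnicodeDecodeError:
            # Assumption: pickles written under Python 2 may need this fallback.
            f.seek(0)
            return pickle.load(f, encoding="latin1")


def describe(data):
    """Return {key: array shape, or type name if the value has no shape}."""
    summary = {}
    for key, value in data.items():
        shape = getattr(value, "shape", None)
        summary[key] = shape if shape is not None else type(value).__name__
    return summary


if __name__ == "__main__":
    import sys

    # Usage: python inspect_pickle.py <path_to_downloaded_pickle_file>
    for key, info in describe(load_pose_file(sys.argv[1])).items():
        print(key, info)
```

As with any pickle, only load files obtained from a source you trust, since unpickling can execute arbitrary code.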

Joint regressor

We also provide a custom joint regressor that is specific to our pose estimator (since there are slight differences between the 2D joints we used for bundle adjustment and those used for SMPL). This is a 6890x19 array that can be used as a drop-in replacement for the cocoplus_regressor that is distributed in the public HMR repository, and is required to extract the 3d_keypoints above from the estimated poses. It was learned using ground truth from the Human3.6M dataset.
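To illustrate how a regressor of this shape is typically used, the sketch below treats it as a linear map from the 6890 SMPL mesh vertices to 19 joints. It is an illustration under stated assumptions rather than the authors' code: the function name `regress_joints` is hypothetical, and the vertex layout of (N, 6890, 3) is assumed here.

```python
import numpy as np


def regress_joints(vertices, regressor):
    """Map mesh vertices (N, 6890, 3) to joints (N, 19, 3).

    `regressor` is a (6890, 19) array: each column holds the per-vertex
    weights for one joint. The (N, 6890, 3) vertex layout is an assumption.
    """
    vertices = np.asarray(vertices)
    regressor = np.asarray(regressor)
    if vertices.shape[-2] != regressor.shape[0]:
        raise ValueError("vertex count does not match regressor rows")
    # Weighted sum over the vertex axis, applied independently to x, y and z.
    return np.einsum("nvc,vj->njc", vertices, regressor)
```

Each output joint is a weighted combination of vertex positions, so the result depends only on the mesh and the regressor weights.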

Pretrained Model

This TensorFlow checkpoint was trained using the procedure outlined in our paper. That is, it uses the above dataset as well as standard HMR 3D data. The checkpoint is compatible with HMR.

Visualising data

To run the demo:

python run_visualise --filename <path_to_downloaded_pickle_file>

Credits

Reference

If you use this data, please cite

@InProceedings{Arnab_CVPR_2019,
    author = {Arnab, Anurag* and 
              Doersch, Carl* and 
              Zisserman, Andrew},
    title = {Exploiting temporal context for 3D human pose estimation in the wild},
    booktitle = {Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2019}
}