This is the code for the paper
Julieta Martinez, Rayat Hossain, Javier Romero, James J. Little. A simple yet effective baseline for 3d human pose estimation. In ICCV, 2017. https://arxiv.org/pdf/1705.03098.pdf.
The code in this repository was mostly written by Julieta Martinez, Rayat Hossain and Javier Romero.
We provide a strong baseline for 3d human pose estimation that also sheds light on the challenges of current approaches. Our model is lightweight and we strive to make our code transparent, compact, and easy to understand.
Watch our video: https://youtu.be/Hmi3Pd9x1BE
Clone this repository
git clone https://github.com/una-dinosauria/3d-pose-baseline.git
cd 3d-pose-baseline
mkdir -p data/h36m/
Go to http://vision.imar.ro/human3.6m/, log in, and download the D3 Positions files for subjects [1, 5, 6, 7, 8, 9, 11], and put them under the folder data/h36m. Your directory structure should look like this:
src/
README.md
LICENCE
...
data/
  └── h36m/
    ├── Poses_D3_Positions_S1.tgz
    ├── Poses_D3_Positions_S11.tgz
    ├── Poses_D3_Positions_S5.tgz
    ├── Poses_D3_Positions_S6.tgz
    ├── Poses_D3_Positions_S7.tgz
    ├── Poses_D3_Positions_S8.tgz
    └── Poses_D3_Positions_S9.tgz
Now, move to the data folder, and uncompress all the data
cd data/h36m/
for file in *.tgz; do tar -xvzf "$file"; done
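If you would rather do this step from Python (for example on a system without tar), the loop above can be sketched with the standard tarfile module. The function name extract_archives is hypothetical and not part of this repository; the sketch assumes it is given the path to data/h36m/.

```python
import tarfile
from pathlib import Path


def extract_archives(data_dir):
    """Extract every .tgz archive found in data_dir into data_dir itself.

    Hypothetical helper, not part of this repository. Returns the names of
    the archives that were extracted.
    """
    data_dir = Path(data_dir)
    extracted = []
    for archive in sorted(data_dir.glob("*.tgz")):
        # "r:gz" opens a gzip-compressed tar archive for reading.
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(path=data_dir)
        extracted.append(archive.name)
    return extracted
```

Calling extract_archives("data/h36m") from the repository root should leave one S<subject>/ folder per archive, the same result as the shell loop.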
Finally, download the code-v1.2.zip file, unzip it, and copy the metadata.xml file under data/h36m/.
Now, your data directory should look like this:
data/
  └── h36m/
    ├── metadata.xml
    ├── S1/
    ├── S11/
    ├── S5/
    ├── S6/
    ├── S7/
    ├── S8/
    └── S9/
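A quick way to confirm the layout above before moving on is a small check like the following. This is only a sketch: layout_problems is a hypothetical helper, not something shipped in src/, and it only tests that the entries listed above exist, not that their contents are complete.

```python
from pathlib import Path

# Subjects named in the download step of this README.
SUBJECTS = [1, 5, 6, 7, 8, 9, 11]


def layout_problems(data_dir):
    """Return the expected entries missing from data_dir.

    Hypothetical helper. An empty list means the layout matches the tree
    shown in this README.
    """
    data_dir = Path(data_dir)
    problems = []
    if not (data_dir / "metadata.xml").is_file():
        problems.append("metadata.xml")
    for subject in SUBJECTS:
        if not (data_dir / f"S{subject}").is_dir():
            problems.append(f"S{subject}/")
    return problems
```

Running layout_problems("data/h36m") from the repository root should return an empty list once everything is in place.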
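There is one little fix we need to run for the data to have consistent names. The paths below are relative to the data/ folder, so move up one level (cd ..) first if you are still in data/h36m/: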
mv h36m/S1/MyPoseFeatures/D3_Positions/TakingPhoto.cdf \
h36m/S1/MyPoseFeatures/D3_Positions/Photo.cdf
mv h36m/S1/MyPoseFeatures/D3_Positions/TakingPhoto\ 1.cdf \
h36m/S1/MyPoseFeatures/D3_Positions/Photo\ 1.cdf
mv h36m/S1/MyPoseFeatures/D3_Positions/WalkingDog.cdf \
h36m/S1/MyPoseFeatures/D3_Positions/WalkDog.cdf
mv h36m/S1/MyPoseFeatures/D3_Positions/WalkingDog\ 1.cdf \
h36m/S1/MyPoseFeatures/D3_Positions/WalkDog\ 1.cdf
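The same four renames can be sketched in Python, which avoids escaping the spaces in the file names and makes the step safe to run twice. The helper name fix_s1_names is hypothetical; the old and new names are copied from the mv commands above.

```python
from pathlib import Path

# Old name -> new name, copied from the mv commands above.
RENAMES = {
    "TakingPhoto.cdf": "Photo.cdf",
    "TakingPhoto 1.cdf": "Photo 1.cdf",
    "WalkingDog.cdf": "WalkDog.cdf",
    "WalkingDog 1.cdf": "WalkDog 1.cdf",
}


def fix_s1_names(h36m_dir):
    """Rename the S1 files listed in RENAMES.

    Hypothetical helper. Files that are absent or already renamed are
    skipped, so calling it a second time does nothing. Returns the new
    names of the files that were actually renamed.
    """
    pose_dir = Path(h36m_dir) / "S1" / "MyPoseFeatures" / "D3_Positions"
    renamed = []
    for old, new in RENAMES.items():
        src, dst = pose_dir / old, pose_dir / new
        if src.is_file() and not dst.exists():
            src.rename(dst)
            renamed.append(new)
    return renamed
```

From the repository root this would be called as fix_s1_names("data/h36m").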
And you are done!
Please note that we no longer support Stacked Hourglass (SH) detections; only training from ground-truth (GT) 2d detections is currently possible.
For a quick demo, you can train for one epoch and visualize the results. To train, run
python src/predict_3dpose.py --camera_frame --residual --batch_norm --dropout 0.5 --max_norm --evaluateActionWise --epochs 1
This should take less than 5 minutes to complete on a GTX 1080, and give you around 56 mm of error on the test set.
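To make the error figure concrete: this README does not spell out the metric, so the sketch below assumes it is the average Euclidean distance, in millimeters, between predicted and ground-truth 3d joints. The function mean_joint_error and the toy poses are illustrative only and are not taken from src/.

```python
import math


def mean_joint_error(predicted, target):
    """Average Euclidean distance between corresponding 3d joints.

    Illustrative helper under the assumption stated above. Both arguments
    are lists of poses; each pose is a list of (x, y, z) joint positions
    in millimeters.
    """
    distances = [
        math.dist(joint_p, joint_t)
        for pose_p, pose_t in zip(predicted, target)
        for joint_p, joint_t in zip(pose_p, pose_t)
    ]
    return sum(distances) / len(distances)


# Toy example: one pose with two joints. The first joint is exact and the
# second is off by 50 mm, so the average error is 25 mm.
pred = [[(0.0, 0.0, 0.0), (30.0, 40.0, 0.0)]]
gt = [[(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]]
print(mean_joint_error(pred, gt))  # 25.0
```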
Now, to visualize the results, simply run
python src/predict_3dpose.py --camera_frame --residual --batch_norm --dropout 0.5 --max_norm --evaluateActionWise --epochs 1 --sample --load 24371
This will produce a visualization of the results.
To train a model with clean 2d detections, run:
python src/predict_3dpose.py --camera_frame --residual --batch_norm --dropout 0.5 --max_norm --evaluateActionWise
This corresponds to the bottom row of Table 2, Ours (GT detections) (MA).
If you use our code, please cite our work
@inproceedings{martinez_2017_3dbaseline,
title={A simple yet effective baseline for 3d human pose estimation},
author={Martinez, Julieta and Hossain, Rayat and Romero, Javier and Little, James J.},
booktitle={ICCV},
year={2017}
}
MIT