lmb-freiburg / hand3d

Network estimating 3D Handpose from single color images
GNU General Public License v2.0

ColorHandPose3D network

ColorHandPose3D is a convolutional neural network that estimates 3D hand pose from a single RGB image. See the project page for the dataset used and additional information.

Usage: Forward pass

The network ships with a minimal example that performs a forward pass and shows the predictions.

You can compare your results to the content of the folder "results", which shows the predictions we get on our system.

Recommended system

Recommended system (tested):

Python packages used by the example provided and their recommended version:

Preprocessing for training and evaluation

To use the training and evaluation scripts, you need to download and preprocess the datasets.

Rendered Hand Pose Dataset (RHD)

Stereo Tracking Benchmark Dataset (STB)

Network training

We provide scripts to train HandSegNet and PoseNet on the Rendered Hand Pose Dataset (RHD). If you want to retrain the networks on new data, you can adapt the provided code to your needs.

The following steps guide you through training HandSegNet and PoseNet on the Rendered Hand Pose Dataset (RHD).

You should be able to obtain results that roughly match the following numbers, which we obtained with TensorFlow v1.3:

eval2d_gt_cropped.py yields:

Evaluation results:
Average mean EPE: 7.630 pixels
Average median EPE: 3.939 pixels
Area under curve: 0.771

eval2d.py yields:

Evaluation results:
Average mean EPE: 15.469 pixels
Average median EPE: 4.374 pixels
Area under curve: 0.715

Because training is not a deterministic process, results will differ between runs. Note that these results are not listed in the paper.
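For readers unfamiliar with the three numbers reported above, the following is a minimal sketch of how such metrics can be computed from predicted and ground truth 2D keypoints. It is not taken from the repository's evaluation code. The function names, the threshold range of 0 to 30 pixels, the 20 threshold steps, and the choice to average per-keypoint values over all keypoints are assumptions made for illustration.

```python
import math
from statistics import median


def keypoint_errors(pred, gt):
    """Collect Euclidean end-point errors (EPE, in pixels) per keypoint.

    pred and gt are lists of samples; each sample is a list of (x, y) tuples.
    Returns one list of errors per keypoint index.
    """
    num_keypoints = len(gt[0])
    errors = [[] for _ in range(num_keypoints)]
    for pred_sample, gt_sample in zip(pred, gt):
        for k, ((px, py), (gx, gy)) in enumerate(zip(pred_sample, gt_sample)):
            errors[k].append(math.hypot(px - gx, py - gy))
    return errors


def pck_auc(errs, max_threshold=30.0, num_steps=20):
    """Normalized area under the PCK curve for one keypoint.

    PCK at threshold t is the fraction of errors <= t. The threshold range
    and step count are assumed values, not settings confirmed by the source.
    """
    thresholds = [max_threshold * i / (num_steps - 1) for i in range(num_steps)]
    pck = [sum(e <= t for e in errs) / len(errs) for t in thresholds]
    # Trapezoidal integration, normalized so a perfect curve gives 1.0.
    area = sum(
        (pck[i] + pck[i + 1]) / 2 * (thresholds[i + 1] - thresholds[i])
        for i in range(num_steps - 1)
    )
    return area / max_threshold


def summarize(pred, gt, max_threshold=30.0, num_steps=20):
    """Average the per-keypoint mean EPE, median EPE, and AUC over keypoints."""
    per_keypoint = keypoint_errors(pred, gt)
    n = len(per_keypoint)
    return {
        "mean_epe": sum(sum(e) / len(e) for e in per_keypoint) / n,
        "median_epe": sum(median(e) for e in per_keypoint) / n,
        "auc": sum(pck_auc(e, max_threshold, num_steps) for e in per_keypoint) / n,
    }
```

A large gap between mean and median EPE, as in the eval2d.py numbers above, typically indicates a small number of samples with very large errors, since the median is insensitive to such outliers.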

Evaluation

There are four scripts that evaluate different parts of the architecture:

  1. eval2d_gt_cropped.py: Evaluates PoseNet on 2D keypoint localization using ground truth annotation to create hand-cropped images (section 6.1, Table 1 of the paper)
  2. eval2d.py: Evaluates HandSegNet and PoseNet on 2D keypoint localization (section 6.1, Table 1 of the paper)
  3. eval3d.py: Evaluates different approaches on lifting 2D predictions into 3D (section 6.2.1, Table 2 of the paper)
  4. eval3d_full.py: Evaluates our full pipeline on 3D keypoint localization from RGB (section 6.2.1, Table 2 of the paper)

These scripts make it possible to reproduce the results from the paper that are based on RHD.
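To illustrate what creating a hand-cropped image from ground truth annotation can look like (the setting evaluated by eval2d_gt_cropped.py), here is a minimal sketch that derives a square crop box from annotated 2D keypoints. It is not the repository's cropping code. The function names, the margin factor, and the list-of-rows image representation are hypothetical choices for illustration.

```python
import math


def crop_box_from_keypoints(keypoints, image_width, image_height, margin=1.25):
    """Return (left, top, right, bottom) of a square box around the keypoints.

    keypoints is a list of (x, y) pixel coordinates. The box is centered on
    the keypoints' bounding box, enlarged by the assumed margin factor, and
    clipped to the image bounds (so it may stop being square at the border).
    """
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    center_x = (min(xs) + max(xs)) / 2
    center_y = (min(ys) + max(ys)) / 2
    # Use the longer side so the whole hand fits into a square crop.
    half = max(max(xs) - min(xs), max(ys) - min(ys)) * margin / 2
    left = max(0, math.floor(center_x - half))
    top = max(0, math.floor(center_y - half))
    right = min(image_width, math.ceil(center_x + half))
    bottom = min(image_height, math.ceil(center_y + half))
    return left, top, right, bottom


def crop(image_rows, box):
    """Crop an image stored as a list of pixel rows to the given box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image_rows[top:bottom]]
```

Comparing a crop like this against one derived from a predicted segmentation mask is one way to think about the difference between the first two evaluation scripts: the first isolates PoseNet's accuracy, while the second also includes localization errors from HandSegNet.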

License and Citation

This project is licensed under the terms of the GPL v2 license. By using the software, you are agreeing to the terms of the license agreement.

Please cite us in your publications if it helps your research:

@InProceedings{zb2017hand,
  author    = {Christian Zimmermann and Thomas Brox},
  title     = {Learning to Estimate 3D Hand Pose from Single RGB Images},
  booktitle = {IEEE International Conference on Computer Vision (ICCV)},
  year      = {2017},
  note      = {https://arxiv.org/abs/1705.01389},
  url       = {https://lmb.informatik.uni-freiburg.de/projects/hand3d/}
}

Known issues