A PyTorch re-implementation of V2V-PoseNet is available at dragonbook's repo.
Thanks to dragonbook for the re-implementation. Other versions of V2V-PoseNet are also welcome!
This is our project repository for the paper, V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map (CVPR 2018).
We, Team SNU CVLAB (Gyeongsik Moon, Juyong Chang, and Kyoung Mu Lee of the Computer Vision Lab, Seoul National University), are the winners of the HANDS2017 Challenge on frame-based 3D hand pose estimation.
Please refer to our paper for details.
If you find our work useful in your research or publication, please cite our work:
[1] Moon, Gyeongsik, Ju Yong Chang, and Kyoung Mu Lee. "V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map." CVPR 2018. [arXiv]
```
@InProceedings{Moon_2018_CVPR_V2V-PoseNet,
  author = {Moon, Gyeongsik and Chang, Juyong and Lee, Kyoung Mu},
  title = {V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2018}
}
```
In this repository, we provide our training code, precomputed centers, estimated 3D coordinates, and pre-trained models.
Our code is tested under Ubuntu 14.04 and 16.04 with Titan X GPUs (12GB VRAM).
Clone this repository into any place you want. You may follow the example below.

```bash
makeReposit=[/the/directory/as/you/wish]
mkdir -p $makeReposit/; cd $makeReposit/
git clone https://github.com/mks0601/V2V-PoseNet_RELEASE.git
```
- `src` folder contains lua script files for the data loader, trainer, tester, and other utilities.
- `data` folder contains the data converter, which converts image files to binary files.

To train our model, please run the following command in the `src` directory:

```bash
th run_me.lua
```

You should convert the `.png` images of the ICVL, NYU, and HANDS 2017 datasets to `.bin` files by running the code in the `data` folder. The dataset-specific data loading code is in `src/data/dataset_name/data.lua`.
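As an illustration of the kind of image-to-binary conversion the `data` folder performs, the sketch below round-trips a depth map through a flat little-endian float32 `.bin` file. This layout is an assumption for illustration only; the repository's own converter defines the actual `.bin` format.

```python
import os
import struct
import tempfile

def depth_to_bin(depth, path):
    # Write a depth map (a list of rows of floats, in millimeters) as
    # flat little-endian float32 values. NOTE: illustrative layout only;
    # the repository's converter in the data folder defines the real format.
    flat = [v for row in depth for v in row]
    with open(path, "wb") as f:
        f.write(struct.pack("<%df" % len(flat), *flat))

def bin_to_depth(path, rows, cols):
    # Read a depth map back from the flat float32 layout above.
    with open(path, "rb") as f:
        flat = struct.unpack("<%df" % (rows * cols), f.read())
    return [list(flat[r * cols:(r + 1) * cols]) for r in range(rows)]

# Round-trip a small synthetic depth map.
depth = [[500.0, 750.0], [1000.0, 1250.0]]
out = os.path.join(tempfile.mkdtemp(), "frame_0001.bin")
depth_to_bin(depth, out)
restored = bin_to_depth(out, 2, 2)
```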
We trained and tested our model on four 3D hand pose estimation datasets and one 3D human pose estimation dataset.
Here we provide the precomputed centers, estimated 3D coordinates, and pre-trained models for the ICVL, NYU, MSRA, HANDS2017, and ITOP datasets. You can download the precomputed centers and 3D hand pose results here, and the pre-trained models here.
The precomputed centers are obtained by training the hand center estimation network from DeepPrior++. Each line represents the 3D world coordinate of each frame. For the ICVL, NYU, MSRA, and HANDS2017 datasets, a frame is considered invalid if its depth map does not exist or does not contain a hand. For the ITOP dataset, a frame is considered invalid if its 'valid' variable is false. All test images are considered valid.
The 3D coordinates estimated on the ICVL, NYU, and MSRA datasets are pixel coordinates, while those estimated on the HANDS2017 and ITOP datasets are world coordinates. The estimated results are from the ensembled model. You can reproduce results from a single model by downloading the pre-trained model and testing it.
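Converting pixel-coordinate results to world coordinates uses the standard pinhole back-projection with each dataset's depth camera intrinsics. A minimal sketch follows; the intrinsic values in the example are placeholders for illustration, not the datasets' actual parameters.

```python
def pixel_to_world(u, v, d, fx, fy, cx, cy):
    # Standard pinhole back-projection: pixel (u, v) with depth d (mm)
    # to a 3D coordinate in the camera frame. fx, fy are the focal
    # lengths and (cx, cy) the principal point of the depth camera.
    x = (u - cx) * d / fx
    y = (v - cy) * d / fy
    return (x, y, d)

# Placeholder intrinsics for illustration only.
fx, fy, cx, cy = 588.0, 587.0, 320.0, 240.0
center = pixel_to_world(320.0, 240.0, 1000.0, fx, fy, cx, cy)  # -> (0.0, 0.0, 1000.0)
```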
We used awesome-hand-pose-estimation to evaluate the accuracy of V2V-PoseNet on the ICVL, NYU, and MSRA datasets.
Below are some qualitative results.