InterHand2.6M: A Dataset and Baseline for 3D Interacting Hand Pose Estimation from a Single RGB Image

Official PyTorch implementation of "InterHand2.6M: A Dataset and Baseline for 3D Interacting Hand Pose Estimation from a Single RGB Image", ECCV 2020.

Our new Re:InterHand dataset has been released. It has much more diverse image appearances and more stable 3D GT. Check it out here!

Introduction

The demo videos above have low-quality frames because of the compression for the README upload.

News

InterHand2.6M dataset

Demo on a random image

  1. Download the pre-trained InterNet from here
  2. Put the model in the demo folder
  3. Go to the demo folder and edit the bbox here (see the sketch after this list)
  4. Run python demo.py --gpu 0 --test_epoch 20
  5. You can see result_2D.jpg and a 3D viewer.
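For step 3, the bounding box is the hand region in the input image. A minimal sketch of the edit, assuming the box is stored as a plain list in [xmin, ymin, width, height] pixel format (the variable name, format, and values here are assumptions; check the linked line in demo.py):

# In demo.py: hand bounding box for the input image.
# Assumed format: [xmin, ymin, width, height] in pixels. Illustrative values only;
# set them to the hand region of your own image.
bbox = [69, 137, 165, 153]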

MANO mesh rendering demo

  1. Install SMPLX
  2. cd tool/MANO_render
  3. Set smplx_path in render.py (see the sketch after this list)
  4. Run python render.py
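For step 3, smplx_path should point at the folder containing the downloaded MANO model files. A minimal sketch, assuming the standard smplx model layout and the smplx.create API (the path is a placeholder):

import smplx

# Placeholder path; it must contain the MANO models downloaded from the MANO website,
# laid out the way the smplx package expects (e.g., mano/MANO_RIGHT.pkl).
smplx_path = '/path/to/smplx/models'

# Build left- and right-hand MANO layers with the smplx package.
mano_layer = {
    'right': smplx.create(smplx_path, 'mano', is_rhand=True, use_pca=False),
    'left': smplx.create(smplx_path, 'mano', is_rhand=False, use_pca=False),
}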

MANO parameter conversion from the world coordinate to the camera coordinate system

  1. Install SMPLX
  2. cd tool/MANO_world_to_camera/
  3. Set smplx_path in convert.py (see the sketch after this list)
  4. Run python convert.py
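The core of this conversion is the world-to-camera transform. A minimal sketch, assuming the InterHand2.6M convention that camrot is a 3x3 world-to-camera rotation and campos is the camera position in world coordinates, so a point maps as x_cam = camrot @ (x_world - campos). The values below are placeholders, and the full convert.py also has to re-anchor the MANO translation at the root joint, which is omitted here:

import cv2
import numpy as np

# Placeholder camera parameters in the assumed InterHand2.6M convention.
camrot = np.eye(3, dtype=np.float32)                    # (3, 3) world-to-camera rotation
campos = np.array([0.0, 0.0, 500.0], dtype=np.float32)  # (3,) camera position, world coords

def world2cam(x_world):
    # Map a world-coordinate point into the camera frame.
    return camrot @ (x_world - campos)

# The MANO global (root) orientation converts by composing rotations.
root_pose_world = np.array([0.1, -0.2, 0.3], dtype=np.float32)  # placeholder axis-angle
R_world, _ = cv2.Rodrigues(root_pose_world)              # axis-angle -> 3x3 rotation
root_pose_cam, _ = cv2.Rodrigues(camrot @ R_world)       # compose, then back to axis-angle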

Camera positions visualization demo

  1. cd tool/camera_visualize
  2. Run python camera_visualize.py
    • As there are many cameras, set subset and split on lines 9 and 10 yourself (see the sketch after this list).
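The two variables from the step above might look like this (a sketch; the exact lines and the available values depend on which release of the dataset you downloaded, so treat these settings as examples):

# Lines 9-10 of camera_visualize.py, per the step above.
subset = 'human_annot'  # example; 'machine_annot' is the other annotation subset
split = 'train'         # one of 'train', 'val', 'test'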

Directory

Root

${ROOT} is described below.

${ROOT}
|-- data
|-- common
|-- main
|-- output

Data

You need to follow the directory structure of the data below.

${ROOT}
|-- data
|   |-- STB
|   |   |-- data
|   |   |-- rootnet_output
|   |   |   |-- rootnet_stb_output.json
|   |-- RHD
|   |   |-- data
|   |   |-- rootnet_output
|   |   |   |-- rootnet_rhd_output.json
|   |-- InterHand2.6M
|   |   |-- annotations
|   |   |   |-- train
|   |   |   |-- test
|   |   |   |-- val
|   |   |-- images
|   |   |   |-- train
|   |   |   |-- test
|   |   |   |-- val
|   |   |-- rootnet_output
|   |   |   |-- rootnet_interhand2.6m_output_test.json
|   |   |   |-- rootnet_interhand2.6m_output_test_30fps.json
|   |   |   |-- rootnet_interhand2.6m_output_val.json
|   |   |   |-- rootnet_interhand2.6m_output_val_30fps.json

Output

You need to follow the directory structure of the output folder below.

${ROOT}
|-- output
|   |-- log
|   |-- model_dump
|   |-- result
|   |-- vis

Running InterNet

Start

Train

In the main folder, run

python train.py --gpu 0-3

to train the network on GPUs 0, 1, 2, and 3. --gpu 0,1,2,3 can be used instead of --gpu 0-3. If you want to continue an experiment, use --continue.
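For example, to resume a stopped run on the same GPUs:

python train.py --gpu 0-3 --continue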

Test

Place the trained model at output/model_dump/.

In the main folder, run

python test.py --gpu 0-3 --test_epoch 20 --test_set $DB_SPLIT

to test the network on GPUs 0, 1, 2, and 3 with snapshot_20.pth.tar. --gpu 0,1,2,3 can be used instead of --gpu 0-3.

$DB_SPLIT is one of [val,test].
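For example, to evaluate the epoch-20 snapshot on the validation set with a single GPU:

python test.py --gpu 0 --test_epoch 20 --test_set val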

Results

Here I provide the performance and pre-trained snapshots of InterNet, as well as the output of RootNet.

Pre-trained InterNet

Reference

@InProceedings{Moon_2020_ECCV_InterHand2.6M,
  author = {Moon, Gyeongsik and Yu, Shoou-I and Wen, He and Shiratori, Takaaki and Lee, Kyoung Mu},
  title = {InterHand2.6M: A Dataset and Baseline for 3D Interacting Hand Pose Estimation from a Single RGB Image},
  booktitle = {European Conference on Computer Vision (ECCV)},
  year = {2020}
}

License

InterHand2.6M is CC-BY-NC 4.0 licensed, as found in the LICENSE file.
