mks0601 / Hand4Whole_RELEASE

Official PyTorch implementation of "Accurate 3D Hand Pose Estimation for Whole-Body 3D Human Mesh Estimation", CVPRW 2022 (Oral.)
MIT License
314 stars 31 forks source link

Accurate 3D Hand Pose Estimation for Whole-Body 3D Human Mesh Estimation (Hand4Whole codes)

High-resolution video link: here

Introduction

This repo is official PyTorch implementation of [Accurate 3D Hand Pose Estimation for Whole-Body 3D Human Mesh Estimation (CVPRW 2022 Oral.)](https://arxiv.org/abs/2011.11534). This repo contains whole-body codes. For the body-only, hand-only, and face-only codes, visit here.

Quick demo

Directory

Root

The ${ROOT} is described as below.

${ROOT}  
|-- data  
|-- demo
|-- main  
|-- tool
|-- output  
|-- common
|   |-- utils
|   |   |-- human_model_files
|   |   |   |-- smpl
|   |   |   |   |-- SMPL_NEUTRAL.pkl
|   |   |   |-- smplx
|   |   |   |   |-- MANO_SMPLX_vertex_ids.pkl
|   |   |   |   |-- SMPL-X__FLAME_vertex_ids.npy
|   |   |   |   |-- SMPLX_NEUTRAL.pkl
|   |   |   |   |-- SMPLX_to_J14.pkl
|   |   |   |-- mano
|   |   |   |   |-- MANO_LEFT.pkl
|   |   |   |   |-- MANO_RIGHT.pkl
|   |   |   |-- flame
|   |   |   |   |-- flame_dynamic_embedding.npy
|   |   |   |   |-- flame_static_embedding.pkl
|   |   |   |   |-- FLAME_NEUTRAL.pkl

Data

You need to follow directory structure of the data as below.

${ROOT}  
|-- data  
|   |-- AGORA
|   |   |-- data
|   |   |   |-- AGORA_train.json
|   |   |   |-- AGORA_validation.json
|   |   |   |-- AGORA_test_bbox.json
|   |   |   |-- images_1280x720
|   |   |   |-- images_3840x2160
|   |   |   |-- smplx_params_cam
|   |   |   |-- cam_params
|   |-- EHF
|   |   |-- data
|   |   |   |-- EHF.json
|   |-- Human36M  
|   |   |-- images  
|   |   |-- annotations  
|   |-- MPII
|   |   |-- data
|   |   |   |-- images
|   |   |   |-- annotations
|   |-- MPI_INF_3DHP
|   |   |-- data
|   |   |   |-- images_1k
|   |   |   |-- MPI-INF-3DHP_1k.json
|   |   |   |-- MPI-INF-3DHP_camera_1k.json
|   |   |   |-- MPI-INF-3DHP_joint_3d.json
|   |   |   |-- MPI-INF-3DHP_SMPL_NeuralAnnot.json
|   |-- MSCOCO  
|   |   |-- images  
|   |   |   |-- train2017  
|   |   |   |-- val2017  
|   |   |-- annotations 
|   |-- PW3D
|   |   |-- data
|   |   |   |-- 3DPW_train.json
|   |   |   |-- 3DPW_validation.json
|   |   |   |-- 3DPW_test.json
|   |   |-- imageFiles

Output

You need to follow the directory structure of the output folder as below.

${ROOT}  
|-- output  
|   |-- log  
|   |-- model_dump  
|   |-- result  
|   |-- vis  

Running Hand4Whole

Train

The training consists of three stages.

1st: pre-train Hand4Whole

In the main folder, run

python train.py --gpu 0-3 --lr 1e-4 --continue

to train Hand4Whole on the GPU 0,1,2,3. --gpu 0,1,2,3 can be used instead of --gpu 0-3. To train Hand4Whole from the pre-trained 2D human pose estimation network, download this and place it at tool. Then, run python convert_simple_to_pose2pose.py, which produces snapshot_0.pth.tar. Finally, place snapshot_0.pth.tar to output/model_dump.

2nd: pre-train hand-only Pose2Pose

Download pre-trained hand-only Pose2Pose from here. Place the hand-only Pose2Pose to tool/snapshot_12_hand.pth.tar. Also, place the pre-trained Hand4Whole of the first stage to tool/snapshot_6_all.pth.tar. Then, go to tool folder and run python merge_hand_to_all.py. Place the generated snapshot_0.pth.tar to output/model_dump.

Or, you can pre-train hand-only Pose2Pose by yourself. Switch to Pose2Pose branch and train hand-only Pose2Pose on MSCOCO, FreiHAND, InterHand2.6M.

3rd: combine pre-trained Hand4Whole and hand-only Pose2Pose and fine-tune it

Move snapshot_6.pth.tar of the 1st stage to tool/snapshot_6_all.pth.tar. Then, move snapshot_12.pth.tar of the 2nd stage to tool/snapshot_12_hand.pth.tar. Run python merge_hand_to_all.py at the tool folder. Move generated snapshot_0.pth.tar to output/model_dump. In the main folder, run

python train.py --gpu 0-3 --lr 1e-5 --continue

to train Hand4Whole on the GPU 0,1,2,3. --gpu 0,1,2,3 can be used instead of --gpu 0-3.

Test

Place trained model at the output/model_dump/.

In the main folder, run

python test.py --gpu 0-3 --test_epoch 6

to test Hand4Whole on the GPU 0,1,2,3 with60th epoch trained model. --gpu 0,1,2,3 can be used instead of --gpu 0-3.

Models

Results

3D whole-body results

3D body-only and hand-only results

For the 3D body-only and hand-only codes, visit here.

Troubleshoots

Reference

@InProceedings{Moon_2022_CVPRW_Hand4Whole,  
author = {Moon, Gyeongsik and Choi, Hongsuk and Lee, Kyoung Mu},  
title = {Accurate 3D Hand Pose Estimation for Whole-Body 3D Human Mesh Estimation},  
booktitle = {Computer Vision and Pattern Recognition Workshop (CVPRW)},  
year = {2022}  
}