This model has been implemented and tested on Ubuntu 18.04 with Python >= 3.7. An NVIDIA GPU is required.
Clone the repo:

```bash
git clone https://github.com/paradoxWu/parallel-branchfor-hpe.git
```
Install the requirements using virtualenv or conda:

```bash
# pip
source scripts/install_pip.sh
# conda
source scripts/install_conda.sh
```
Run the command below to start training:

```bash
python train.py --cfg configs/config.yaml
```
Note that the training datasets should be downloaded and prepared before running the data processing script. Please see doc/train.md for details on how to prepare them.
Here we compare our method with recent state-of-the-art methods on 3D pose estimation datasets. The first table reports Mean Per Joint Position Error (MPJPE) and the second Procrustes-aligned MPJPE (PA-MPJPE), both in mm.
MPJPE (mm):

Models | 3DPW↓ | MPI-INF-3DHP↓ |
---|---|---|
SPIN | 96.9 | 105.2 |
Pose2Mesh | 89.2 | - |
VIBE | 93.5 | 96.6 |
Ours | 85.7 | 95.8 |
PA-MPJPE (mm):

Models | 3DPW↓ | MPI-INF-3DHP↓ |
---|---|---|
SPIN | 59.2 | 67.5 |
Pose2Mesh | 58.3 | - |
VIBE | 56.5 | 63.4 |
Ours | 53.1 | 65.0 |
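For reference, PA-MPJPE first aligns the predicted joints to the ground truth with a similarity (Procrustes) transform and then averages the per-joint Euclidean errors; plain MPJPE skips the alignment step. A minimal NumPy sketch of the metric (the function name and the `(J, 3)` array layout are our own choices, not taken from this repo's evaluation code):

```python
import numpy as np

def pa_mpjpe(pred, gt):
    """Procrustes-aligned MPJPE in the units of the inputs (here: mm).

    pred, gt: (J, 3) arrays of 3D joint positions.
    """
    # Center both point sets
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # Optimal rotation from the SVD of the cross-covariance matrix
    U, S, Vt = np.linalg.svd(p.T @ g)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # correct for an improper rotation (reflection)
        Vt[-1] *= -1
        S[-1] *= -1
        R = Vt.T @ U.T
    # Optimal isotropic scale
    scale = S.sum() / (p ** 2).sum()
    aligned = scale * p @ R.T + mu_g
    return np.linalg.norm(aligned - gt, axis=1).mean()
```

Because the similarity transform removes global rotation, translation, and scale, PA-MPJPE isolates errors in the articulated pose itself, which is why its numbers are consistently lower than MPJPE in the tables above.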
See doc/eval.md to reproduce the results in this table or evaluate a pretrained model.
Checkpoint | Google Drive | Baidu Pan |
---|---|---|
GRU | Google Drive | Baidu |
Transformer | TBD | TBD |
This code is available for non-commercial scientific research purposes as defined in the LICENSE file. By downloading and using this code you agree to the terms in the LICENSE. Third-party datasets and software are subject to their respective licenses.
How to cite this article:
Wu Y, Wang C. Parallel-branch network for 3D human pose and shape estimation in video. Comput Anim Virtual Worlds. 2022;e2078. https://doi.org/10.1002/cav.2078
We indicate inside each file whether a function or script is borrowed from an external source. Here are some great resources we benefited from: