VIBE: Video Inference for Human Body Pose and Shape Estimation,
Muhammed Kocabas, Nikos Athanasiou, Michael J. Black,
IEEE Computer Vision and Pattern Recognition, 2020
Video Inference for Body Pose and Shape Estimation (VIBE) is a video pose and shape estimation method. It predicts the parameters of the SMPL body model for each frame of an input video. Please refer to our arXiv report for further details.
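To make "parameters of the SMPL body model for each frame" concrete, here is a minimal sketch of how such per-frame parameters are commonly laid out. It assumes the standard SMPL parameterization of 72 axis-angle pose values (24 joints × 3) and 10 shape coefficients; the function name `split_smpl_params` and the dictionary keys are hypothetical and are not taken from the VIBE code.

```python
import numpy as np

NUM_JOINTS = 24              # SMPL joints, including the root joint
POSE_DIM = NUM_JOINTS * 3    # one axis-angle vector (3 values) per joint
SHAPE_DIM = 10               # assumed number of shape coefficients (betas)


def split_smpl_params(pose, betas):
    """Split per-frame SMPL parameters into named parts (hypothetical helper).

    pose:  array of shape (num_frames, 72)
    betas: array of shape (num_frames, 10)
    """
    pose = np.asarray(pose, dtype=np.float64)
    betas = np.asarray(betas, dtype=np.float64)
    if pose.ndim != 2 or pose.shape[1] != POSE_DIM:
        raise ValueError(f"expected pose of shape (N, {POSE_DIM}), got {pose.shape}")
    if betas.shape != (pose.shape[0], SHAPE_DIM):
        raise ValueError(f"expected betas of shape (N, {SHAPE_DIM}), got {betas.shape}")

    # View the flat pose vector as one 3D axis-angle rotation per joint.
    joint_rotations = pose.reshape(-1, NUM_JOINTS, 3)
    return {
        "global_orient": joint_rotations[:, 0],   # root rotation, (N, 3)
        "body_pose": joint_rotations[:, 1:],      # remaining joints, (N, 23, 3)
        "betas": betas,                           # body shape, (N, 10)
    }


if __name__ == "__main__":
    parts = split_smpl_params(np.zeros((5, POSE_DIM)), np.zeros((5, SHAPE_DIM)))
    for name, value in parts.items():
        print(name, value.shape)
```

The pose changes from frame to frame, while the shape coefficients describe the body itself.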
This implementation:
VIBE has been implemented and tested on Ubuntu 18.04 with Python >= 3.7. It supports both GPU and CPU inference. If you don't have a suitable device, try running our Colab demo.
Clone the repo:
git clone https://github.com/mkocabas/VIBE.git
Install the requirements using `virtualenv` or `conda`:
# pip
source scripts/install_pip.sh
# conda
source scripts/install_conda.sh
We have prepared demo code to run VIBE on arbitrary videos. First, you need to download the required data (i.e., our trained model and the SMPL model parameters). To do this, run:
source scripts/prepare_data.sh
Then, running the demo is as simple as:
# Run on a local video
python demo.py --vid_file sample_video.mp4 --output_folder output/ --display
# Run on a YouTube video
python demo.py --vid_file https://www.youtube.com/watch?v=wPZP8Bwxplo --output_folder output/ --display
Refer to `doc/demo.md` for more details about the demo code.
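If you want to run the demo on several videos in a row, a small wrapper around the command above can help. The sketch below only uses the flags shown in this README (`--vid_file`, `--output_folder`, `--display`, `--sideview`); the helper name `build_demo_command` is hypothetical, and the script assumes it is run from the repository root.

```python
import shlex


def build_demo_command(vid_file, output_folder="output/", display=False,
                       sideview=False, python="python"):
    """Build the argument list for one demo.py run (hypothetical helper)."""
    cmd = [python, "demo.py",
           "--vid_file", vid_file,
           "--output_folder", output_folder]
    if display:
        cmd.append("--display")
    if sideview:
        cmd.append("--sideview")
    return cmd


if __name__ == "__main__":
    videos = [
        "sample_video.mp4",
        "https://www.youtube.com/watch?v=wPZP8Bwxplo",
    ]
    for video in videos:
        cmd = build_demo_command(video, sideview=True)
        # Print the command; to actually run it, use
        # subprocess.run(cmd, check=True) from the repository root.
        print(" ".join(shlex.quote(part) for part in cmd))
```

Passing the arguments as a list avoids shell quoting problems with URLs that contain `?` or `&`.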
Sample demo output with the `--sideview` flag:
We provide a script to convert VIBE output to standalone FBX/glTF files that can be used in 3D graphics tools such as Blender and Unity. Follow the steps below to be able to run the conversion script.
data/SMPL_unity_v.1.0.0
.
# To export glTF instead of FBX, give the --output file a .glb extension.
python lib/utils/fbx_output.py \
--input output/sample_video/vibe_output.pkl \
--output output/sample_video/fbx_output.fbx \
--fps_source 30 \
--fps_target 30 \
--gender <male or female> \
--person_id <tracklet id from VIBE output>
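To choose a value for `--person_id`, it helps to list the tracklets stored in the output file. The sketch below rests on assumptions that are not stated in this README: that `vibe_output.pkl` is written with `joblib` and holds a dictionary keyed by person id, and that each entry has a `frame_ids` array. Check `doc/demo.md` for the actual layout. The helper names are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def load_vibe_output(path: str) -> Dict[Any, Dict[str, Any]]:
    """Load the demo output file (assumes it was saved with joblib)."""
    import joblib  # third-party; imported here so the rest runs without it
    return joblib.load(path)


def list_tracklets(results: Dict[Any, Dict[str, Any]]) -> List[Tuple[Any, int]]:
    """Return (person_id, number_of_frames) pairs, longest tracklet first.

    Assumes each entry has a 'frame_ids' sequence with one item per frame.
    """
    summary = [(pid, len(data["frame_ids"])) for pid, data in results.items()]
    return sorted(summary, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Synthetic stand-in for a real output file, for illustration only.
    fake_results = {
        1: {"frame_ids": [0, 1, 2]},
        2: {"frame_ids": [0, 1, 2, 3, 4]},
    }
    for person_id, num_frames in list_tracklets(fake_results):
        print(f"person_id={person_id}: {num_frames} frames")
```

With a real file, you would call `list_tracklets(load_vibe_output("output/sample_video/vibe_output.pkl"))` and pass one of the listed ids to the conversion script.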
### Windows Installation Tutorial
You can follow the instructions provided by [@carlosedubarreto](https://github.com/carlosedubarreto) to install and run VIBE on a Windows machine:
- VIBE windows installation tutorial: https://youtu.be/3qhs5IRJ1LI
- FBX conversion: https://youtu.be/w1biKeiQThY
- Helper github repo: https://github.com/carlosedubarreto/vibe_win_install
## Google Colab
If you do not have a suitable environment to run this project then you could give Google Colab a try.
It allows you to run the project in the cloud, free of charge. You may try our Colab demo using the notebook we have prepared:
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1dFfwxZ52MN86FA6uFNypMEdFShd2euQA)
## Training
Run the commands below to start training:
```shell script
source scripts/prepare_training_data.sh
python train.py --cfg configs/config.yaml
```
Note that the training datasets should be downloaded and prepared before running the data processing script.
Please see `doc/train.md` for details on how to prepare them.
Here we compare VIBE with recent state-of-the-art methods on 3D pose estimation datasets. The evaluation metric is Procrustes-aligned mean per joint position error (PA-MPJPE) in mm; lower is better.
Models | 3DPW ↓ | MPI-INF-3DHP ↓ | H36M ↓ |
---|---|---|---|
SPIN | 59.2 | 67.5 | 41.1 |
Temporal HMR | 76.7 | 89.8 | 56.8 |
VIBE | 56.5 | 63.4 | 41.5 |
See `doc/eval.md` to reproduce the results in this table or evaluate a pretrained model.
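For readers unfamiliar with the metric, the following is a minimal NumPy sketch of PA-MPJPE for a single frame. It is not the evaluation code of this repository: it uses the standard SVD-based similarity alignment (scale, rotation, translation) and then averages the per-joint Euclidean error. The function names are hypothetical, and the result is in the same unit as the inputs, so joints given in meters need a factor of 1000 to match the table.

```python
import numpy as np


def procrustes_align(pred, gt):
    """Similarity-align pred (J, 3) to gt (J, 3) and return the aligned joints."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    mu_pred, mu_gt = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_pred, gt - mu_gt

    # The SVD of the 3x3 cross-covariance yields the optimal rotation.
    u, s, vt = np.linalg.svd(p.T @ g)
    if np.linalg.det(vt.T @ u.T) < 0:
        # Flip the last singular direction to avoid a reflection.
        vt[-1] *= -1
        s[-1] *= -1
    rot = vt.T @ u.T
    scale = s.sum() / (p ** 2).sum()
    return scale * p @ rot.T + mu_gt


def pa_mpjpe(pred, gt):
    """Mean per-joint position error after Procrustes alignment."""
    gt = np.asarray(gt, dtype=np.float64)
    aligned = procrustes_align(pred, gt)
    return float(np.linalg.norm(aligned - gt, axis=1).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt = rng.normal(size=(14, 3))
    # A scaled and shifted copy of gt plus small noise.
    pred = 1.5 * gt + np.array([0.1, -0.2, 0.3]) + 0.01 * rng.normal(size=gt.shape)
    print("PA-MPJPE:", pa_mpjpe(pred, gt))
```

Because the alignment removes global scale, rotation, and translation, the metric measures the accuracy of the articulated pose rather than the placement of the body in the scene.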
Correction: Due to a mistake in dataset preprocessing, the results for VIBE trained with 3DPW in Table 1 of the original paper are not correct. In addition, although training with 3DPW gives better quantitative performance, it does not give good qualitative results. The arXiv version will be updated with the corrected results.
@inproceedings{kocabas2019vibe,
title={VIBE: Video Inference for Human Body Pose and Shape Estimation},
author={Kocabas, Muhammed and Athanasiou, Nikos and Black, Michael J.},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}
This code is available for non-commercial scientific research purposes as defined in the LICENSE file. By downloading and using this code you agree to the terms in the LICENSE. Third-party datasets and software are subject to their respective licenses.
We indicate inside each file whether a function or script is borrowed from an external source. Here are some great resources we benefit from: