
Towards Robust 3D Body Mesh Inference of Partially-observed Humans

[Report] [Slides]

Qualitative evaluation on video captures of partially-observed humans in Star Trek: The Next Generation. From left to right: input image, PIXIE, ExPose, ours.

Demo on one of the test videos in the EgoBody dataset.

Description

This repository contains the fitting and evaluation code used for the experiments in Towards Robust 3D Body Mesh Inference of Partially-observed Humans.

Keypoints Blending

From left to right: OpenPose BODY_25 keypoints, MMPose Halpe keypoints, and the blended result.

We perform confidence calibration to blend keypoint detection results from two detectors: OpenPose (BODY_25 format) and MMPose (Halpe format). The per-keypoint heuristics computed on the SHHQ dataset, as described in the paper, can be downloaded here.
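For illustration, a minimal sketch of the blending step, assuming both detectors' outputs have already been mapped to a common joint ordering; calib is a hypothetical stand-in for the per-keypoint calibration heuristics, not the exact scheme from the paper:

import numpy as np

def blend_keypoints(kp_openpose, kp_mmpose, calib):
    # kp_openpose, kp_mmpose: (J, 3) arrays of (x, y, confidence),
    # already mapped to a common joint ordering.
    # calib: (J,) hypothetical per-keypoint factors rescaling the
    # MMPose confidences onto the OpenPose confidence scale.
    c_op = kp_openpose[:, 2]
    c_mm = kp_mmpose[:, 2] * calib
    # Per keypoint, keep the detection with the higher calibrated confidence.
    blended = np.where((c_mm > c_op)[:, None], kp_mmpose, kp_openpose)
    blended[:, 2] = np.maximum(c_op, c_mm)
    return blended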

We provide a Colab notebook for keypoint blending and visualization: Open In Colab

Dependencies

Follow the installation instructions for each of the following before using the fitting code. For some of the components, please install our modified versions, which have been adapted and tested for our optimization pipeline. A rough environment sketch follows the list below.

  1. PyTorch
  2. SMPL-X
  3. VPoser
  4. Trimesh for loading triangular meshes
  5. Pyrender for visualization
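For reference, a generic environment setup might look like the following; this is only a sketch of the stock packages, and the modified forks linked above take precedence:

pip install torch trimesh pyrender
pip install smplx                                              # SMPL-X body model
pip install git+https://github.com/nghorbani/human_body_prior  # VPoser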

Optional Dependencies

  1. PyTorch Mesh self-intersection for interpenetration penalty
  2. Homogenus for gender classification
  3. ExPose to use its predictions as prior / initialization
  4. PIXIE to use its predictions as prior / initialization

Fitting

Run the fitting code with the following command:

python smplifyx/main.py --config cfg_files/fit_smplx.yaml \
    --data_folder DATA_FOLDER \
    --output_folder OUTPUT_FOLDER \
    --gender GENDER \
    --visualize="True/False" \
    --model_folder MODEL_FOLDER \
    --vposer_ckpt VPOSER_FOLDER \
    --interpenetration="True/False" \
    --part_segm_fn smplifyx/smplx_parts_segm.pkl \
    --save_vertices="True/False" \
    --focal_length=FOCAL_LENGTH \
    --use_gender_classifier="True/False" \
    --homogeneous_ckpt HOMOGENUS_PRETRAINED_MODEL_FOLDER \
    --expose_results_directory EXPOSE_RESULTS_FOLDER \
    --pixie_results_directory PIXIE_RESULTS_FOLDER \
    --regression_prior='combined/ExPose/PIXIE/PARE/None'

where DATA_FOLDER should contain two subfolders: images, where the input images are located, and keypoints, where the keypoint files in OpenPose JSON format are stored. A possible layout is sketched below.
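For example (file names are illustrative; keypoint files follow the usual OpenPose *_keypoints.json naming):

DATA_FOLDER
├── images
│   ├── img_0001.png
│   └── img_0002.png
└── keypoints
    ├── img_0001_keypoints.json
    └── img_0002_keypoints.json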

If use_gender_classifier is set to True, homogeneous_ckpt should contain the path to the pre-trained model of the gender classifier Homogenus. If it is set to False, a gender flag --gender='male/female/neutral' should be used instead. The gender predictions are not very accurate for aggressive truncations and low-resolution images; in such cases, we recommend specifying the gender manually. Both variants are illustrated below.
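For example (paths and the gender value are placeholders), either:

--use_gender_classifier="True" --homogeneous_ckpt /path/to/homogenus_models

or, with a manually specified gender:

--use_gender_classifier="False" --gender='female'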

For aggressive truncations in video captures or social media images, the focal length approximation mentioned in the paper can also be inaccurate. In such cases, it is recommended to test different focal length values, e.g. with a sweep like the one below.
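A simple sweep might look like this; the focal length values are illustrative, not from the paper:

# Sweep focal length values; add the remaining flags from the command above.
for f in 500 1000 1500 2000; do
    python smplifyx/main.py --config cfg_files/fit_smplx.yaml \
        --data_folder DATA_FOLDER \
        --output_folder OUTPUT_FOLDER_$f \
        --focal_length=$f
done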

If you would like to use the combined body prior proposed in the paper, set expose_results_directory to the directory of ExPose prediction results and pixie_results_directory to the directory of PIXIE prediction results.

Two samples from the cropped EHF dataset, with blended keypoints and ExPose and PIXIE prediction results, are provided here to allow you to reproduce the results in our paper. Disclaimer: the EHF dataset is for research purposes only. The entire dataset can be downloaded here after registration.

We provide a Colab notebook with all required dependencies for fitting: Open In Colab

Evaluation

Qualitative evaluation on the cropped EHF dataset. From left to right: input image, PARE, PIXIE, ExPose, ours.

Heatmap comparison on the SMPL-X body model using the PA-V2V metric for PIXIE, ExPose, and ours, averaged over all 100 images in the cropped EHF dataset.

To perform quantitative evaluation on the cropped EHF dataset, see smplifyx/eval.py. The data required for evaluation, including vertex indices for different body parts, the bounding boxes we used to crop the EHF dataset, and the weights for the vertex-to-14-joints regressor, can be downloaded here.
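For reference, PA-V2V (Procrustes-aligned vertex-to-vertex error) is commonly computed as below; this is a sketch of the standard definition, with smplifyx/eval.py being the reference implementation:

import numpy as np

def procrustes_align(pred, gt):
    # Similarity (Procrustes) alignment of pred (N, 3) onto gt (N, 3).
    mu_p, mu_g = pred.mean(0), gt.mean(0)
    p, g = pred - mu_p, gt - mu_g
    U, S, Vt = np.linalg.svd(p.T @ g)  # SVD of the 3x3 covariance
    if np.linalg.det(U @ Vt) < 0:      # avoid reflections
        Vt[-1] *= -1
        S[-1] *= -1
    R = (U @ Vt).T                     # optimal rotation (Kabsch)
    scale = S.sum() / (p ** 2).sum()   # optimal isotropic scale
    return scale * p @ R.T + mu_g

def pa_v2v(pred_vertices, gt_vertices):
    # Mean per-vertex Euclidean distance after Procrustes alignment.
    aligned = procrustes_align(pred_vertices, gt_vertices)
    return np.linalg.norm(aligned - gt_vertices, axis=1).mean()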

Acknowledgement & Citation

This work is a Master's semester project at the Computer Vision and Learning Group (VLG), ETH Zurich, by Xiyi Chen, supervised by Dr. Sergey Prokudin. The code is built on SMPLify-X. If you find this work useful for your research, please consider citing:

@misc{smplify_x_partial,
  title = {Towards Robust 3D Body Mesh Inference of Partially-observed Humans},
  howpublished = {\url{https://github.com/xiyichen/smplify-x-partial}},
  author = {Chen, Xiyi},
}

@inproceedings{SMPL-X:2019,
  title = {Expressive Body Capture: 3D Hands, Face, and Body from a Single Image},
  author = {Pavlakos, Georgios and Choutas, Vasileios and Ghorbani, Nima and Bolkart, Timo and Osman, Ahmed A. A. and Tzionas, Dimitrios and Black, Michael J.},
  booktitle = {Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR)},
  year = {2019}
}