sanweiliti / RoHM

The official PyTorch code for RoHM: Robust Human Motion Reconstruction via Diffusion.
https://sanweiliti.github.io/ROHM/ROHM.html

RoHM

Robust Human Motion Reconstruction via Diffusion

Project Page | Paper

RoHM is a novel diffusion-based motion model that, conditioned on noisy and occluded input data, reconstructs complete, plausible motions in consistent global coordinates. We decompose the problem into two sub-tasks and learn two models, one for global trajectory and one for local motion. To capture the correlations between the two, we introduce a novel conditioning module and combine it with an iterative inference scheme.

Installation

Create a clean conda environment and install all dependencies with:

conda env create -f environment.yml

After the installation is complete, activate the conda environment with:

conda activate rohm
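
To confirm the environment is active before launching a long training run, one option is to check the `CONDA_DEFAULT_ENV` variable, which conda exports on activation. The sketch below assumes the environment is named `rohm`, as in the command above. The helper name `is_rohm_env_active` is hypothetical and not part of the repository.

```python
import os


def is_rohm_env_active(environ=None, expected="rohm"):
    """Return True if conda reports `expected` as the active environment.

    conda exports CONDA_DEFAULT_ENV on `conda activate`. This helper is an
    illustration only and is not part of the RoHM code base.
    """
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV") == expected


if __name__ == "__main__":
    print("rohm environment active:", is_rohm_env_active())
```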

Data preparation

AMASS

PROX

Download the following contents for the PROX dataset:

EgoBody

Download the following contents for the EgoBody dataset:

SMPL-X body model

Download the SMPL-X body model from here. Note that the latest version is 1.1, while we use 1.0 in the implementation.

Download the SMPL-X vertex segmentation file smplx_vert_segmentation.json from here.

Other data (checkpoints, results, etc.)

Download the model checkpoints from here. Download other processed/saved data from here and unzip, including:

Organize all downloaded data as below:

RoHM
├── data
│   ├── body_models
│   │   ├── smplx_model
│   │   │   ├── smplx
│   ├── checkpoints
│   ├── eval_noise_smplx
│   ├── init_motions
│   ├── test_results_release
│   ├── smplx_vert_segmentation.json
├── datasets
│   ├── AMASS_smplx_preprocessed
│   ├── PROX
│   ├── EgoBody
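
Before training or testing, it can help to verify that the layout above is in place. The following is a minimal sketch, assuming it is run from the repository root. The path list is copied from the tree above, while the helper name `missing_paths` is hypothetical and not part of the repository.

```python
from pathlib import Path

# Relative paths taken from the directory tree above.
EXPECTED_PATHS = [
    "data/body_models/smplx_model/smplx",
    "data/checkpoints",
    "data/eval_noise_smplx",
    "data/init_motions",
    "data/test_results_release",
    "data/smplx_vert_segmentation.json",
    "datasets/AMASS_smplx_preprocessed",
    "datasets/PROX",
    "datasets/EgoBody",
]


def missing_paths(repo_root="."):
    """Return the expected entries that do not exist under `repo_root`."""
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing entries:")
        for p in missing:
            print("  " + p)
    else:
        print("All expected data paths are present.")
```

The check only tests for existence; it does not validate the contents of each folder.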

Training

RoHM is trained on the AMASS dataset.

TrajNet Training

Train the vanilla TrajNet with a curriculum training scheme for three stages, with increasing noise ratios:

python train_trajnet.py --config=cfg_files/train_cfg/trajnet_train_vanilla_stage1.yaml 
python train_trajnet.py --config=cfg_files/train_cfg/trajnet_train_vanilla_stage2.yaml --pretrained_model_path=PATH/TO/MODEL
python train_trajnet.py --config=cfg_files/train_cfg/trajnet_train_vanilla_stage3.yaml --pretrained_model_path=PATH/TO/MODEL

For stages 2 and 3, set pretrained_model_path to the trained checkpoint from the previous stage. To obtain the reported checkpoint, we train for 800k/400k/450k steps for stages 1/2/3, respectively.
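
The chaining of stages can be sketched as a small command builder. This is an illustration under stated assumptions: the config paths and the `--pretrained_model_path` flag come from the commands above, while the helper `build_stage_command` is hypothetical. Where each stage saves its checkpoint is not specified here, so the caller must supply that path.

```python
import shlex

STAGE_CONFIGS = [
    "cfg_files/train_cfg/trajnet_train_vanilla_stage1.yaml",
    "cfg_files/train_cfg/trajnet_train_vanilla_stage2.yaml",
    "cfg_files/train_cfg/trajnet_train_vanilla_stage3.yaml",
]


def build_stage_command(stage, previous_checkpoint=None):
    """Build the train_trajnet.py command for curriculum stage 1, 2 or 3.

    `previous_checkpoint` is the checkpoint saved by the preceding stage.
    Its location depends on the training configuration, so it is passed in
    by the caller rather than guessed.
    """
    if stage not in (1, 2, 3):
        raise ValueError("stage must be 1, 2 or 3")
    cmd = ["python", "train_trajnet.py", f"--config={STAGE_CONFIGS[stage - 1]}"]
    if stage > 1:
        if previous_checkpoint is None:
            raise ValueError("stages 2 and 3 need the previous stage's checkpoint")
        cmd.append(f"--pretrained_model_path={previous_checkpoint}")
    return cmd


if __name__ == "__main__":
    # PATH/TO/STAGE1 and PATH/TO/STAGE2 are placeholders, as in the commands above.
    print(shlex.join(build_stage_command(1)))
    print(shlex.join(build_stage_command(2, "PATH/TO/STAGE1")))
    print(shlex.join(build_stage_command(3, "PATH/TO/STAGE2")))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the previous stage has finished and its checkpoint path is known.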

TrajNet fine-tuning with TrajControl:

python train_trajnet.py --config=cfg_files/train_cfg/trajnet_ft_trajcontrol.yaml --pretrained_backbone_path=PATH/TO/MODEL

Set pretrained_backbone_path to the pre-trained checkpoint of the vanilla TrajNet. We train for 400k steps to obtain the reported checkpoint.

PoseNet training

Train PoseNet with a curriculum training scheme for two stages, with increasing noise ratios:

python train_posenet.py --config=cfg_files/train_cfg/posenet_train_stage1.yaml
python train_posenet.py --config=cfg_files/train_cfg/posenet_train_stage2.yaml --pretrained_model_path=PATH/TO/MODEL

For stage 2, set pretrained_model_path to the trained checkpoint from the previous stage. To obtain the reported checkpoint, we train for 300k/200k steps for stages 1/2, respectively.
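
For planning compute, the step counts quoted in the training sections can be tallied per model. The numbers below are copied from the text above, reading the TrajControl figure as 400k steps. The dictionary layout and the name `total_steps_k` are only an illustrative sketch.

```python
# Step counts (in thousands) quoted in the training sections above.
STEPS_K = {
    "TrajNet vanilla": [800, 400, 450],
    "TrajNet + TrajControl fine-tuning": [400],
    "PoseNet": [300, 200],
}


def total_steps_k(schedule=STEPS_K):
    """Sum the per-stage step counts for each model, in thousands of steps."""
    return {name: sum(stages) for name, stages in schedule.items()}


if __name__ == "__main__":
    for name, total in total_steps_k().items():
        print(f"{name}: {total}k steps")
```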

Test and evaluate on AMASS

Test on AMASS

Test on AMASS with different configurations (corresponding to Tab. 1 in the paper) and save the reconstructed results to test_results/results_amass_full. Note that running the given configurations with the same random seed cannot guarantee exactly the same numbers across different machines; however, the stochasticity is quite small.

Evaluate on AMASS

Calculate the evaluation metrics and visualize/render the reconstructed results on AMASS.

Other flags for visualization and rendering:

Test and evaluate on PROX/EgoBody

Corresponds to the experiment setups in Tab. 2 and Tab. 3 in the paper.

Initialization

To obtain the initial (noisy and partially visible) motions on PROX, we use the following options:

We provide our preprocessed initial motion sequence in the folder data/init_motions, and the final output motion sequences from RoHM in the folder data/test_results_release for your reference.

Note that for the following scripts, the initial motions should have the z-axis up for PROX and the y-axis up for EgoBody.
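
If your own data uses the other up-axis convention, raw 3D points (for example joint positions) can be converted with a 90-degree rotation about the x-axis. The sketch below is an assumption-laden illustration, not RoHM's preprocessing: the function names are hypothetical, the choice of horizontal axes is one of several valid options, and SMPL-X parameters (global orientation and translation) need additional handling that is not shown.

```python
def y_up_to_z_up(points):
    """Rotate (x, y, z) points by +90 degrees about the x-axis.

    Maps the y-axis onto the z-axis: (x, y, z) -> (x, -z, y).
    """
    return [(x, -z, y) for x, y, z in points]


def z_up_to_y_up(points):
    """Inverse rotation: (x, y, z) -> (x, z, -y)."""
    return [(x, z, -y) for x, y, z in points]


if __name__ == "__main__":
    up_in_y_up_frame = [(0, 1, 0)]
    print(y_up_to_z_up(up_in_y_up_frame))
```

Both functions are proper rotations (determinant +1), so handedness is preserved and applying one after the other returns the original points.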

Test on PROX/EgoBody

Evaluate on PROX/EgoBody

Calculate the evaluation metrics and visualize/render the reconstructed results on PROX/EgoBody.

Other flags for visualization and rendering:

Customized Input

If you want to run RoHM on your customized input:

License

The majority of RoHM is licensed under CC-BY-NC (including the code, the released checkpoints, and the released dataset of initial and final motion sequences); however, portions of the project are available under separate license terms:

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{zhang2024rohm,
   title={RoHM: Robust Human Motion Reconstruction via Diffusion},
   author={Zhang, Siwei and Bhatnagar, Bharat Lal and Xu, Yuanlu and Winkler, Alexander and Kadlecek, Petr and Tang, Siyu and Bogo, Federica},
   booktitle={CVPR},
   year={2024}
 }