This repository contains the code for the following paper:
ReGenNet: Towards Human Action-Reaction Synthesis
Liang Xu1,2, Yizhou Zhou3, Yichao Yan1, Xin Jin2, Wenhan Zhu, Fengyun Rao3, Xiaokang Yang1, Wenjun Zeng2
1 Shanghai Jiao Tong University, 2 Eastern Institute of Technology, Ningbo, 3 WeChat, Tencent Inc.
First, clone the repository with the following command:
git clone https://github.com/liangxuy/ReGenNet.git
cd ReGenNet
Set up the environment
We also provide a Dockerfile (docker/Dockerfile) if you want to build your own Docker environment.
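If you go the Docker route, a minimal build-and-run sketch follows; the image tag regennet and the mount point are placeholders of ours, so adjust the GPU flags and paths to your setup:
# Build the image from the provided Dockerfile
docker build -t regennet -f docker/Dockerfile .
# Start an interactive container with GPU access and the repository mounted
docker run --gpus all -it -v $(pwd):/workspace/ReGenNet regennet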
Download other required files
You can download the pretrained models from Google Drive and move them to the save folder to reproduce the results.
You need to download the action recognition models from Google Drive and move them to the recognition_training folder for evaluation.
Download the SMPL neutral models from the SMPL website and the SMPL-X models from the SMPL-X website, and then move them to body_models/smpl and body_models/smplx. We also provide a copy here for convenience.
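After these downloads, the expected layout is roughly as follows (a sketch assembled from the paths above; the exact file names inside body_models depend on the SMPL/SMPL-X releases you download):
body_models/
├── smpl/     # SMPL neutral model
└── smplx/    # SMPL-X models
save/                   # pretrained ReGenNet checkpoints
recognition_training/   # action recognition checkpoints for evaluation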
Since the license of the NTU RGB+D 120 dataset does not allow us to distribute its data and annotations, we cannot release the processed NTU RGB+D 120 dataset publicly. If you are interested in the processed data, please email me.
You can download the original dataset here and the actor-reactor order annotations here.
You can also download the processed dataset from Google Drive and put it under the dataset/chi3d folder.
You can download the original dataset here and the actor-reactor order annotations here, and put them under the dataset/interhuman folder.
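For reference, a sketch of the resulting dataset layout implied by the paths above (the processed NTU RGB+D 120 files, e.g. xsub.train.h5 and xsub.test.h5 from the commands below, are obtained by email and can live anywhere, since their location is passed via --data_path):
dataset/
├── chi3d/        # processed Chi3D data, e.g. chi3d_smplx_train.h5
└── interhuman/   # InterHuman data and actor-reactor order annotations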
We provide the script to train the model under the online and unconstrained setting for human action-reaction synthesis on the NTU120-AS dataset. The --arch, --unconstrained, and --dataset flags can be customized for different settings.
Training with 1 GPU:
# NTU RGB+D 120 Dataset
python -m train.train_mdm --setting cmdm --save_dir save/cmdm/ntu_smplx --dataset ntu --cond_mask_prob 0 --num_person 2 --layers 8 --num_frames 60 --arch online --overwrite --pose_rep rot6d --body_model smplx --data_path PATH/TO/xsub.train.h5 --train_platform_type TensorboardPlatform --vel_threshold 0.03 --unconstrained
# Chi3D dataset
python -m train.train_mdm --setting cmdm --save_dir save/cmdm/chi3d_smplx --dataset chi3d --cond_mask_prob 0 --num_person 2 --layers 8 --num_frames 150 --arch online --overwrite --pose_rep rot6d --body_model smplx --data_path PATH/TO/chi3d_smplx_train.h5 --train_platform_type TensorboardPlatform --vel_threshold 0.01 --unconstrained
Training with multiple GPUs (4 GPUs in the example):
mpiexec -n 4 --allow-run-as-root python -m train.train_mdm --setting cmdm --save_dir save/cmdm/ntu_smplx --dataset ntu --cond_mask_prob 0 --num_person 2 --layers 8 --num_frames 60 --arch online --overwrite --pose_rep rot6d --body_model smplx --data_path PATH/TO/xsub.train.h5 --train_platform_type TensorboardPlatform --vel_threshold 0.03 --unconstrained
For the action recognition model, you can either:
Directly download the trained action recognition model here and place it as shown in the sketch after this list; or
Train your own action recognition model. The code for training the action recognition model is based on the ACTOR repository.
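A minimal placement sketch for the downloaded checkpoint, assuming it is named checkpoint_0100.pth.tar as in the example evaluation command below:
# Put the recognition checkpoint where the evaluation step expects it
mkdir -p recognition_training
mv checkpoint_0100.pth.tar recognition_training/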
The following script will evaluate the trained model at PATH/TO/model_XXXX.pt; --rec_model_path specifies the action recognition model. The results will be written to PATH/TO/evaluation_results_XXXX_full.yaml. We use ddim5 to accelerate the evaluation process.
python -m eval.eval_cmdm --model PATH/TO/model_XXXX.pt --eval_mode full --rec_model_path PATH/TO/checkpoint_0100.pth.tar --use_ddim --timestep_respacing ddim5
If you want a table with means and intervals, you can use this script:
python -m eval.easy_table PATH/TO/evaluation_results_XXXX_full.yaml
Generate the results; they will be saved to results.npy.
python -m sample.cgenerate --model_path PATH/TO/model_XXXX.pt --action_file assets/action_names_XXX.txt --num_repetitions 10 --dataset ntu --body_model smplx --num_person 2 --pose_rep rot6d --data_path PATH/TO/xsub.test.h5 --output_dir XXX
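Before rendering, you can sanity-check the generated file from the shell (a minimal sketch; we assume the .npy holds a pickled object, hence allow_pickle=True):
# Print the type and, if available, the shape of the saved results
python -c "import numpy as np; d = np.load('PATH/TO/results.npy', allow_pickle=True); print(type(d), getattr(d, 'shape', None))"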
Render the results
Install additional dependencies:
pip install trimesh pyrender imageio-ffmpeg
python -m render.crendermotion --data_path PATH/TO/results.npy --num_person 2 --setting cmdm --body_model smplx
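On a headless server, pyrender's offscreen rendering usually requires an EGL or OSMesa backend; a sketch assuming EGL drivers are installed:
# Select the EGL backend for offscreen rendering without a display
PYOPENGL_PLATFORM=egl python -m render.crendermotion --data_path PATH/TO/results.npy --num_person 2 --setting cmdm --body_model smplx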
We want to thank the following projects, on which our code is based:
ACTOR, motion diffusion model, guided diffusion, text-to-motion, HumanML3D
This code is distributed under the MIT License.
Note that our code depends on other libraries, including CLIP, SMPL, SMPL-X, and PyTorch3D, and uses datasets that each have their own licenses, which must also be followed.
If you find ReGenNet is useful for your research, please cite us:
@inproceedings{xu2024regennet,
title={ReGenNet: Towards Human Action-Reaction Synthesis},
author={Xu, Liang and Zhou, Yizhou and Yan, Yichao and Jin, Xin and Zhu, Wenhan and Rao, Fengyun and Yang, Xiaokang and Zeng, Wenjun},
booktitle={CVPR},
pages={1759--1769},
year={2024}
}