[CVPR 2024] Official implementation of the paper "Towards Versatile Human-Human Interaction Analysis"
https://liangxuy.github.io/inter-x/

Inter-X: Towards Versatile Human-Human Interaction Analysis

This repository contains the content of the following paper:

Inter-X: Towards Versatile Human-Human Interaction Analysis
Liang Xu1,2, Xintao Lv1, Yichao Yan1, Xin Jin2, Shuwen Wu1, Congsheng Xu1, Yifan Liu1, Yizhou Zhou3, Fengyun Rao3, Xingdong Sheng4, Yunhui Liu4, Wenjun Zeng2, Xiaokang Yang1
1 Shanghai Jiao Tong University  2 Eastern Institute of Technology, Ningbo  3 WeChat, Tencent Inc.  4 Lenovo

News

TODO

Please stay tuned for any updates of the dataset and code!

Dataset Comparison

Dataset Download

Please fill out this form to request authorization to download Inter-X for research purposes.

We also provide the 40 action categories and the train/val/test splits under the datasets folder.

Data visualization

1. Visualize the SMPL-X parameters

We recommend using the AIT-Viewer to visualize the dataset.

Installation

pip install aitviewer
pip install -r visualize/smplx_viewer_tool/requirements.txt

You also need to download the SMPL-X body models and place them under visualize/smplx_viewer_tool/body_models:

├── SMPLX_FEMALE.npz
├── SMPLX_FEMALE.pkl
├── SMPLX_MALE.npz
├── SMPLX_MALE.pkl
├── SMPLX_NEUTRAL.npz
├── SMPLX_NEUTRAL.pkl
└── SMPLX_NEUTRAL_2020.npz

Usage

cd visualize/smplx_viewer_tool
# 1. Make sure the SMPL-X body models are downloaded
# 2. Create a soft link of the SMPL-X data to the smplx_viewer_tool folder
ln -s Your_Path_Of_SMPL-X ./data
# 3. Create a soft link of the texts annotations to the smplx_viewer_tool folder
ln -s Your_Path_Of_Texts ./texts
python data_viewer.py

2. Visualize the skeleton parameters

Usage

cd visualize/joint_viewer_tool
# 1. Create a soft link of the skeleton data to the joint_viewer_tool folder
ln -s Your_Path_Of_Joints ./data
# 2. Create a soft link of the texts annotations to the joint_viewer_tool folder
ln -s Your_Path_Of_Texts ./texts
python data_viewer.py

Data Loading

Each file/folder name of Inter-X is in the format of GgggTtttAaaaRrrr (e.g., G001T000A000R000), in which ggg is the human-human group number, ttt is the shoot number, aaa is the action label, and rrr is the split number.

The human-human group number is aligned with the big_five and familiarity annotations. The group number ranges from 001 to 059, and the action label ranges from 000 to 039.
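
A minimal sketch of parsing a sequence name into its four fields (the helper parse_seq_name is ours, for illustration only):

```python
import re

def parse_seq_name(name):
    """Split a sequence name like 'G001T000A000R000' into its four fields."""
    m = re.fullmatch(r"G(\d{3})T(\d{3})A(\d{3})R(\d{3})", name)
    if m is None:
        raise ValueError(f"Unexpected sequence name: {name}")
    group, shoot, action, split = (int(x) for x in m.groups())
    return {"group": group, "shoot": shoot, "action": action, "split": split}

print(parse_seq_name("G001T000A000R000"))
# {'group': 1, 'shoot': 0, 'action': 0, 'split': 0}
```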

The directory structure of the downloaded dataset is:

Inter-X_Dataset
├── LICENSE.md
├── annots
│   ├── action_setting.txt # 40 action categories
│   ├── big_five.npy # big-five personalities
│   ├── familiarity.txt # familiarity level, from 1-4, larger means more familiar
│   └── interaction_order.pkl # actor-reactor order, 0 means P1 is actor; 1 means P2 is actor
├── splits # train/val/test splittings
│   ├── all.txt
│   ├── test.txt
│   ├── train.txt
│   └── val.txt
├── motions.zip # SMPL-X parameters at 120 fps
├── skeletons.zip # skeleton parameters at 120 fps
└── texts.zip # textual descriptions
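
A minimal sketch for loading the annotation files; the exact array shapes and dictionary layouts below are our assumptions and should be checked against the downloaded files:

```python
import pickle
import numpy as np

annot_dir = "Inter-X_Dataset/annots"

# Big-five personality scores (assumed to be a numpy array indexed by group)
big_five = np.load(f"{annot_dir}/big_five.npy", allow_pickle=True)

# Familiarity levels (1-4, larger means more familiar; assumed one entry per group)
with open(f"{annot_dir}/familiarity.txt") as f:
    familiarity = [line.strip() for line in f if line.strip()]

# Actor-reactor order: 0 means P1 is the actor, 1 means P2 is the actor
with open(f"{annot_dir}/interaction_order.pkl", "rb") as f:
    interaction_order = pickle.load(f)

print(len(familiarity), len(interaction_order))
```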

- To load the motion data you can simply do:
```python
import numpy as np

motion = np.load('motions/G001T000A000R000/P1.npz')
motion_parms = {
    'root_orient': motion['root_orient'],  # controls the global root orientation
    'pose_body': motion['pose_body'],      # controls the body
    'pose_lhand': motion['pose_lhand'],    # controls the left hand articulation
    'pose_rhand': motion['pose_rhand'],    # controls the right hand articulation
    'trans': motion['trans'],              # controls the global body position
    'betas': motion['betas'],              # controls the body shape
    'gender': motion['gender'],            # controls the gender
}
```
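
If you want to recover the posed mesh from these parameters, a rough sketch using the smplx Python package is given below; the parameter-to-argument mapping (axis-angle poses, `use_pca=False` for the hand poses, 10 shape parameters) is our assumption and should be verified against the viewer code:

```python
import numpy as np
import smplx
import torch

motion = np.load('motions/G001T000A000R000/P1.npz')
T = motion['pose_body'].shape[0]

# Path assumed to contain an 'smplx' subfolder with the SMPLX_*.npz files; adjust to your layout
model = smplx.create('body_models', model_type='smplx',
                     gender=str(motion['gender']),  # assumed to be 'male' / 'female' / 'neutral'
                     use_pca=False, batch_size=T)

def to_t(key):
    # Flatten per-frame axis-angle parameters to (T, -1) tensors
    return torch.from_numpy(motion[key]).float().reshape(T, -1)

output = model(
    global_orient=to_t('root_orient'),
    body_pose=to_t('pose_body'),
    left_hand_pose=to_t('pose_lhand'),
    right_hand_pose=to_t('pose_rhand'),
    transl=to_t('trans'),
    # Assumes a single 10-D shape vector shared across frames
    betas=torch.from_numpy(motion['betas']).float().reshape(-1, 10)[:1].expand(T, -1),
)
vertices = output.vertices.detach().numpy()  # (T, 10475, 3)
```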

- To load the skeleton data you can simply do:
```python
# The skeleton topology is defined by OPTITRACK_LIMBS and SELECTED_JOINTS in joint_viewer_tool/data_viewer.py
import numpy as np
skeleton = np.load('skeletons/G001T000A000R000/P1.npy')  # skeleton.shape: (T, 64, 3)
```
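
Both the SMPL-X and skeleton sequences are provided at 120 fps. A minimal sketch of downsampling to 30 fps by frame striding (the preprocessing scripts below handle this step for training, so treat this only as an illustration):

```python
import numpy as np

skeleton = np.load('skeletons/G001T000A000R000/P1.npy')  # (T, 64, 3) at 120 fps

src_fps, dst_fps = 120, 30
stride = src_fps // dst_fps
skeleton_30fps = skeleton[::stride]  # keep every 4th frame

print(skeleton.shape, '->', skeleton_30fps.shape)
```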

Data preprocessing

We directly use the SMPL-X parameters to train the model. You can download the processed motion data and text data through the original Google Drive link or the Baidu Netdisk.

processed_data
├── glove
│   ├── hhi_vab_data.npy
│   ├── hhi_vab_idx.pkl
│   └── hhi_vab_words.pkl
├── motions
│   ├── test.h5
│   ├── train.h5
│   └── val.h5
└── texts_processed
    ├── G001T000A000R000.txt
    ├── G001T000A000R001.txt
    └── ......

Alternatively, you can run the data preprocessing yourself as follows:

Commands for preprocessing the Inter-X dataset for training and evaluation:

1. Clone the repository:
   ```
   git clone https://github.com/liangxuy/Inter-X.git
   cd Inter-X/preprocessing
   ```
2. Set up the environment:
   * Install ffmpeg (if not already installed)
     ```
     sudo apt update
     sudo apt install ffmpeg
     ```
   * Set up the conda environment
     ```
     conda env create -f environment.yml
     conda activate inter-x
     python -m spacy download en_core_web_sm
     pip install git+https://github.com/openai/CLIP.git
     ```
     You can also manually install en_core_web_sm by downloading [en_core_web_sm-2.3.0.tar.gz](https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.3.0/en_core_web_sm-2.3.0.tar.gz) and then running `pip install en_core_web_sm-2.3.0.tar.gz`.
3. Prepare the `motions.zip`, `texts.zip`, `splits`, etc.
4. Run the commands one by one:
   1. Motion data processing (we downsample to 30 fps for training and evaluation)
      ```
      python 1_prepare_data.py
      ```
   2. Split train, test and val
      ```
      python 2_split_train_val.py
      ```
   3. Process the text annotations. Download [glove.6B.zip](https://nlp.stanford.edu/data/glove.6B.zip) and set the path of `glove_file`.
      ```
      python 3_text_process.py
      ```
   4. Human reaction generation
      ```
      python 4_reaction_generation.py
      ```

Text to Motion

The code for this part is under evaluation/text2motion. We follow the work of text-to-motion to train and evaluate the text2motion model. You can set up the environment as in the original repository and then link the data folder to ./dataset/inter-x with a soft link.

Commands for training and evaluating the text2motion model:

We have provided the trained models on the dataset Google Drive link. You can download the checkpoints and put them under `checkpoints/hhi` to **skip steps 1~4**, organized as:

```
checkpoints/hhi
├── Comp_v6_KLD01
│   ├── model
│   │   └── latest.tar
│   └── opt.txt
├── Decomp_SP001_SM001_H512
│   └── model
│       └── latest.tar
├── length_est_bigru
│   └── model
│       └── latest.tar
└── text_mot_match
    └── model
        └── finest.tar
```

1. Training the motion autoencoder
   ```
   python train_decomp_v3.py --name Decomp_SP001_SM001_H512 --gpu_id 0 --window_size 24 --dataset_name hhi
   ```
2. Training the text2length model
   ```
   python train_length_est.py --name length_est_bigru --gpu_id 0 --dataset_name hhi
   ```
3. Training the text2motion model
   ```
   python train_comp_v6.py --name Comp_v6_KLD01 --gpu_id 0 --lambda_kld 0.01 --dataset_name hhi
   ```
4. Training the motion & text feature extractors
   ```
   python train_tex_mot_match.py --name text_mot_match --gpu_id 0 --batch_size 8 --dataset_name hhi
   ```
5. Quantitative evaluations
   ```
   python final_evaluation.py
   ```
   The statistical results will be saved to ./hhi_evaluation.log.

Action to Motion

Coming soon!

Citation

If you find the Inter-X dataset useful for your research, please cite us:

@inproceedings{xu2024inter,
  title={Inter-x: Towards versatile human-human interaction analysis},
  author={Xu, Liang and Lv, Xintao and Yan, Yichao and Jin, Xin and Wu, Shuwen and Xu, Congsheng and Liu, Yifan and Zhou, Yizhou and Rao, Fengyun and Sheng, Xingdong and others},
  booktitle={CVPR},
  pages={22260--22271},
  year={2024}
}