[ICCV'23] Official Implementation of Sequential Texts Driven Cohesive Motions Synthesis with Natural Transitions
Please visit our webpage for more details.


Getting Started

This code was tested on Ubuntu 20.04.1 LTS. Dependencies are installed through the conda environment described below.

1. Set up the environment

Create and activate the conda environment:

conda env create -f environment.yaml
conda activate ST2M_env
python -m spacy download en_core_web_sm
pip install git+https://github.com/openai/CLIP.git
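After the installation, a quick check that the packages from the two extra install commands are visible to the active interpreter can save a failed run later. The snippet below is a minimal sketch, not part of the repository: the module names `spacy`, `en_core_web_sm`, and `clip` are assumed from the commands above, and `find_missing` is a hypothetical helper.

```python
import importlib.util

# Import names assumed from the install commands above: spaCy, its
# en_core_web_sm model package, and the `clip` module from openai/CLIP.
REQUIRED_MODULES = ["spacy", "en_core_web_sm", "clip"]


def find_missing(modules):
    """Return the top-level module names that cannot be located."""
    missing = []
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All expected modules were found.")
```

Run it inside the activated ST2M_env environment; any name it reports points to an install step that did not complete.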

2. Dataset preparation

mkdir ./dataset/


The BABEL_TEACH dataset is currently not available due to the BABEL license. We will release the data preprocessing code in the future.


To request our STDM dataset, please fill out "The STDM Dataset Release Agreement" and email it to sisizhuang@buaa.edu.cn.

Unzip the BABEL_TEACH dataset and place it in ./dataset/

3. Download the pretrained models

mkdir ./checkpoints/

Download the pretrained models for the BABEL_TEACH and STDM datasets, then unzip them and place them in ./checkpoints/. The resulting layout should look like this:

./checkpoints/BABEL_TEACH/trainV13_LV1LT1LK001LA01_BABEL_TEACH/           # Sequential-text-to-motion generation model
./checkpoints/BABEL_TEACH/trainV13_LV1LT1LK001LA01_BABEL_TEACH_slerp/           # Sequential-text-to-motion generation model with slerp
(This is the same model as the one without the slerp operation; it is kept in a separate folder only to simplify the final evaluations.)
./checkpoints/BABEL_TEACH/Decomp_SP001_SM001_H512/ # Motion autoencoder
./checkpoints/BABEL_TEACH/text_mot_match_M10_BABEL_TEACH/          # Motion & Text feature extractors for evaluation

./checkpoints/STDM/trainV13_LV1LT1LK001LA01_STDM/           # Sequential-text-to-motion generation model
./checkpoints/STDM/trainV13_LV1LT1LK001LA01_STDM_slerp/           # Sequential-text-to-motion generation model with slerp
(This is the same model as the one without the slerp operation; it is kept in a separate folder only to simplify the final evaluations.)
./checkpoints/STDM/Decomp_SP001_SM001_H512/ # Motion autoencoder
./checkpoints/STDM/text_mot_match_M10_STDM/          # Motion & Text feature extractors for evaluation
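The layout above can be checked before training or evaluation. The following is a minimal sketch, assuming the folder names are exactly those listed; `missing_checkpoints` is a hypothetical helper and not a script shipped with this repository.

```python
from pathlib import Path

# Sub-directories from the layout above, relative to ./checkpoints/<dataset>/.
EXPECTED_DIRS = [
    "trainV13_LV1LT1LK001LA01_{dataset}",
    "trainV13_LV1LT1LK001LA01_{dataset}_slerp",
    "Decomp_SP001_SM001_H512",
    "text_mot_match_M10_{dataset}",
]


def missing_checkpoints(root, datasets=("BABEL_TEACH", "STDM")):
    """Return the expected checkpoint directories that do not exist under root."""
    root = Path(root)
    missing = []
    for dataset in datasets:
        for template in EXPECTED_DIRS:
            path = root / dataset / template.format(dataset=dataset)
            if not path.is_dir():
                missing.append(path)
    return missing


# Report anything that is not in place yet.
for path in missing_checkpoints("./checkpoints"):
    print("missing:", path)
```

If only one dataset is needed, pass it explicitly, for example `missing_checkpoints("./checkpoints", datasets=("STDM",))`.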

Training Models

Train the sequential-text-to-motion model

BABEL_TEACH dataset

python st2m_train.py --name ST2M_model_name --gpu_id 0 --dataset_name BABEL_TEACH

STDM dataset

python st2m_train.py --name ST2M_model_name --gpu_id 0 --dataset_name STDM

Train the motion & text feature extractors

BABEL_TEACH dataset

python st2m_train_tex_mot_match.py --name match_model_name --gpu_id 0 --dataset_name BABEL_TEACH

STDM dataset

python st2m_train_tex_mot_match.py --name match_model_name --gpu_id 0 --dataset_name STDM

Generating 3D Motion Animation

BABEL_TEACH dataset

without slerp operation

python st2m_gen_mul_motions_scipy_V2.py --gpu_id 0 --dataset_name BABEL_TEACH --name trainV13_LV1LT1LK001LA01_BABEL_TEACH --text_file ./inputs_texts/BABEL_TEACH/0.txt --ext 0 --repeat_times 3

with slerp operation

python st2m_gen_mul_motions_scipy_V2.py --gpu_id 0 --dataset_name BABEL_TEACH --name trainV13_LV1LT1LK001LA01_BABEL_TEACH --text_file ./inputs_texts/BABEL_TEACH/0.txt --ext 0_slerp --repeat_times 3 --do_slerp
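For readers unfamiliar with the term, slerp stands for spherical linear interpolation. This README does not describe how the generation script applies it, so the snippet below is only a generic, textbook-style illustration in plain Python, not the repository's implementation. It assumes two non-zero vectors of equal length; the function name and the fallback threshold are illustrative choices.

```python
import math


def slerp(v0, v1, t):
    """Spherical linear interpolation between two equal-length, non-zero vectors."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the two directions, clamped against rounding.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))
    omega = math.acos(dot)
    sin_omega = math.sin(omega)
    if sin_omega < 1e-6:
        # Nearly parallel (or antiparallel) vectors: fall back to linear interpolation.
        return [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]
    # Weights that move along the arc at constant angular speed.
    w0 = math.sin((1.0 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Halfway between two orthogonal unit vectors.
print(slerp([1.0, 0.0], [0.0, 1.0], 0.5))
```

Unlike plain linear interpolation, the midpoint of two unit vectors stays on the unit circle, which is why slerp is a common choice for blending rotations or directions.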

STDM dataset

without slerp operation

python st2m_gen_mul_motions_scipy_V2.py --gpu_id 0 --dataset_name STDM --name trainV13_LV1LT1LK001LA01_STDM --text_file ./inputs_texts/STDM/0.txt --ext 0 --repeat_times 3

with slerp operation

python st2m_gen_mul_motions_scipy_V2.py --gpu_id 0 --dataset_name STDM --name trainV13_LV1LT1LK001LA01_STDM --text_file ./inputs_texts/STDM/0.txt --ext 0_slerp --repeat_times 3 --do_slerp
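The four generation commands above differ only in the dataset name and the slerp flag, so they can be produced from one template. The following is a minimal sketch under that assumption: `build_generation_command` is a hypothetical helper, and input files other than 0.txt, as well as the rule that `--ext` follows the file index, are assumptions for illustration. The sketch only prints the commands and does not run them.

```python
import shlex

SCRIPT = "st2m_gen_mul_motions_scipy_V2.py"


def build_generation_command(dataset, index=0, do_slerp=False, gpu_id=0, repeat_times=3):
    """Build the argument list for one generation run, mirroring the commands above."""
    # Assumption: the --ext suffix follows the input file index, plus "_slerp" when enabled.
    ext = f"{index}_slerp" if do_slerp else str(index)
    args = [
        "python", SCRIPT,
        "--gpu_id", str(gpu_id),
        "--dataset_name", dataset,
        "--name", f"trainV13_LV1LT1LK001LA01_{dataset}",
        "--text_file", f"./inputs_texts/{dataset}/{index}.txt",
        "--ext", ext,
        "--repeat_times", str(repeat_times),
    ]
    if do_slerp:
        args.append("--do_slerp")
    return args


# Print the four commands shown above, one per line.
for dataset in ("BABEL_TEACH", "STDM"):
    for do_slerp in (False, True):
        print(shlex.join(build_generation_command(dataset, do_slerp=do_slerp)))
```

Returning an argument list rather than a string means the result can be passed to `subprocess.run` directly, with no shell quoting involved.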

Quantitative Evaluations

BABEL_TEACH dataset

python st2m_final_evaluations.py --dataset_name BABEL_TEACH --log_file_name final_contrast

STDM dataset

python st2m_final_evaluations.py --dataset_name STDM --log_file_name final_contrast


