This is the README for the official code of the paper Speech2AffectiveGestures: Synthesizing Co-Speech Gestures with Generative Adversarial Affective Expression Learning. Please use the following citation if you find our work useful:
@inproceedings{bhattacharya2021speech2affectivegestures,
  author = {Bhattacharya, Uttaran and Childs, Elizabeth and Rewkowski, Nicholas and Manocha, Dinesh},
  title = {Speech2AffectiveGestures: Synthesizing Co-Speech Gestures with Generative Adversarial Affective Expression Learning},
  year = {2021},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  booktitle = {Proceedings of the 29th ACM International Conference on Multimedia},
  series = {MM '21}
}
Our scripts have been tested on Ubuntu 18.04 LTS with Python 3.7, set up in a conda environment as described below.
We use $BASE to refer to the base directory for this project (the directory containing main_v2.py). Change the present working directory to $BASE.
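For example, assuming the repository has been cloned to a path of your choice (the path below is only a placeholder):
export BASE=/path/to/speech2affective_gestures
cd $BASE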
Create and activate a conda environment:
conda create -n s2ag-env python=3.7
conda activate s2ag-env
Install espeak:
sudo apt-get update && sudo apt-get install espeak
Install PyTorch following the official instructions.
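For example, inside the conda environment created above, a CUDA 10.2 build can be installed as follows (the CUDA version and package selection are assumptions; use whatever the official selector recommends for your system):
conda install pytorch torchvision cudatoolkit=10.2 -c pytorch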
Install all other package requirements.
pip install -r requirements.txt
Note: You might need to manually uninstall and reinstall numpy for torch to work. Similarly, you might need to manually uninstall and reinstall matplotlib and kiwisolver for them to work.
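A minimal sketch of that reinstall step, using pip inside the activated environment:
pip uninstall -y numpy && pip install numpy
pip uninstall -y matplotlib kiwisolver && pip install matplotlib kiwisolver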
The TED Gestures dataset is available for download here, originally hosted at https://github.com/ai4r/Gesture-Generation-from-Trimodal-Context.
The Trinity Gesture dataset is available for download upon submitting an access request here.
Run the main_v2.py file with the appropriate command-line arguments:
python main_v2.py <args list>
The full list of arguments is available inside main_v2.py. For any argument not specified on the command line, the code uses the default value for that argument.
On running main_v2.py, the code will train the network and generate sample gestures post-training.
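For example, a training run with all default arguments, or with the training flag set explicitly (--train-s2ag is the only argument name taken from this readme; consult main_v2.py for the rest):
python main_v2.py
python main_v2.py --train-s2ag True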
We also provide a pretrained model for download. If using this model, save it inside the directory $BASE/models/ted_db (create the directory if it does not exist). Set the command-line argument --train-s2ag to False to skip training and use this model directly for evaluation. The generated samples are stored in the automatically created render directory.
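For example, assuming the downloaded checkpoint has been placed under $BASE/models/ted_db:
python main_v2.py --train-s2ag False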
Additionally, we provide the pre-trained weights of the embedding network required to estimate the Fréchet Gesture Distance between the ground-truth and the synthesized gestures. If using these weights, store them in the directory $BASE/outputs.
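Putting these pieces together, the expected layout looks roughly like this (only the paths mentioned above are shown; the file names inside them depend on the downloads):
$BASE/
├── main_v2.py
├── models/ted_db/   <- pretrained s2ag model
├── outputs/         <- embedding-network weights for the Fréchet Gesture Distance
└── render/          <- generated samples, created automatically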