This repository contains the official code for developing an online emotion recognition classifier based on audio-visual modalities and deep reinforcement learning techniques, introduced here.
It is combined with the corresponding repositories for preprocessing unimodal and multi-modal emotional datasets, such as AffectNet, IEMOCAP, RML, and BAUM-1, to reproduce the paper's results.
Preprocessing code for AffectNet, IEMOCAP, and RML is provided by the authors here, here, and here, respectively.
If you find this repository useful in your research, please consider citing:
@article{kansizoglou2019active,
  title={An Active Learning Paradigm for Online Audio-Visual Emotion Recognition},
  author={Kansizoglou, Ioannis and Bampis, Loukas and Gasteratos, Antonios},
  journal={IEEE Transactions on Affective Computing},
  year={2019}
}
The VGGish weights, converted to PyTorch from this repository, are included in the ./data/weights/ path under the name pytorch_vggish.pth.
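As a quick sanity check, the converted checkpoint can be inspected with PyTorch. The snippet below is only a minimal sketch and assumes the .pth file stores a plain state_dict (a mapping from parameter names to tensors); adapt it if the checkpoint is packaged differently.

```python
import torch

# Minimal sanity check for the converted VGGish weights.
# Assumption: the file stores a plain state_dict; adjust if it is packaged differently.
state_dict = torch.load("./data/weights/pytorch_vggish.pth", map_location="cpu")
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))
```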
The provided code has been tested with Python 3.7.4 and PyTorch 1.4.0.
TO BE UPDATED
The params.json file sets the training hyper-parameters, the exploited modality from the set {"audio", "visual", "fusion"}, and the name of the speaker that is excluded from the training dataset for evaluation. Note that the Leave-One-Speaker-Out and Leave-One-Speakers-Group-Out schemes are adopted.
The models are trained through two .csv files that contain the paths of the training and evaluation samples, respectively. These files shall be named training_data.csv and evaluation_data.csv and stored inside ./data/speaker_folder, where speaker_folder shall be given to the "speaker" variable in the params.json file.
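A possible params.json could look like the illustration below. Only the "speaker" variable and the modality choice are described above; the remaining keys, and the exact key names themselves, are hypothetical placeholders for the training hyper-parameters and may differ from the ones actually read by main.py.

```json
{
    "speaker": "speaker_folder",
    "modality": "fusion",
    "learning_rate": 0.0001,
    "batch_size": 32,
    "epochs": 100
}
```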
Run python3 main.py train, or simply python3 main.py, to train the model.
In order to test the model on the validation data, run python3 main.py test.