
CA-MSER

Code for Speech Emotion Recognition with Co-Attention based Multi-level Acoustic Information (ICASSP 2022)
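For orientation, below is a minimal, hypothetical sketch of the co-attention idea: several levels of acoustic features (e.g. MFCC, spectrogram, and wav2vec 2.0 embeddings) attend to one another before being pooled into a single utterance representation. All class names and dimensions here are illustrative assumptions, not the paper's implementation; see the models/ directory for the real code.

```python
# Illustrative sketch only -- NOT this repository's implementation.
# Assumes three (batch, time, dim) feature streams; see models/ for the
# actual CA-MSER architecture.
import torch
import torch.nn as nn

class CoAttentionFusionSketch(nn.Module):
    def __init__(self, dim=256, n_heads=4):
        super().__init__()
        # Cross-attention shared across streams: each stream queries the
        # concatenation of the other two.
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.proj = nn.Linear(3 * dim, dim)

    def forward(self, mfcc, spec, w2v):
        streams = [mfcc, spec, w2v]
        fused = []
        for i, q in enumerate(streams):
            # keys/values: the other two streams, concatenated along time
            kv = torch.cat([s for j, s in enumerate(streams) if j != i], dim=1)
            out, _ = self.attn(q, kv, kv)   # one stream attends to the others
            fused.append(out.mean(dim=1))   # average-pool over time
        return self.proj(torch.cat(fused, dim=-1))  # (batch, dim) utterance vector
```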

NEW Update

The data-processing code is now available online; it can be downloaded and used as a reference.
If you find our paper and code useful for your research, please give us a star or cite our original paper. This motivates us to continue sharing our code.

1. File system

- models/
  - transformers_encoder/
  - related Python files
- results/
  - t-SNE/
- extracted_features.pkl
- crossval_SER.py
- train_ser.py
- data_utils.py
- requirements.txt

2. Environment

Dependencies are listed in requirements.txt (installed in step 3 below).

3. How to use

  1. Download the pretrained wav2vec 2.0 model from https://huggingface.co/facebook/wav2vec2-base-960h (a loading sketch follows this list).

  2. Download the processed data (the file is large; it may be removed from Google Drive later): Google Drive; Baidu YunPan.

  3. Install related libraries: pip install -r requirements.txt

  4. Run: python crossval_SER.py
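As a reference for steps 1 and 2, the snippet below shows one plausible way to fetch the pretrained checkpoint with the transformers library and to inspect the downloaded features file. The pickle's internal layout is an assumption; data_utils.py is the authoritative loader.

```python
# Sketch for steps 1-2. The structure of extracted_features.pkl is an
# assumption; see data_utils.py for how the code actually reads it.
import pickle
from transformers import Wav2Vec2Model, Wav2Vec2Processor

# Step 1: download facebook/wav2vec2-base-960h (cached after the first run)
processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base-960h")

# Step 2: load the processed data and inspect its structure before training
with open("extracted_features.pkl", "rb") as f:
    features = pickle.load(f)
print(type(features))
```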

Citation

If you use our code or find CA-MSER useful in your research, please consider citing:

@inproceedings{zou2022speech,
    title={Speech Emotion Recognition with Co-Attention Based Multi-Level Acoustic Information},
    author={Zou, Heqing and Si, Yuke and Chen, Chen and Rajan, Deepu and Chng, Eng Siong},
    booktitle={ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
    pages={7367--7371},
    year={2022},
    organization={IEEE}
}