Repository for the code, paper, and slides presented at APSIPA 2019:
Speech Emotion Recognition Using Speech Feature and Word Embedding
Pre-processing:
Run the following file after adjusting it to your setup (location of the IEMOCAP data, output file name, etc.):
https://github.com/bagustris/Apsipa2019_SpeechText/blob/master/code/python_files/mocap_data_collect.py
Main codes:
The other Python files can be explored and run independently.
If GitHub does not render the Jupyter notebook, view it on nbviewer instead:
By combining acoustic features from the voiced parts of speech with word embeddings from the text, we obtained an accuracy of 75.49%. The table below lists the accuracies obtained by the different models (text + speech):
------------------------------
Model        | Accuracy (%)
------------------------------
Dense+Dense  | 63.86
Conv1D+Dense | 68.82
LSTM+BLSTM   | 69.13
LSTM+Dense   | 75.49
------------------------------
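As a rough sketch of how the two modalities can be fused, each modality can be summarized into a fixed-length vector and the vectors concatenated before classification. The shapes and the simple mean/std pooling below are illustrative assumptions, not the exact configuration used in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs: a 100-frame acoustic sequence with 34 features per
# frame, and a 50-token transcript with 300-dim word embeddings.
speech = rng.normal(size=(100, 34))
text = rng.normal(size=(50, 300))

# Summarize each modality, then concatenate into one fused vector.
speech_vec = np.concatenate([speech.mean(axis=0), speech.std(axis=0)])  # (68,)
text_vec = text.mean(axis=0)                                            # (300,)
fused = np.concatenate([speech_vec, text_vec])                          # (368,)
print(fused.shape)  # -> (368,)
```

In the paper's models this kind of fusion is learned by network layers (e.g. LSTM over speech, Dense over text) rather than by fixed pooling; the sketch only shows the concatenation idea.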
Due to GitHub's file-size limitation, a sample feature file (voiced features without silence (SIL) removal) can be downloaded here:
https://cloud.degoo.com/share/Ov563dopNnEW14jN_DeBig.
You can use the following script inside the code/python_files
directory to generate that feature file: https://github.com/bagustris/Apsipa2019_SpeechText/blob/master/code/python_files/save_feature.py
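For illustration only, a naive energy-based selection of voiced frames from frame-level features might look like the sketch below. The threshold and array shapes are made-up assumptions; save_feature.py defines the repository's actual extraction:

```python
import numpy as np

def drop_low_energy(frames, energy_ratio=0.05):
    """Keep frames whose energy exceeds a fraction of the maximum
    frame energy. Illustrative only, not the repository's method."""
    energy = (frames ** 2).sum(axis=1)
    return frames[energy > energy_ratio * energy.max()]

rng = np.random.default_rng(1)
loud = rng.normal(0, 1.0, size=(80, 34))    # high-energy (voiced-like) frames
quiet = rng.normal(0, 0.01, size=(20, 34))  # near-silent frames
frames = np.vstack([loud, quiet])

voiced = drop_low_energy(frames)
print(frames.shape, voiced.shape)
```

The near-silent frames fall below the energy threshold and are discarded, so the returned array is shorter than the input.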
B.T. Atmaja, M. Akagi, K. Shirai. "Speech Emotion Recognition Using Speech Feature and Word Embedding",
in Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), IEEE, 2019.