
Repository for code and paper submitted for APSIPA 2019, Lanzhou, China

APSIPA2019_SpeechText

Repository for code, paper, and slides presented at APSIPA 2019:

Speech Emotion Recognition Using Speech Feature and Word Embedding

Pre-processing:
Run the following file with some adjustments (location of the IEMOCAP data, output file name, etc.):
https://github.com/bagustris/Apsipa2019_SpeechText/blob/master/code/python_files/mocap_data_collect.py
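IEMOCAP's emotion labels live in its EmoEvaluation `.txt` files, one summary line per turn, which the collection script above walks through. As an illustration of that label format only, here is a minimal parser sketch; the helper name `parse_emo_line` is hypothetical and not the repository's actual code:

```python
import re

# One summary line per turn in IEMOCAP's EmoEvaluation files looks like:
# [6.2901 - 8.2357]\tSes01F_impro01_F000\tneu\t[2.5000, 2.5000, 2.5000]
LINE_RE = re.compile(
    r"\[(?P<start>\d+\.\d+) - (?P<end>\d+\.\d+)\]\t"
    r"(?P<turn>\S+)\t(?P<emotion>\w+)\t"
    r"\[(?P<v>\d+\.\d+), (?P<a>\d+\.\d+), (?P<d>\d+\.\d+)\]"
)

def parse_emo_line(line):
    """Return (turn_name, emotion, (valence, arousal, dominance)) or None."""
    m = LINE_RE.match(line)
    if m is None:
        return None  # header and comment lines in the file don't match
    vad = tuple(float(m.group(k)) for k in ("v", "a", "d"))
    return m.group("turn"), m.group("emotion"), vad

example = "[6.2901 - 8.2357]\tSes01F_impro01_F000\tneu\t[2.5000, 2.5000, 2.5000]"
print(parse_emo_line(example))
```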

Main code:

Other (Python) files can be explored and run independently.

If the Jupyter notebook is not rendered by GitHub, view it on nbviewer instead:

By combining acoustic features from the voiced parts of speech with word embeddings from the text, we obtained a best accuracy of 75.49%. The accuracies obtained by the different models (Text+Speech) are listed below:

| Model        | Accuracy (%) |
|--------------|--------------|
| Dense+Dense  | 63.86        |
| Conv1D+Dense | 68.82        |
| LSTM+BLSTM   | 69.13        |
| LSTM+Dense   | 75.49        |
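These bimodal models share one pattern: the speech branch and the text branch are encoded separately, then fused before the emotion classifier. The following NumPy forward-pass sketch illustrates that late-fusion idea only; all dimensions and random weights are made up for illustration, and the actual models are trained networks in the repository:

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, w, b):
    """One fully connected layer with ReLU activation."""
    return np.maximum(x @ w + b, 0.0)

# Hypothetical per-utterance inputs (sizes are illustrative, not the paper's).
speech_feat = rng.normal(size=(1, 34))    # acoustic feature vector
text_emb = rng.normal(size=(1, 300))      # pooled word-embedding vector

# Encode each modality in its own branch, then fuse by concatenation.
h_speech = dense(speech_feat, rng.normal(size=(34, 64)), np.zeros(64))
h_text = dense(text_emb, rng.normal(size=(300, 64)), np.zeros(64))
fused = np.concatenate([h_speech, h_text], axis=-1)     # shape (1, 128)

# Softmax classifier over 4 emotion classes (a common IEMOCAP setup).
logits = fused @ rng.normal(size=(128, 4))
exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
probs = exp / exp.sum(axis=-1, keepdims=True)
print(probs.shape)
```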

Sample of feature

Due to GitHub's file size limitation, a sample of the features (voiced features without SIL removal) can be downloaded here: https://cloud.degoo.com/share/Ov563dopNnEW14jN_DeBig. To generate that feature file yourself, run the following script inside the code/python_files directory: https://github.com/bagustris/Apsipa2019_SpeechText/blob/master/code/python_files/save_feature.py
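The downloadable sample keeps silence frames ("without SIL removal"). One simple way to drop silent frames is a frame-energy threshold, sketched below; this is an assumption for illustration, and the repository's scripts may use a different voicing criterion:

```python
import numpy as np

def remove_silent_frames(frames, rel_threshold=0.1):
    """Keep frames whose RMS energy exceeds a fraction of the maximum.

    frames: array of shape (n_frames, frame_len) of raw audio samples.
    """
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    keep = rms > rel_threshold * rms.max()
    return frames[keep]

# Toy signal: two loud frames surrounding a near-silent one.
loud = np.ones((1, 160))
quiet = np.full((1, 160), 1e-4)
frames = np.vstack([loud, quiet, loud])
voiced = remove_silent_frames(frames)
print(voiced.shape)  # the quiet middle frame is dropped
```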

Citation

B.T. Atmaja, M. Akagi, K. Shirai. "Speech Emotion Recognition from Speech Feature and Word Embedding", 
In Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), IEEE, 2019.