MixedEmotions / up_emotions_audio

This module aims to extract emotions from audio. The input argument is either an audio/video file uploaded to the server or a URL. The output is the predicted emotion in terms of Arousal and Valence, in JSON-LD format.
GNU General Public License v3.0

Wrong valence and arousal #4

Open bmond opened 6 years ago

bmond commented 6 years ago

I have tested with different kinds of files where the emotion is angry, happy, sad, or calm. I expected valence and arousal values such that the 1st quadrant means excited/happy, the 2nd quadrant angry, the 3rd quadrant sad, and the 4th quadrant calm. However, for every expression I get the same quadrant, the 1st. Am I doing something wrong? Do I need to train the model on some data first for a better result, or do I need to do noise filtering beforehand?

The arousal and valence results I am getting:

- Angry1: arousal 0.158374, valence 0.530165
- Angry2: arousal 0.155097, valence 0.163162
- Calm: arousal 0.228325, valence 0.144437
- Happy1: arousal 0.175005, valence 0.350349
- Happy2: arousal 0.248143, valence 0.200598
- Sad1: arousal 0.144986, valence 0.358285
- Sad2: arousal 0.276521, valence 0.342779
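The quadrant reading described above can be sketched as a small helper. This is a hypothetical illustration, not part of the module: it assumes valence on the x-axis, arousal on the y-axis, and axes centered at 0, which matches the observation that all-positive outputs land in the 1st quadrant.

```python
def quadrant(arousal, valence, neutral=0.0):
    """Map an (arousal, valence) prediction to the quadrant labels
    described above. The neutral point (origin) is an assumption."""
    high_arousal = arousal >= neutral
    positive_valence = valence >= neutral
    if positive_valence and high_arousal:
        return "1st (excited/happy)"
    if not positive_valence and high_arousal:
        return "2nd (angry)"
    if not positive_valence and not high_arousal:
        return "3rd (sad)"
    return "4th (calm)"

# With axes centered at 0, every result listed above is positive on
# both dimensions, so every file maps to the 1st quadrant.
results = {
    "Angry1": (0.158374, 0.530165),
    "Calm":   (0.228325, 0.144437),
    "Sad1":   (0.144986, 0.358285),
}
for name, (a, v) in results.items():
    print(name, "->", quadrant(a, v))
```

This makes the reported symptom concrete: either the predictions themselves are skewed, or the origin of the arousal/valence space differs from what the quadrant interpretation assumes.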

audioAnalytics.zip

hesamsagha commented 6 years ago

Hi bmond, the problem is a mismatched data distribution. The model was trained on the RECOLA database (https://diuf.unifr.ch/diva/recola/). For more detail, see the paper: (https://www.isca-speech.org/archive/Interspeech_2016/pdfs/1124.PDF). You may retrain a new classifier for your problem; see the guidance here: (https://github.com/openXBOW/openXBOW). I hope it helps.