x4nth055 / emotion-recognition-using-speech

Building and training a Speech Emotion Recognizer that predicts human emotions using Python, scikit-learn and Keras
MIT License
572 stars 229 forks

Rnn in deep learning usage is problematic in terms of feature space #25

Closed eshalev4 closed 1 year ago

eshalev4 commented 3 years ago

Looking at this blog, this tutorial, the Wikipedia entry, or any other material shows that RNNs excel at extracting time-sequential information from data.

The features you extract are averaged across the time axis, which discards the frame order an RNN would exploit. Wouldn't it be better to feed the network temporal information when using LSTMs or GRUs?

x4nth055 commented 3 years ago

Hello @eshalev4,

You're absolutely right: the extract_feature() function in utils.py uses np.mean() to average the features over time. Pull requests are welcome if you want to contribute!
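
For anyone picking this up, here is a minimal sketch of the difference being discussed. It is not the repository's code: the helper names `mean_pooled` and `as_sequence`, the 40 x 120 feature matrix, and the `max_frames` value are all assumptions for illustration, and random numbers stand in for real frame-level audio features. It shows that averaging over time gives the same vector even when the frames are reversed, while a padded time-major sequence (the kind of input a recurrent layer typically takes per sample) keeps the order.

```python
import numpy as np


def mean_pooled(frames: np.ndarray) -> np.ndarray:
    """Collapse (n_features, n_frames) to (n_features,) by averaging over time."""
    return np.mean(frames, axis=1)


def as_sequence(frames: np.ndarray, max_frames: int) -> np.ndarray:
    """Return a (max_frames, n_features) array: time-major, truncated or zero-padded."""
    seq = frames.T[:max_frames]          # (n_frames, n_features), cut to max_frames
    pad = max_frames - seq.shape[0]      # number of zero frames to append (>= 0)
    return np.pad(seq, ((0, pad), (0, 0)))


# Hypothetical stand-in for frame-level features: 40 coefficients over 120 frames.
rng = np.random.default_rng(0)
frames = rng.normal(size=(40, 120))
reversed_frames = frames[:, ::-1]        # same frames, played backwards

pooled = mean_pooled(frames)
seq = as_sequence(frames, max_frames=200)

print(pooled.shape)   # one vector per clip, no time axis
print(seq.shape)      # one row per frame, padded to a fixed length

# Averaging cannot tell the clip from its reversal; the sequence can.
print(np.allclose(pooled, mean_pooled(reversed_frames)))
print(np.array_equal(seq, as_sequence(reversed_frames, max_frames=200)))
```

Zero-padding to a fixed length is only one option. A masking layer or length-bucketed batches would also work, and which is best depends on how the model is built.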