MiteshPuthran / Speech-Emotion-Analyzer

The neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)
MIT License

Predicting longer audio files #41

Closed. AmeeraMilibari closed this issue 4 years ago

AmeeraMilibari commented 4 years ago

Hello, how can I run predictions on audio files longer than 2.5 seconds? I don't want to predict on only part of the audio.

MiteshPuthran commented 4 years ago

You can predict on longer files as well. The only difference is that a longer file produces more features.
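
For a rough sense of how the feature count grows with clip length, here is a minimal sketch. It assumes one feature per analysis frame, a 44.1 kHz sample rate, a hop length of 512 samples, and centered framing; none of these values are stated in this thread, and `mfcc_frame_count` is a hypothetical helper, not part of the repository.

```python
def mfcc_frame_count(duration_s, sample_rate=44100, hop_length=512):
    """Estimate the number of analysis frames for a clip of a given length.

    Assumes centered framing, where the frame count is
    1 + n_samples // hop_length. The sample rate and hop length
    defaults are assumptions, not values confirmed by the thread.
    """
    n_samples = int(duration_s * sample_rate)
    return 1 + n_samples // hop_length


# Frame counts for clips of increasing length.
for seconds in (2.5, 5.0, 10.0):
    print(seconds, mfcc_frame_count(seconds))
```

Under these assumptions a 2.5-second clip yields 216 frames and a 5-second clip yields 431, so a network whose input layer was sized for the shorter clip cannot accept the longer one unchanged.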

AmeeraMilibari commented 4 years ago

Thank you @MITESHPUTHRANNEU. I asked before noticing that `librosa.load` takes `duration` and `offset` arguments. As I understand it, loading starts `offset` seconds into the file and reads only `duration` seconds. When I changed these, I got a shape error from the Keras model.

So we don't need to change anything except the input file, right? I am following your code in final_result_gender_test.ipynb.

AmeeraMilibari commented 4 years ago

I tried calling `librosa.get_duration` after loading, and I get 2.5 seconds for every WAV file I tried, even those longer than 2.5 seconds.

MiteshPuthran commented 4 years ago

Yes, you would hit that hurdle because the model is trained to take 2.5 seconds of input. Even if you supply a longer file, it gets clipped to 2.5 seconds, which gives around 216 features, I believe. If you want a model that works on longer files, you would have to train your own model with the required time frame.
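
One possible workaround that avoids retraining is to cut a long file into consecutive 2.5-second windows and predict on each window separately. The sketch below only computes the window boundaries; `window_offsets` is a hypothetical helper, and dropping the short trailing remainder is an assumption made here for simplicity, not something the repository does.

```python
def window_offsets(total_s, window_s=2.5, hop_s=2.5):
    """Return (offset, duration) pairs that tile a clip with fixed windows.

    Only full windows are returned; a trailing remainder shorter than
    window_s is dropped so every window matches the model's input length.
    """
    if window_s <= 0 or hop_s <= 0:
        raise ValueError("window_s and hop_s must be positive")
    windows = []
    i = 0
    # Keep adding windows while a full one still fits inside the clip.
    while i * hop_s + window_s <= total_s:
        windows.append((i * hop_s, window_s))
        i += 1
    return windows


# An 8-second clip gives three full windows; the last 0.5 s is dropped.
print(window_offsets(8.0))
```

Each pair could then be passed as the `offset` and `duration` arguments of `librosa.load`, with features extracted and a prediction made per window. How to combine the per-window predictions (for example, a majority vote) is left open here, and results on windows cut from the middle of an utterance may differ from what the model saw in training.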

AmeeraMilibari commented 4 years ago

Thanks, this is clear now