RayanWang / Speech_emotion_recognition_BLSTM

Bidirectional LSTM network for speech emotion recognition.
MIT License

How do you deal with variable-length audio? #14

Closed by CZFuChason 5 years ago

CZFuChason commented 5 years ago

Hi RayanWang,

I have been going through your code over the past few days. Thank you so much for sharing it; it is really nice work.

I still have one question: which part of your code deals with the length of the audio data? I also work on the Berlin dataset, where the audio clips differ in length. I used padding, but my results were not as good as yours.

I am looking forward to getting your reply.

Chason

RayanWang commented 5 years ago

Hi CZFuChason,

I use sequence.pad_sequences to pad every sample with a special value so that all samples have the same length. Then, when building the network, I add a Masking layer so that this value is ignored during training.
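
The repository's own code is not quoted in this thread, so the following is only a minimal NumPy sketch of the idea, not the author's implementation. It pads variable-length feature sequences with a sentinel value and derives a per-timestep mask, which is roughly what Keras's sequence.pad_sequences (via its value argument) and Masking layer (via mask_value) do together. The names PAD_VALUE, pad_features, timestep_mask and masked_mean are hypothetical, the sentinel -100.0 is an assumption, and padding is applied at the end of each sequence here, whereas pad_sequences pads at the front by default.

```python
import numpy as np

# Hypothetical sentinel: pick a value that never fills an entire real frame.
PAD_VALUE = -100.0


def pad_features(sequences, pad_value=PAD_VALUE):
    """Pad a list of (timesteps, n_features) arrays to a common length."""
    max_len = max(len(seq) for seq in sequences)
    n_features = sequences[0].shape[1]
    padded = np.full((len(sequences), max_len, n_features), pad_value,
                     dtype=np.float32)
    for i, seq in enumerate(sequences):
        padded[i, :len(seq)] = seq  # real frames first, padding after
    return padded


def timestep_mask(padded, pad_value=PAD_VALUE):
    """True for real timesteps, False where every feature equals pad_value."""
    return np.any(padded != pad_value, axis=-1)


def masked_mean(padded, mask):
    """Average over real timesteps only, so padding cannot skew the result."""
    weights = mask[..., None].astype(np.float32)
    return (padded * weights).sum(axis=1) / weights.sum(axis=1)


# Two toy "utterances" with 3 and 5 frames of 2 features each.
utterances = [np.ones((3, 2), dtype=np.float32),
              2 * np.ones((5, 2), dtype=np.float32)]
batch = pad_features(utterances)
mask = timestep_mask(batch)
print(batch.shape)
print(mask.sum(axis=1))
print(masked_mean(batch, mask))
```

Padding without a mask lets the filler frames flow through the recurrent layers like real data, which may explain weaker results when padding is used on its own. With a mask, the padded timesteps are skipped, so only the real frames contribute.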