xuanjihe / speech-emotion-recognition

Speech emotion recognition using convolutional recurrent networks, based on IEMOCAP

Can you help me implement it in Keras? #8

Closed cpankajr closed 6 years ago

cpankajr commented 6 years ago

Hey, I tried to implement your paper in Keras on the EMODB database, using 300 time steps and 5 Conv2D layers followed by one BLSTM and an attention layer. The training set has shape (339,300,40,30), but I am not getting the same accuracy as you: my training accuracy is only 20%. I don't know where I am going wrong. Could you please look at the code and let me know what I am doing wrong?

# Keras 1.x API, as in the original post. AttentionLayer is a custom layer that is not shown here.
from keras.layers import (Input, Convolution2D, MaxPooling2D, BatchNormalization, Dropout,
                          TimeDistributed, Flatten, Dense, LSTM, Bidirectional)
from keras.models import Model

inputs = Input(shape=(300, 40, 3))  # 300 time steps per utterance
CNN1 = Convolution2D(128, 5, 3, activation='relu', border_mode='same', name='conv1',
                     subsample=(1, 1))(inputs)
# Pool only along the second axis so that all 300 time steps are preserved.
MAX_POOL = MaxPooling2D(pool_size=(1, 4), border_mode='valid', name='pool1')(CNN1)
BN1 = BatchNormalization()(MAX_POOL)
CNN2 = Convolution2D(256, 5, 3, activation='relu', border_mode='same', name='conv3a',
                     subsample=(1, 1))(BN1)
MAX_POOL2 = MaxPooling2D(pool_size=(1, 2), border_mode='valid', name='pool2')(CNN2)
BN2 = BatchNormalization()(MAX_POOL2)
CNN3 = Convolution2D(256, 5, 3, activation='relu', border_mode='same', name='conv3b',
                     subsample=(1, 1))(BN2)
DROP1 = Dropout(.5)(CNN3)
BN3 = BatchNormalization()(DROP1)

CNN4 = Convolution2D(256, 5, 3, activation='relu', border_mode='same', name='conv3c',
                     subsample=(1, 1))(BN3)
BN4 = BatchNormalization()(CNN4)
CNN5 = Convolution2D(256, 5, 3, activation='relu', border_mode='same', name='conv3d',
                     subsample=(1, 1))(BN4)
BN5 = BatchNormalization()(CNN5)
DROP2 = Dropout(.5)(BN5)
# Flatten each time step separately so the sequence axis survives for the recurrent layer.
TD = TimeDistributed(Flatten(), name="Flatten")(DROP2)

DENSE1 = Dense(768, activation='linear', name='fc6')(TD)
BLSTM = Bidirectional(LSTM(128, return_sequences=True, unit_forget_bias=True))(DENSE1)
ATT = AttentionLayer(name='attention')(BLSTM)
DROP3 = Dropout(0.5)(ATT)
DENSE2 = Dense(64, activation='relu', name='fc7')(DROP3)
BN6 = BatchNormalization()(DENSE2)  # renamed; the original reused the name BN3 here
DROP4 = Dropout(0.5)(BN6)
DENSE3 = Dense(7, activation='softmax')(DROP4)  # 7 emotion classes
model = Model(input=inputs, output=DENSE3)
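
As a sanity check on the layer stack above, the tensor shapes can be traced by hand. The sketch below is plain Python, not Keras. It assumes channels-last ordering (the input read as 300 x 40 with 3 channels), pooling strides equal to the pool size, a concatenating Bidirectional wrapper, and an attention layer that collapses the time axis into one vector. The helper names `conv2d_same`, `max_pool_valid`, and `trace_shapes` are made up for illustration.

```python
def conv2d_same(shape, filters):
    """'same' padding with stride 1 keeps height and width; only channels change."""
    height, width, _ = shape
    return (height, width, filters)


def max_pool_valid(shape, pool):
    """'valid' pooling with stride equal to the pool size floors each dimension."""
    height, width, channels = shape
    return (height // pool[0], width // pool[1], channels)


def trace_shapes(input_shape=(300, 40, 3)):
    """Return (layer name, output shape) pairs; batch norm and dropout keep shapes."""
    shapes = [("input", input_shape)]
    x = conv2d_same(input_shape, 128)
    shapes.append(("conv1", x))
    x = max_pool_valid(x, (1, 4))
    shapes.append(("pool1", x))
    x = conv2d_same(x, 256)
    shapes.append(("conv3a", x))
    x = max_pool_valid(x, (1, 2))
    shapes.append(("pool2", x))
    for name in ("conv3b", "conv3c", "conv3d"):
        x = conv2d_same(x, 256)
        shapes.append((name, x))
    # TimeDistributed(Flatten) merges the last two axes for every time step.
    x = (x[0], x[1] * x[2])
    shapes.append(("flatten", x))
    x = (x[0], 768)
    shapes.append(("fc6", x))
    # Forward and backward LSTM outputs (128 each) are concatenated.
    x = (x[0], 2 * 128)
    shapes.append(("blstm", x))
    # Assumption: the custom attention layer pools over time into one vector.
    x = (x[1],)
    shapes.append(("attention", x))
    shapes.append(("fc7", (64,)))
    shapes.append(("output", (7,)))
    return shapes


for layer_name, layer_shape in trace_shapes():
    print(layer_name, layer_shape)
```

If the backend were instead configured for channels-first ordering, the same `Input(shape=(300, 40, 3))` would be read as 300 channels, so the ordering setting is worth checking before comparing accuracies.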

I trained the model using the Nadam optimizer, which is Adam with Nesterov momentum:

from keras.optimizers import Nadam

nadam = Nadam(lr=1e-04, beta_1=0.9, beta_2=0.999, epsilon=1e-08, schedule_decay=0.004)
# Pass the configured instance; the string 'nadam' would build a Nadam with default settings instead.
model.compile(optimizer=nadam, loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, callbacks=[tensorboard], batch_size=15, nb_epoch=10, shuffle=True,
          validation_data=(X_val, y_val))
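
For readers unfamiliar with Nadam, here is a minimal sketch of the commonly stated simplified update rule for a single scalar parameter. It is not the Keras implementation, which additionally applies a momentum schedule controlled by `schedule_decay`. The function name `nadam_step` and the toy objective f(x) = x^2 are assumptions made for illustration.

```python
import math


def nadam_step(theta, grad, m, v, t, lr=1e-4, beta_1=0.9, beta_2=0.999, eps=1e-8):
    """One simplified Nadam update; t is the 1-based step count."""
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta_1 * m + (1.0 - beta_1) * grad
    v = beta_2 * v + (1.0 - beta_2) * grad * grad
    # Bias correction, as in Adam.
    m_hat = m / (1.0 - beta_1 ** t)
    v_hat = v / (1.0 - beta_2 ** t)
    # Nesterov look-ahead: blend the corrected momentum with the current gradient.
    nesterov_m = beta_1 * m_hat + (1.0 - beta_1) * grad / (1.0 - beta_1 ** t)
    theta = theta - lr * nesterov_m / (math.sqrt(v_hat) + eps)
    return theta, m, v


# Toy usage: minimise f(x) = x**2, whose gradient is 2 * x.
theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 201):
    theta, m, v = nadam_step(theta, 2.0 * theta, m, v, t, lr=0.1)
print(theta)
```

The only difference from plain Adam in this form is the `nesterov_m` line; replacing it with `m_hat` gives the Adam update.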

xuanjihe commented 6 years ago

I'm sorry, I have never used Keras before. I suggest you run it on IEMOCAP to check the code.

cpankajr commented 6 years ago

OK, will do that. Thank you.

dyf102 commented 5 years ago

@cpankajr Did you solve this issue?