HPI-DeepLearning / crnn-lid

Code for the paper Language Identification Using Deep Convolutional Recurrent Neural Networks
GNU General Public License v3.0

Training convergence #8

Open lvaleriu opened 5 years ago

lvaleriu commented 5 years ago

I'm trying to train the CRNN network with the latest Keras on 2 languages (English and French), using a "youtube spoken" dataset. However, the validation accuracy (and not only that metric) seems to get stuck at around 0.5.

Could you give me some advice on this? In fact, I'd like to share some trained models, built with the latest Keras version, for the different models you've implemented.

Thanks again!

` bidirectional_1 (Bidirection (None, 512) 1574912

dense_1 (Dense) (None, 2) 1026

Total params: 8,444,418
Trainable params: 8,439,938
Non-trainable params: 4,480

None
WARNING:tensorflow:Variable *= will be deprecated. Use variable.assign_mul if you want assignment to the variable value or 'x = x * y' if you want a new python Tensor object.
Epoch 1/50
16384/16384 [==============================] - 6176s 377ms/step - loss: 0.1435 - acc: 0.9892 - recall: 0.9999 - precision: 0.5000 - val_loss: 1.6510 - val_acc: 0.6196 - val_recall: 1.0000 - val_precision: 0.5000

Epoch 00001: val_acc improved from -inf to 0.61963, saving model to logs/2018-10-12-02-34-27/weights.01.model
Epoch 2/50
16384/16384 [==============================] - 6156s 376ms/step - loss: 0.0580 - acc: 0.9955 - recall: 1.0000 - precision: 0.5000 - val_loss: 4.5258 - val_acc: 0.5111 - val_recall: 1.0000 - val_precision: 0.5000

Epoch 00002: val_acc did not improve from 0.61963
Epoch 3/50
16384/16384 [==============================] - 6152s 375ms/step - loss: 0.0390 - acc: 0.9964 - recall: 1.0000 - precision: 0.5000 - val_loss: 4.0108 - val_acc: 0.5033 - val_recall: 1.0000 - val_precision: 0.5000 `

Bartzi commented 5 years ago

Hmm, looks like overfitting to the training data... Did you try the models on some samples and look at what they predict?
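One way to do that check is to count what the model predicts per class rather than looking at a single aggregate number. The following is only a minimal sketch in plain Python: the helper names `confusion_matrix` and `per_class_precision_recall` are made up for illustration and are not part of this repository, and the labels are toy values (0 = English, 1 = French), not data from the run above.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, num_classes):
    """Count predictions; rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix


def per_class_precision_recall(matrix):
    """Return one (precision, recall) tuple per class."""
    num_classes = len(matrix)
    stats = []
    for c in range(num_classes):
        true_positives = matrix[c][c]
        predicted_as_c = sum(matrix[row][c] for row in range(num_classes))
        actually_c = sum(matrix[c])
        precision = true_positives / predicted_as_c if predicted_as_c else 0.0
        recall = true_positives / actually_c if actually_c else 0.0
        stats.append((precision, recall))
    return stats


# Hypothetical toy labels, NOT taken from the issue: 0 = English, 1 = French.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
# A model that has collapsed to always predicting class 0.
y_pred = [0, 0, 0, 0, 0, 0, 0, 0]

matrix = confusion_matrix(y_true, y_pred, num_classes=2)
stats = per_class_precision_recall(matrix)
prediction_counts = Counter(y_pred)

print(matrix)             # how often each true class was predicted as each class
print(stats)              # (precision, recall) for class 0 and class 1
print(prediction_counts)  # how many samples were assigned to each class
```

If nearly all validation samples land in one column of the matrix, the model is predicting a single language regardless of the input. Whether that is what happens in the run above is not established by this sketch.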

lvaleriu commented 5 years ago

I've tested it with some audio extracted from YouTube videos, and it works pretty well for English.

Bartzi commented 5 years ago

Does it work on French at all?

lvaleriu commented 5 years ago

Yes, on a few random audio waves it seems coherent.

lvaleriu commented 5 years ago

I'll try another training run soon and get back to you...

Bartzi commented 5 years ago

It definitely looks like overfitting on your dataset. I don't know why precision always shows 0.5 but if you look at the accuracy, you can see that the train accuracy improves, while the validation accuracy gets worse. Is your dataset large enough? Which model are you using?
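To make the train/validation comparison concrete, here is a small sketch that uses only the `acc` and `val_acc` values from the three epochs in the log above. The function names are hypothetical helpers written for this illustration, and the "train never drops, validation never rises" rule is just a rough heuristic, not a formal test for overfitting.

```python
# Accuracy values copied from the three epochs in the training log above.
train_acc = [0.9892, 0.9955, 0.9964]
val_acc = [0.6196, 0.5111, 0.5033]


def generalization_gaps(train, val):
    """Per-epoch difference between training and validation accuracy."""
    return [t - v for t, v in zip(train, val)]


def looks_like_overfitting(train, val):
    """Rough heuristic: training accuracy never drops, validation accuracy
    never rises, and the final training accuracy exceeds the validation one."""
    train_not_dropping = all(b >= a for a, b in zip(train, train[1:]))
    val_not_rising = all(b <= a for a, b in zip(val, val[1:]))
    return train_not_dropping and val_not_rising and train[-1] > val[-1]


gaps = generalization_gaps(train_acc, val_acc)
for epoch, gap in enumerate(gaps, start=1):
    print(f"epoch {epoch}: train - val accuracy gap = {gap:.4f}")

print("overfitting pattern:", looks_like_overfitting(train_acc, val_acc))
```

The gap widens every epoch, and with two languages a validation accuracy near 0.5 is about what guessing would give if the validation set is balanced (an assumption, since the class balance is not stated here).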

lvaleriu commented 5 years ago

I've tried topcoder_crnn.py, topcoder_small.py, topcoder_deeper.py, cnn.py & crnn.py. I have a dataset with 2 languages (English & French).

I've managed to obtain a 0.94 validation accuracy on this same dataset using the https://github.com/stlong0521/language-detector project, but I'd prefer to use your Keras project (and Keras 2). I'm pretty sure I'm doing something stupid somewhere.

Is there a way we could chat on a Slack channel? (like https://kerasteam.slack.com)

lvaleriu commented 5 years ago

I can give you private access to a Jupyter notebook server if you'd like to see the setup.