amansrivastava17 / lstm-siamese-text-similarity

⚛️ A Keras-based implementation of a Siamese architecture with LSTM encoders for computing text similarity
MIT License

Why is the accuracy of the training set lower than the validation set? #10

Closed Kiteflyingee closed 5 years ago

Kiteflyingee commented 5 years ago

Hi, I used sample_data.csv to train the model, but I get lower accuracy on the training data than on the validation set. I am confused.

Epoch 1/200
450/450 [==============================] - 5s 11ms/step - loss: 0.8987 - acc: 0.4933 - val_loss: 0.7807 - val_acc: 0.4286
Epoch 2/200
450/450 [==============================] - 1s 2ms/step - loss: 0.7921 - acc: 0.5356 - val_loss: 0.6995 - val_acc: 0.5306
Epoch 3/200
450/450 [==============================] - 1s 2ms/step - loss: 0.7451 - acc: 0.5644 - val_loss: 0.6261 - val_acc: 0.5918
Epoch 4/200
450/450 [==============================] - 1s 2ms/step - loss: 0.6697 - acc: 0.6178 - val_loss: 0.5605 - val_acc: 0.7143
Epoch 5/200

amansrivastava17 commented 5 years ago

@Kiteflyingee Since the sample dataset is so small, this can happen quite often. I suggest you train on the full Quora question-pairs dataset from Kaggle to get higher accuracy.

https://www.kaggle.com/quora/question-pairs-dataset
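Aside from dataset size, there is a reporting reason this pattern is common in Keras: the `acc` printed each epoch is a running average over all batches of that epoch, computed while the weights are still improving, whereas `val_acc` is measured once with the end-of-epoch weights (and with dropout disabled). A minimal numeric sketch, with no Keras dependency and with made-up per-batch numbers, illustrating the averaging effect:

```python
# Illustrative only: simulate per-batch training accuracy that improves
# over an epoch, as happens while the weights are being updated.
# The specific numbers are invented for the demonstration.
batch_acc = [0.40 + 0.02 * i for i in range(10)]  # 0.40, 0.42, ..., 0.58

# Keras reports the running mean over all batches in the epoch...
train_acc = sum(batch_acc) / len(batch_acc)

# ...while validation accuracy is computed once, using only the
# end-of-epoch (best-so-far) weights.
val_acc = batch_acc[-1]

print(f"reported train acc: {train_acc:.2f}")  # 0.49
print(f"val acc:            {val_acc:.2f}")    # 0.58
```

So early in training, when the model improves a lot within a single epoch, `val_acc` above `acc` is expected; on a larger dataset the gap usually shrinks as learning stabilizes.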

Kiteflyingee commented 5 years ago

@amansrivastava17 Thanks, I'm trying to extend my dataset!