codekansas / keras-language-modeling

:book: Some language modeling tools for Keras
https://codekansas.github.io/language
MIT License

Evaluation Result Correct? #10

Open wailoktam opened 8 years ago

wailoktam commented 8 years ago

To save time, I set nb_epoch to 2, but the program only runs 1 epoch. I chose that epoch's model and evaluated it against the test sets. The top-1 precision figures seem to be about 1/10 of what the paper claims. Or do I misunderstand something?

Epoch 1/1
14832/14832 [==============================] - 236s - loss: 0.0297 - val_loss: 0.0154
Best: Loss = 0.0154112447405, Epoch = 1
2016-06-14 08:22:54 :: ----- test1 -----
[====================] Top-1 Precision: 0.049444  MRR: 0.131885
2016-06-14 08:46:11 :: ----- test2 -----
[====================] Top-1 Precision: 0.040000  MRR: 0.124294
2016-06-14 09:09:09 :: ----- dev -----
[====================] Top-1 Precision: 0.053000  MRR: 0.128266

eshijia commented 8 years ago

The reason is this line of code: for i in range(1, nb_epoch). Here i only reaches nb_epoch - 1, so with nb_epoch = 2 just one epoch runs. You can change the code to for i in range(1, nb_epoch + 1). Of course, one epoch is not enough.
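The off-by-one above is easy to verify in isolation: Python's range(start, stop) excludes stop, so range(1, nb_epoch) yields nb_epoch - 1 iterations. A minimal sketch (the variable name nb_epoch follows the snippet above; this is not the repo's actual training loop):

```python
nb_epoch = 2

# Original loop bound: range(1, nb_epoch) yields [1] -> only one epoch runs.
original_epochs = list(range(1, nb_epoch))

# Suggested fix: range(1, nb_epoch + 1) yields [1, 2] -> both epochs run.
fixed_epochs = list(range(1, nb_epoch + 1))

print(original_epochs)  # [1]
print(fixed_epochs)     # [1, 2]
```

This explains why setting nb_epoch to 2 produced "Epoch 1/1" in the log: the loop body executed exactly once.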

wailoktam commented 8 years ago

Hi, thanks for your response. I have tried running 100 epochs, and the top-1 precision looks close to what is posted here. However, it is still quite different from the figures given in the papers, which report 60+% top-1 precision for dense+cnn+1-max and attention-lstm. Can we reasonably doubt the results reported in those papers, provided we have done nothing wrong in the implementation here?
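For anyone comparing numbers: the Top-1 Precision and MRR figures in the log are standard ranking metrics. A minimal sketch of how they are typically computed, assuming for each question we know the 1-based rank of the first correct answer among the scored candidates (the function name and input format here are hypothetical, not this repo's actual API):

```python
def top1_precision_and_mrr(ranks):
    """ranks: one 1-based rank per question, giving where the first
    correct answer landed after sorting candidates by model score."""
    n = len(ranks)
    # Top-1 precision: fraction of questions whose best-scored candidate is correct.
    top1 = sum(1 for r in ranks if r == 1) / n
    # MRR: mean of 1/rank over all questions.
    mrr = sum(1.0 / r for r in ranks) / n
    return top1, mrr

# Example: correct answer ranked 1st, 3rd, and 2nd for three questions.
top1, mrr = top1_precision_and_mrr([1, 3, 2])
print(top1, mrr)  # 0.333... and (1 + 1/3 + 1/2) / 3
```

Note that MRR is always at least as large as top-1 precision, which matches the log above (e.g. 0.049 top-1 vs. 0.132 MRR on test1).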