Kyubyong / transformer

A TensorFlow Implementation of the Transformer: Attention Is All You Need
Apache License 2.0

training data was used to eval #157

Closed hihell closed 4 years ago

hihell commented 4 years ago

In train.py, line 77, the call _, _eval_summaries = sess.run([eval_init_op, eval_summaries]) computes eval_summaries on training data. Initialize the eval dataset in its own call first, then fetch the summaries:

sess.run(eval_init_op)
_eval_summaries = sess.run(eval_summaries)

bozhenhhu commented 4 years ago

Thank you. For the same epoch and batch size, when I use _, _eval_summaries = sess.run([eval_init_op, eval_summaries]), I get eval/1/ BLEU = 41.36, 69.4/47.9/34.7/25.4 (BP=1.000, ratio=1.028). However, when I use

sess.run(eval_init_op)
_eval_summaries = sess.run(eval_summaries)

the eval BLEU = 0.52, 23.0/1.6/0.3/0.1 (BP=0.487, ratio=0.582). I cannot figure out the reason. Why do you think the first version involves training data?

hihell commented 4 years ago

sess.run uses one batch to run all the operations in the list, so [eval_init_op, eval_summaries] runs on one batch from the training data. Only after sess.run has finished is the dataset switched to the eval dataset.
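The ordering described above can be sketched with a toy model in plain Python. This is not TensorFlow code: SwitchableIterator and run_step are hypothetical names, and run_step simply hard-codes the assumption made in this thread, namely that a batch fetched in the same call as the initializer comes from the dataset that was active before the call. It only illustrates why splitting the call changes which data the summaries see.

```python
class SwitchableIterator:
    """Toy stand-in for a reinitializable iterator (hypothetical, not a TF API)."""

    def __init__(self):
        self.source = None
        self._batches = iter(())

    def initialize(self, name, batches):
        # Point the iterator at a new dataset, like running an init op.
        self.source = name
        self._batches = iter(batches)

    def get_next(self):
        # Return the dataset name together with its next batch.
        return self.source, next(self._batches)


def run_step(iterator, fetch_summary, switch_to=None):
    """Model ONE run call under the thread's assumption:
    the batch is drawn first, the dataset switch takes effect afterwards."""
    result = iterator.get_next() if fetch_summary else None
    if switch_to is not None:
        iterator.initialize(*switch_to)
    return result


train = ("train", [[1, 2], [3, 4]])
evald = ("eval", [[100, 200]])

# Combined call, analogous to sess.run([eval_init_op, eval_summaries]).
it = SwitchableIterator()
it.initialize(*train)
combined = run_step(it, fetch_summary=True, switch_to=evald)

# Split calls, analogous to sess.run(eval_init_op) followed by
# sess.run(eval_summaries).
it2 = SwitchableIterator()
it2.initialize(*train)
run_step(it2, fetch_summary=False, switch_to=evald)
separate = run_step(it2, fetch_summary=True)

print(combined[0], separate[0])
```

In this toy, the combined call reads a training batch and the split calls read an eval batch, which is the difference the proposed fix relies on.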

hihell commented 4 years ago

@bozhenhhu After checking the code further, I think the BLEU score should not be affected by the change above. BLEU is calculated from 'hypothesis', which runs y_hat on eval data either way.

The training data only ends up in eval_summaries.