bloomsburyai / question-generation

Neural text-to-text question generation
MIT License

Is this a bug? #13

Closed · shizhediao closed this issue 5 years ago

shizhediao commented 5 years ago

In eval.py there is this piece of code:

    if len(dev_data) < FLAGS.num_eval_samples:
        exit('***ERROR*** Eval dataset is smaller than the num_eval_samples flag!')
    if len(dev_data) > FLAGS.num_eval_samples:
        print('***WARNING*** Eval dataset is larger than the num_eval_samples flag!')

When I run it, the script exits with "***ERROR*** Eval dataset is smaller than the num_eval_samples flag!".

As I understand it, dev_data refers to the dataset used during development, while num_eval_samples is the number of test samples. Should I therefore change len(dev_data) to len(test_data)? Am I right?

shizhediao commented 5 years ago

Or should I change num_eval_samples to num_dev_samples instead?

tomhosking commented 5 years ago

I've been a bit lazy with my variable names here! dev_data just refers to whichever dataset is being evaluated. You can choose to evaluate on your dev/validation set by passing the flag --eval_on_dev, or on the test set with --eval_on_test. You should then set --num_eval_samples to the length of that dataset (or lower if you only want to evaluate on a subset).

As you point out, the defaults aren't quite consistent: num_eval_samples defaults to the length of the test set, but eval_on_dev defaults to True. I'd recommend adding the flags --eval_on_test --noeval_on_dev, which should then give you the scores for the test set.
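For reference, a rough sketch of the invocation this suggests, using only the script and flags mentioned in this thread; the sample count is a placeholder, not a verified value:

```bash
# Hedged sketch, not a verified command: evaluate on the test split instead
# of the default dev split. Replace 10000 with the actual length of the test
# set you are evaluating on (or a smaller number to evaluate on a subset).
python eval.py --eval_on_test --noeval_on_dev --num_eval_samples 10000
```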

shizhediao commented 5 years ago

OK Thanks!