Closed nehaboob closed 6 years ago
The dev results obtained in training mode will differ from those in test mode, because training mode does not apply the exponential moving average at inference. Ideally, the results from test mode should be higher.
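For context, `tf.train.ExponentialMovingAverage` keeps a shadow copy of each variable and updates it as `shadow = decay * shadow + (1 - decay) * variable`; test mode restores these shadows in place of the raw weights, while the in-training evaluation uses the raw weights directly. A minimal plain-Python sketch of the update rule (illustrative only, and initialized to zero here for simplicity, which TF does not do):

```python
def ema_update(shadow, variable, decay):
    """One EMA step, mirroring tf.train.ExponentialMovingAverage:
    shadow <- decay * shadow + (1 - decay) * variable."""
    return decay * shadow + (1 - decay) * variable

# The shadow trails the raw variable, so early in training the two differ a lot.
shadow = 0.0
for value in [1.0, 1.0, 1.0]:  # pretend the weight has already settled at 1.0
    shadow = ema_update(shadow, value, decay=0.9)
# After only three updates the shadow is still far from the weight's value.
```

This lag is why the raw weights and their EMA shadows can give noticeably different dev scores early in training.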
I was getting worse results in test mode. Then I commented out the line below in the `test(config)` function, and now I get the same results as in training mode.
`if config.decay < 1.0: sess.run(model.assign_vars)`
If you train longer and let the exponential moving average variables settle, you will see better results with test mode.
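To put a rough number on "settle": a sketch of how many update steps an EMA shadow needs before it closes most of the gap to an already-converged weight (the decay of 0.999 used below is an assumption; in this repo it is whatever `config.decay` is set to):

```python
def steps_to_settle(decay, tol):
    """Smallest number of EMA updates after which a shadow (started at 0.0,
    tracking a weight that is now constant at 1.0) is within `tol` of it.
    The remaining gap after n updates is decay**n."""
    shadow, steps = 0.0, 0
    while 1.0 - shadow > tol:
        shadow = decay * shadow + (1 - decay) * 1.0
        steps += 1
    return steps
```

With a decay of 0.999 and a 1% tolerance this needs roughly ln(0.01)/ln(0.999), i.e. about 4,600 updates, so a run of only 1,500 steps leaves the shadows far behind the trained weights, which would explain the worse test-mode results.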
Thanks, so typically how many steps?
Just to try it out, I am training on only 600 questions, with a dev set of 60 questions, and the algorithm runs for 1500 steps. Looking at the dev loss and training loss, I can see it's overfitting pretty quickly. I think that's too little data to learn anything meaningful.
Let me know if you have an idea of the minimum number of training questions and the corresponding number of steps.
If you have a GPU to train on, 60,000 steps will usually get you to the best performance, which takes about 6-8 hours depending on which GPU you have.
My test and dev sets are the same, but I get different results from the checkpoint evaluation during training vs. running config.py in test mode.
Shouldn't they give the same results, since we are loading the saved model and running it on the same dev file?