Open prakhar21 opened 7 years ago
I stopped when the validation loss saturated. It took roughly 7-8 hours to run ~45k iterations on a GTX 960 (with an i5 processor).
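For anyone who wants to automate that stopping rule, here is a minimal sketch of patience-based early stopping. It is not taken from the repository: the function name `early_stop_index` and the `patience` and `min_delta` parameters are hypothetical, and the loss values in the usage example are made up for illustration.

```python
def early_stop_index(val_losses, patience=3, min_delta=1e-3):
    """Return the index at which training would stop, or None.

    Hypothetical helper, not part of the repository. Training stops
    once the validation loss has failed to improve on the best value
    seen so far by more than `min_delta` for `patience` checks in a row.
    """
    best = float("inf")
    stale = 0  # consecutive checks without a meaningful improvement
    for i, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return i
    return None


# Illustrative values only: the loss bottoms out and then creeps up.
losses = [4.0, 3.5, 3.2, 3.12, 3.13, 3.15, 3.2]
stop_at = early_stop_index(losses, patience=3)
```

In practice you would evaluate on the validation set every few hundred iterations, save a checkpoint whenever the loss improves, and restore the best checkpoint once the rule triggers.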
OK. The modeling approach you describe in your blog and code is more or less general, right? So it should work for any Q/A dataset?
I am re-training on the dataset with the same code, and the validation loss increases significantly after 2000 epochs. At 2000 it was 3.12. Why is it diverging?
Hi, I am trying to train a Q/A model on my own data using your sequence wrapper. The checkpoint you provided is the result of roughly 45k epochs. How long did it take to train this model? Also, what heuristic did you use to decide on that many epochs?