suriyadeepan / practical_seq2seq

A simple, minimal wrapper for tensorflow's seq2seq module, for experimenting with datasets rapidly
http://suriyadeepan.github.io/2016-12-31-practical-seq2seq/
GNU General Public License v3.0

Training Time? #19

Open prakhar21 opened 7 years ago

prakhar21 commented 7 years ago

Hi, I am trying to train a Q/A model on my own data using your seq2seq wrapper. The checkpoint you provide is the result of roughly 45k epochs. How long did it take to train this model? Also, what heuristic did you use to decide on that many epochs?

suriyadeepan commented 7 years ago

I stopped when the validation loss saturated. It took roughly 7-8 hours to run ~45k iterations on a GTX 960 (with an i5 processor).
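For anyone who wants to automate the "stop when the validation loss saturates" rule, here is a minimal sketch of a patience-based check. It is not part of this repo's wrapper; the function name `should_stop` and the `patience` and `min_delta` parameters are hypothetical, and the loss values in the usage note are made up for illustration.

```python
def should_stop(val_losses, patience=5, min_delta=1e-3):
    """Return True once validation loss has stopped improving.

    val_losses: validation losses recorded at each evaluation, oldest first.
    patience:   number of most recent evaluations to look at (assumed value).
    min_delta:  smallest decrease that counts as an improvement (assumed value).
    """
    # Not enough history yet to judge saturation.
    if len(val_losses) <= patience:
        return False
    # Best loss seen before the most recent `patience` evaluations.
    best_before = min(val_losses[:-patience])
    # Best loss seen within the most recent `patience` evaluations.
    best_recent = min(val_losses[-patience:])
    # Stop if the recent window failed to beat the earlier best by min_delta.
    return best_recent > best_before - min_delta
```

Calling `should_stop(history, patience=3)` after each validation pass returns `True` for a history such as `[3.5, 3.3, 3.2, 3.12, 3.2, 3.3, 3.4]`, where the loss has risen for three evaluations in a row, and `False` while the loss is still falling. Saving a checkpoint whenever the validation loss reaches a new minimum lets you roll back to the best model after stopping.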

prakhar21 commented 7 years ago

OK. The modeling approach you describe in your blog and code is more or less general, right? So it should work for any Q/A dataset?

prakhar21 commented 7 years ago

I am retraining on the dataset with the same code, and the validation loss increases significantly after 2000 epochs. At 2000 it was 3.12. Why is it diverging?