localminimum / QANet

A Tensorflow implementation of QANet for machine reading comprehension
MIT License
983 stars 310 forks

optimize the trilinear function for lower memory cost; add early stop #14

Closed jasonwbw closed 6 years ago

jasonwbw commented 6 years ago
  1. Reduce the memory cost of the trilinear function from n * BatchSize * C_len * Q_len * HiddenSize to n * BatchSize * C_len * Q_len, where n is an integer. That is about n * 0.23G -> n * 0.002G for HiddenSize 96, and about n * 0.31G -> n * 0.002G for HiddenSize 128.
  2. Add early stopping. The current pipeline only saves the last five models, and it overfits very easily, so we can't get the actual EM and F1 on the dev set (the logged scores are computed with a max length of 400, which makes them lower than the actual numbers).
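
To illustrate item 1, here is a minimal NumPy sketch of the idea, not the code from this PR. It assumes the trilinear similarity has the usual form f(c, q) = w_c·c + w_q·q + w_m·(c ∘ q). The function names, the weight split into `w_c`, `w_q`, `w_m`, and the helper `tensor_gib` are hypothetical. The naive version materializes a tensor with a HiddenSize axis for every (context, question) pair; the factored version only ever builds BatchSize * C_len * Q_len outputs.

```python
import numpy as np


def trilinear_naive(c, q, w_c, w_q, w_m):
    """Reference version: builds a [B, Lc, Lq, 3d] feature tensor."""
    B, Lc, d = c.shape
    Lq = q.shape[1]
    # Tile context and question so every (i, j) pair has its own d-vector.
    c_t = np.broadcast_to(c[:, :, None, :], (B, Lc, Lq, d))
    q_t = np.broadcast_to(q[:, None, :, :], (B, Lc, Lq, d))
    feats = np.concatenate([c_t, q_t, c_t * q_t], axis=-1)  # [B, Lc, Lq, 3d]
    return feats @ np.concatenate([w_c, w_q, w_m])          # [B, Lc, Lq]


def trilinear_factored(c, q, w_c, w_q, w_m):
    """Same scores without any [B, Lc, Lq, d] intermediate."""
    c_term = (c @ w_c)[:, :, None]               # [B, Lc, 1]
    q_term = (q @ w_q)[:, None, :]               # [B, 1, Lq]
    # sum_k c[i, k] * w_m[k] * q[j, k] as one batched matmul.
    cq_term = (c * w_m) @ q.transpose(0, 2, 1)   # [B, Lc, Lq]
    return c_term + q_term + cq_term


def tensor_gib(*dims, bytes_per_elem=4):
    """Size in GiB of a dense tensor with the given dims (float32 assumed)."""
    n = 1
    for x in dims:
        n *= x
    return n * bytes_per_elem / 2**30
```

Under the assumption of float32 values with BatchSize 32, C_len 400 and Q_len 50 (my guess at the settings, not stated above), `tensor_gib(32, 400, 50, 96)` is about 0.23, `tensor_gib(32, 400, 50, 128)` is about 0.31, and `tensor_gib(32, 400, 50)` is about 0.002, which matches the figures quoted in item 1.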

Btw, I'm sorry for bringing in lots of commits when I pulled from upstream. It may have been caused by a bug in the GitHub Windows GUI client I used.

jasonwbw commented 6 years ago

Oh, I see. I will recommit the changes as you said.