allenai / bilm-tf

Tensorflow implementation of contextualized word representations from bi-directional language models
Apache License 2.0

[ Question ] When does the inner state of the LSTMs reset during training? #220

Closed tsteffek closed 4 years ago

tsteffek commented 4 years ago

Just as a sanity check: the shuffle_on_load flag made me believe that the LSTMs' hidden state would be reset after every sentence. However, after diving into the code, it seems the state is actually never reset during the entire training run. Is this correct?

This might be a novice question; I'm new to TensorFlow, I'm afraid.

matt-peters commented 4 years ago

That's correct, the LSTM hidden states are never reset during training.
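
To illustrate what "never reset" means in practice, here is a minimal sketch in plain Python. It is not taken from bilm-tf: the scalar `rnn_step` cell, its weights, and the `run_batches` helper are hypothetical stand-ins for an LSTM and a training loop. It only shows that a state carried across batches gives a different result for the same input than a state that is zeroed before each batch.

```python
import math

def rnn_step(h, x, w_h=0.5, w_x=1.0):
    """One step of a toy scalar recurrent cell (hypothetical stand-in for an LSTM)."""
    return math.tanh(w_h * h + w_x * x)

def run_batches(batches, reset_each_batch):
    """Feed batches in order and record the state after each batch."""
    h = 0.0  # initial state, set once before training starts
    finals = []
    for batch in batches:
        if reset_each_batch:
            h = 0.0  # what a per-sentence reset would look like
        for x in batch:
            h = rnn_step(h, x)
        finals.append(h)
    return finals

# The first and third batches are identical on purpose.
batches = [[1.0, 0.5], [0.2, -0.3], [1.0, 0.5]]

carried = run_batches(batches, reset_each_batch=False)  # state persists
reset = run_batches(batches, reset_each_batch=True)     # state zeroed per batch

print("carried:", carried)
print("reset:  ", reset)
```

With the reset, the identical first and third batches end in the same state. Without it, the third batch starts from whatever the second batch left behind, so its final state differs.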

tsteffek commented 4 years ago

Thanks for the quick answer.

Would you be interested in a PR adding a flag for document-wise training?

EDIT: I just talked with my boss again, and it seems we won't be doing that in the foreseeable future.