Hi,
As far as I can tell, the LSTM module in PyTorch initializes the hidden state to zero. Some blog posts (including one by Hinton) recommend initializing it as a learnable parameter instead. I believe that when the initial hidden state is learned, the network suffers less from covariate shift between the initial hidden state and subsequent hidden states. It would be great if this could be implemented; it may boost accuracy.
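In the meantime, this can be approximated by wrapping `nn.LSTM` and passing learnable initial states explicitly. A minimal sketch (the class name `LSTMWithLearnedInit` is just illustrative, not an existing API):

```python
import torch
import torch.nn as nn

class LSTMWithLearnedInit(nn.Module):
    """Sketch of an LSTM whose initial hidden/cell states are learnable parameters."""
    def __init__(self, input_size, hidden_size, num_layers=1):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        # Learnable initial states with shape (num_layers, 1, hidden_size);
        # the batch dimension is expanded at forward time.
        self.h0 = nn.Parameter(torch.zeros(num_layers, 1, hidden_size))
        self.c0 = nn.Parameter(torch.zeros(num_layers, 1, hidden_size))

    def forward(self, x):
        batch = x.size(0)
        h0 = self.h0.expand(-1, batch, -1).contiguous()
        c0 = self.c0.expand(-1, batch, -1).contiguous()
        # Gradients flow back into h0/c0, so they are trained with the rest.
        return self.lstm(x, (h0, c0))

model = LSTMWithLearnedInit(input_size=8, hidden_size=16, num_layers=2)
out, (hn, cn) = model(torch.randn(4, 5, 8))  # batch=4, seq_len=5
```

Since `h0` and `c0` are registered as `nn.Parameter`, the optimizer updates them alongside the LSTM weights.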
Thanks.