stanfordnlp / treelstm

Tree-structured Long Short-Term Memory networks (http://arxiv.org/abs/1503.00075)
GNU General Public License v2.0

Leaf module in paper? #10

Closed: jamie-murdoch closed this issue 7 years ago

jamie-murdoch commented 7 years ago

Hi!

Could you point to where the leaf module implemented in the BinaryTreeLSTM object is described in the paper? I can't seem to find it, but it seems like a non-trivial part of the model.

kaishengtai commented 7 years ago

Hi Jamie,

For the leaf module, the implementation computes h = o * tanh(Wx), whereas the definition in the paper gives h = o * tanh(i * tanh(Wx)). This simplification is not mentioned in the paper. The reasoning is as follows: since o already gates the output, i appears to be redundant and is therefore fixed to 1. That leaves tanh(tanh(z)), which is approximately tanh(z) (the two agree closely when |z| is small), giving the expression used in the implementation. In practice, I don't think the difference should matter much.

Hope that helps.
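
To make the comparison concrete, here is a minimal numerical sketch of the two leaf expressions from the reply above. It is not the repository's code: the function names are hypothetical, and a scalar `z` stands in for the pre-activation Wx, with `i` and `o` passed in as plain gate values rather than computed from the input.

```python
import math


def leaf_h_paper(z, i, o):
    """Paper form for a leaf as quoted above: h = o * tanh(i * tanh(z))."""
    return o * math.tanh(i * math.tanh(z))


def leaf_h_impl(z, o):
    """Implementation form as quoted above: h = o * tanh(z)."""
    return o * math.tanh(z)


def gap(z, o=1.0):
    """Absolute difference between the two forms with i fixed to 1."""
    return abs(leaf_h_impl(z, o) - leaf_h_paper(z, 1.0, o))


if __name__ == "__main__":
    # z is a scalar stand-in for Wx (an assumption made for illustration).
    for z in (0.1, 0.5, 1.0, 3.0):
        print(
            f"z={z:4.1f}  impl={leaf_h_impl(z, 1.0):.4f}  "
            f"paper(i=1)={leaf_h_paper(z, 1.0, 1.0):.4f}  gap={gap(z):.4f}"
        )
```

With i = 1 the gap is tanh(z) - tanh(tanh(z)) scaled by o. It is tiny for small |z| and grows as tanh(z) saturates, but never exceeds o * (1 - tanh(1)), roughly 0.24 * o. Whether a gap of that size affects trained accuracy is an empirical question that this sketch does not answer.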