Closed supakjk closed 8 years ago
As you point out, there is some redundant parameterization in the implementation due to these additional bias terms. This is intended in order to simplify the implementation (so that nn
modules can be used without modification). At any rate, there aren't many bias parameters, so this detail shouldn't have a significant impact on performance.
I see. Since recent nn.Linear code supports not using the bias term (by passing false as the 3rd argument), you can now evade the duplication without module modification.
Hello,
At the 40th and 41th lines of models/ChildSumTreeLSTM.lua, to compose the forget gate output, the outputs of both nn.TemporalConvolution and nn.Linear are summed. Since both of the modules have bias terms, additional bias term will be added differently to the formulation in the ACL2015 paper.
I wonder if I missed some part resolving the issue, it was intended, or a bug.
Thanks.