Closed by speedcell4 1 year ago
Ah, this parameterization is copied directly from Kim et al. (2019). In my experience, if you directly use
roots = self.root_emb.log_softmax(-1)
the performance degrades, and unfortunately I don't know why. Generative grammars are very sensitive to how they are parameterized :(
Wow, that's interesting, thank you~
https://github.com/sustcsonglin/TN-PCFG/blob/7047645f874dcf872ed550d6bcd8d5d2b113d50c/parser/model/TN_PCFG.py#L20-L24
I found that you only use root_emb and root_mlp in the following place:
https://github.com/sustcsonglin/TN-PCFG/blob/7047645f874dcf872ed550d6bcd8d5d2b113d50c/parser/model/TN_PCFG.py#L45
Therefore, is this equivalent to simply using the following two lines?
self.root_emb = nn.Parameter(torch.randn(1, self.NT))
roots = self.root_emb.log_softmax(-1)
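For anyone comparing the two options, here is a minimal sketch of both root parameterizations side by side. The class names `MLPRoot` and `DirectRoot`, the sizes `NT=30` and `s_dim=64`, and the two-layer MLP are all assumptions made for illustration; the MLP is a simplified stand-in, not the repository's actual root_mlp. Both variants return a valid log-distribution over nonterminals with the same shape, so they are interchangeable as far as the interface goes. What differs is the number of trainable parameters, and with it how gradients reach the root logits during training.

```python
import torch
import torch.nn as nn


class MLPRoot(nn.Module):
    """Root distribution from a learned embedding passed through an MLP.

    Hypothetical stand-in for the embedding + MLP parameterization
    discussed in this thread; the layer layout is an assumption.
    """

    def __init__(self, NT: int, s_dim: int):
        super().__init__()
        self.root_emb = nn.Parameter(torch.randn(1, s_dim))
        self.root_mlp = nn.Sequential(
            nn.Linear(s_dim, s_dim),
            nn.ReLU(),
            nn.Linear(s_dim, NT),
        )

    def forward(self) -> torch.Tensor:
        # (1, s_dim) -> (1, NT) logits -> log-probabilities
        return self.root_mlp(self.root_emb).log_softmax(-1)


class DirectRoot(nn.Module):
    """Root distribution from a free logit vector, as proposed in the question."""

    def __init__(self, NT: int):
        super().__init__()
        self.root_emb = nn.Parameter(torch.randn(1, NT))

    def forward(self) -> torch.Tensor:
        return self.root_emb.log_softmax(-1)


def n_params(module: nn.Module) -> int:
    """Count trainable parameters."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)


NT, s_dim = 30, 64  # illustrative sizes, not taken from the repository config
mlp_root = MLPRoot(NT, s_dim)
direct_root = DirectRoot(NT)

mlp_out = mlp_root()
direct_out = direct_root()

print(mlp_out.shape, direct_out.shape)
print(n_params(mlp_root), n_params(direct_root))
```

Since the bias of the final linear layer can already produce arbitrary logits, the MLP version cannot express any root distribution that the direct version cannot. Any performance gap would therefore have to come from initialization or optimization dynamics rather than expressiveness, though that is a guess and not something this sketch demonstrates.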