A question about the architecture

sustcsonglin / TN-PCFG

Source code for the NAACL 2021 paper "PCFGs Can Do Better: Inducing Probabilistic Context-Free Grammars with Many Symbols" and the ACL 2021 main conference paper "Neural Bilexicalized PCFG Induction"

A question about the architecture #4

Closed speedcell4 closed 1 year ago

speedcell4 commented 1 year ago

https://github.com/sustcsonglin/TN-PCFG/blob/7047645f874dcf872ed550d6bcd8d5d2b113d50c/parser/model/TN_PCFG.py#L20-L24

I found that root_emb and root_mlp are only used in the following place:

https://github.com/sustcsonglin/TN-PCFG/blob/7047645f874dcf872ed550d6bcd8d5d2b113d50c/parser/model/TN_PCFG.py#L45

So is this equivalent to simply defining self.root_emb = nn.Parameter(torch.randn(1, self.NT)) and computing roots = self.root_emb.log_softmax(-1)?

sustcsonglin commented 1 year ago

Ah, this parameterization is copied directly from Kim et al. (2019). In my experience, if you directly use roots = self.root_emb.log_softmax(-1), performance degrades, and unfortunately I don't know why. Generative grammars are very sensitive to how they are parameterized :(
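
For readers following along, the two parameterizations discussed here can be put side by side in a minimal PyTorch sketch. This is an illustration under assumptions, not the repository's code: the sizes `NT` and `s_dim`, the name `root_logits`, and the simplified two-layer MLP are all hypothetical stand-ins for whatever the linked lines actually define.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

NT = 30      # number of nonterminals (hypothetical value)
s_dim = 256  # symbol embedding size (hypothetical value)

# (a) MLP parameterization in the style of Kim et al. (2019):
# a learned root embedding is passed through an MLP before the softmax.
# The layer stack is a simplified stand-in, not the repository's exact one.
root_emb = nn.Parameter(torch.randn(1, s_dim))
root_mlp = nn.Sequential(
    nn.Linear(s_dim, s_dim),
    nn.ReLU(),
    nn.Linear(s_dim, NT),
)
roots_mlp = root_mlp(root_emb).log_softmax(-1)

# (b) Direct parameterization proposed in the question:
# the root logits themselves are the only learned parameters.
root_logits = nn.Parameter(torch.randn(1, NT))
roots_direct = root_logits.log_softmax(-1)

# Both results are log-probabilities over the NT nonterminals, shape (1, NT).
print(roots_mlp.shape, roots_direct.shape)
print(roots_mlp.exp().sum().item(), roots_direct.exp().sum().item())

# The number of learned parameters differs by orders of magnitude.
n_mlp = root_emb.numel() + sum(p.numel() for p in root_mlp.parameters())
n_direct = root_logits.numel()
print(n_mlp, n_direct)
```

Both variants produce a normalized log-distribution of the same shape, so the difference lies in the parameter count and in the path gradients take during training rather than in the set of root distributions that can be expressed. Whether that accounts for the degradation described above is not established in this thread.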

speedcell4 commented 1 year ago

Wow, that's interesting, thank you~