Open wuxchcandoit opened 10 months ago
thank you for your interest! To get the global feature for classification, we found both max-pooling and learning pooling are applicable. However learnable pooling is more stable across different datasets. Thus, this implementation is slightly different from the paper.
Hi! Thanks for your great work! I wonder why there is a learnable variable named
S
in classNLP
, namelyself.S= nn.Parameter(torch.Tensor(1, num_seeds, dim))
because the paper mentions that a bag feature is obtained through max pooling over the output of ILRA layers. But in classNLP
, it seems that the bag feature is learnable.