facebookresearch / InferSent

InferSent sentence embeddings

"Inner Attention NAACL Encoder" implementation #88

Open · gabrer opened this issue 5 years ago

gabrer commented 5 years ago

I was having a look at the implementation of the `InnerAttentionNAACLEncoder`, which should be the sentence encoder from "Hierarchical Attention Networks for Document Classification" by Yang et al., NAACL 2016.
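(For reference, the word-level attention in Yang et al. scores each BiLSTM hidden state against a single trainable context/query vector, normalizes the scores with a softmax, and returns the weighted sum of the hidden states. Below is a minimal PyTorch sketch of that mechanism, not the repo's exact code; the class name, `proj`, and `hidden_dim` are illustrative:)

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InnerAttention(nn.Module):
    """Word-level inner attention in the style of Yang et al. (2016):
    score each hidden state against one trainable context/query vector,
    softmax over time, and return the weighted sum of hidden states."""

    def __init__(self, hidden_dim):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, hidden_dim)  # u_t = tanh(W h_t + b)
        # A single trainable query vector, stored as a 1-row embedding
        # table (the same trick InferSent uses with nn.Embedding).
        self.query_embedding = nn.Embedding(1, hidden_dim)

    def forward(self, h):  # h: (batch, seq_len, hidden_dim) BiLSTM outputs
        bsize = h.size(0)
        u = torch.tanh(self.proj(h))                        # (batch, seq_len, hidden_dim)
        # Look up row 0 for every example: the *same* learned query each pass.
        q = self.query_embedding(h.new_zeros(bsize, dtype=torch.long))  # (batch, hidden_dim)
        scores = torch.bmm(u, q.unsqueeze(2)).squeeze(2)    # (batch, seq_len)
        alpha = F.softmax(scores, dim=1)                    # attention weights
        return torch.bmm(alpha.unsqueeze(1), h).squeeze(1)  # (batch, hidden_dim)
```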

However, I would like to raise the following issues:

yeoserene commented 5 years ago

From what I understand, `self.query_embedding` is initialized as an `nn.Embedding(number_of_context_layers_you_want, dim_of_LSTM)` layer whose weight receives gradients, which means it is trainable.
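This is easy to verify: the embedding's weight is a regular `nn.Parameter`, so it shows up in `parameters()` and gets updated by the optimizer. A quick check (the sizes here, one context vector of dimension 512, are only illustrative):

```python
import torch.nn as nn

# Illustrative sizes: 1 context vector, LSTM output dim 512.
query_embedding = nn.Embedding(1, 512)
print(query_embedding.weight.requires_grad)                  # True: gradients flow into it
print(sum(p.numel() for p in query_embedding.parameters()))  # 512: the optimizer updates these
```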

When you do this: `sent_w = self.query_embedding(torch.LongTensor(bsize*[0]).cuda()).unsqueeze(2)`, you are not randomly re-initializing it on each forward pass. The `[0]` selects row 0 of the embedding table, i.e. the first of the layers you declared, so you are retrieving the learned parameters of the query embedding. It is like indexing `embedding.weight[0]`.
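In other words, the same lookup returns the same learned row on every call. A small check of that claim (illustrative sizes, on CPU instead of `.cuda()`):

```python
import torch
import torch.nn as nn

query_embedding = nn.Embedding(1, 512)  # illustrative: 1 context vector, dim 512
bsize = 4
idx = torch.LongTensor(bsize * [0])     # [0, 0, 0, 0]: row 0 for every example in the batch
out = query_embedding(idx)              # (4, 512): four copies of the same learned row
print(torch.equal(out[0], query_embedding.weight[0]))  # True: it is weight[0], not a fresh random vector
```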