EdGENetworks / attention-networks-for-classification

Hierarchical Attention Networks for Document Classification in PyTorch

single lstm for all sentences #3

Closed ghost closed 7 years ago

ghost commented 7 years ago

As I understand it, a different LSTM should be trained for each sentence, but in your model every sentence shares the same LSTM parameters in wordAttnRNN. Is that intended?

Sandeep42 commented 7 years ago

The paper does not seem to mention training different RNNs for different sentences. By sharing the weights, we force the classifier to learn one representation across the sequence of sentences and the sequence of words. This shared representation helps the model generalise well to unseen data.

Nothing stops you from training a different GRU for each sentence, but I think it would be a futile effort.
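
To make the trade-off concrete, here is a minimal sketch of the two options. It is not the repository's actual code: the sizes, the variable names, and the `n_params` helper are all made up for illustration, and attention is left out so that only the weight sharing is visible.

```python
import torch
import torch.nn as nn

# Illustrative sizes only; none of these values come from the repository.
EMBED, HIDDEN, MAX_SENTS = 8, 16, 4


def n_params(module):
    """Hypothetical helper: total number of trainable scalars in a module."""
    return sum(p.numel() for p in module.parameters())


# Option 1: one word-level GRU shared by every sentence.
shared_gru = nn.GRU(EMBED, HIDDEN, bidirectional=True, batch_first=True)

# Option 2 (hypothetical): a separate GRU for each sentence position.
per_sentence_grus = nn.ModuleList(
    [nn.GRU(EMBED, HIDDEN, bidirectional=True, batch_first=True)
     for _ in range(MAX_SENTS)]
)

# A toy document: 3 sentences of 5 already-embedded words each.
doc = torch.randn(3, 5, EMBED)

# Shared weights: the sentences are simply treated as a batch.
shared_out, _ = shared_gru(doc)

# Separate weights: sentence i must be routed through GRU i, so a document
# with more than MAX_SENTS sentences cannot be encoded at all.
separate_out = torch.cat(
    [per_sentence_grus[i](doc[i:i + 1])[0] for i in range(doc.size(0))]
)

print(shared_out.shape, separate_out.shape)
print(n_params(shared_gru), n_params(per_sentence_grus))
```

Both variants produce outputs of the same shape, but the per-position version multiplies the word-encoder parameters by `MAX_SENTS`, and each copy only ever sees the sentences that happen to fall at its position.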