keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

Has anyone implemented the network proposed in the paper "Hierarchical Attention Networks for Document Classification"? #4495

Closed. 460130107 closed this issue 7 years ago

460130107 commented 7 years ago

The paper URL is https://www.researchgate.net/publication/305334401_Hierarchical_Attention_Networks_for_Document_Classification

124399839 commented 7 years ago

You can contact the author to request the code.

124399839 commented 7 years ago

Feel free to join group 367355275 to discuss.

bkj commented 7 years ago

I think the HN-MAX variant is (roughly):

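# Written against the Keras 1.x API (consume_less=..., Model(input=..., output=...)).
from keras.models import Model
from keras.layers import Input, Embedding, GRU, Dense, Bidirectional, TimeDistributed, GlobalMaxPooling1D
from keras.optimizers import SGD

# Placeholder values; max_features (vocabulary size) and y_train (one-hot labels) are assumed to be defined.
max_sents = 15  # maximum number of sentences per document
max_words = 50  # maximum number of words per sentence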

x = Input(shape=(max_sents, max_words,))

emb_words = TimeDistributed(Embedding(input_dim=max_features, output_dim=200, mask_zero=True))(x)

emb_sents = TimeDistributed(Bidirectional(GRU(50, consume_less='gpu', return_sequences=True)))(emb_words)
emb_sents = TimeDistributed(GlobalMaxPooling1D())(emb_sents)

emb_docs = Bidirectional(GRU(50, consume_less='gpu', return_sequences=True))(emb_sents)
emb_docs = GlobalMaxPooling1D()(emb_docs)

prediction = Dense(y_train.shape[1], activation='softmax')(emb_docs)
model = Model(input=x, output=prediction)
model.compile(loss='categorical_crossentropy', optimizer=SGD(momentum=0.9), metrics=['accuracy'])

Turning this into the HN-AVG variant is fairly straightforward (swap the max pooling for average pooling). For HN-ATT you'd have to write a small attention unit, but I don't think that should be particularly difficult.
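
For reference, the attention pooling that HN-ATT uses in place of max pooling can be sketched in plain Python. This is only a minimal sketch of the arithmetic, not a Keras layer and not code from the paper's authors. The function name `attention_pool` and the parameter names `W`, `b`, and `u_ctx` are made up for illustration. It assumes the paper's formulation: score each hidden state through a one-layer tanh projection, compare it with a context vector, softmax the scores over time steps, and take the weighted sum of the hidden states. In a real model `W`, `b`, and `u_ctx` would be trainable weights.

```python
import math


def attention_pool(hidden, W, b, u_ctx):
    """Attention-weighted sum over time steps (illustrative sketch).

    hidden: list of T hidden-state vectors, each a list of d floats
    W:      projection matrix as a list of k rows, each of length d
    b:      bias vector of length k
    u_ctx:  context vector of length k
    Returns (pooled, alphas): the pooled d-vector and the T attention weights.
    """
    scores = []
    for h in hidden:
        # u_t = tanh(W h_t + b)
        u = [
            math.tanh(sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i)
            for row, b_i in zip(W, b)
        ]
        # Unnormalised score: dot product of u_t with the context vector.
        scores.append(sum(u_i * c_i for u_i, c_i in zip(u, u_ctx)))

    # Softmax over time steps (shifted by the max for numerical stability).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]

    # Weighted sum of the original hidden states.
    dim = len(hidden[0])
    pooled = [sum(a * h[j] for a, h in zip(alphas, hidden)) for j in range(dim)]
    return pooled, alphas
```

When every parameter is zero, all scores are equal, the weights are uniform, and the result reduces to average pooling, i.e. the HN-AVG variant. In the model above, a layer like this would replace each GlobalMaxPooling1D, once over words and once over sentences.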

richliao commented 7 years ago

I have the paper implemented. Here is the blog https://richliao.github.io/ and the code is at https://github.com/richliao/textClassifier.

RahulKulhari commented 7 years ago

A PyTorch implementation of the paper is available at https://github.com/EdGENetworks/attention-networks-for-classification by @EdGENetworks and @Sandeep42.

@zcyang please provide your feedback on the implementation.

stale[bot] commented 7 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.