rockingdingo / deepnlp

Deep Learning NLP Pipeline implemented on Tensorflow
MIT License

[textsum] Attention heatmap error #17

Open JuruoMP opened 7 years ago

JuruoMP commented 7 years ago

Compared with the heatmap shown in "A Neural Attention Model for Abstractive Sentence Summarization" (A. M. Rush et al.), there are some problems here.

A heatmap should show the attention relationship between the input and the output: when an output word is copied from the input sentence, the corresponding input word should receive high attention. However, the heatmap in this project is just a blur, and it is hard to tell which input word each output word attends to.
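
For illustration, here is a minimal sketch (not project code) of what a readable heatmap looks like; `attn` is a hypothetical `(output_len, input_len)` matrix of attention weights, filled with random values here just to make the example runnable:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical tokens and attention weights: each row of `attn` is the
# attention distribution over the input for one output (summary) token.
input_tokens = ["the", "cat", "sat", "on", "the", "mat"]
output_tokens = ["cat", "sat", "mat"]
attn = np.random.dirichlet(np.ones(len(input_tokens)), size=len(output_tokens))

fig, ax = plt.subplots()
ax.imshow(attn, cmap="hot", aspect="auto")
ax.set_xticks(range(len(input_tokens)))
ax.set_xticklabels(input_tokens, rotation=45)
ax.set_yticks(range(len(output_tokens)))
ax.set_yticklabels(output_tokens)
ax.set_xlabel("input tokens")
ax.set_ylabel("output tokens")
plt.tight_layout()
plt.show()
```

With a well-trained model, a copied output word shows up as one bright cell in its row; the problem reported here is that no such structure is visible.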

rockingdingo commented 7 years ago

Hi, thanks for your feedback. The method in this project is a bit of a hack, just a short-term solution to pull the attention-mask tensor out under TensorFlow 1.0. TensorFlow 1.2 has since removed the old archived RNN functions in favor of the more general dynamic_rnn() function, so I will look for a more elegant way to do this in the new TF version. Any ideas are very welcome.
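
For example, one possible direction (a sketch under the assumption that the decoder is ported to `tf.contrib.seq2seq`, not the project's current code; all shapes and variable names here are illustrative) is to let `AttentionWrapper` record the per-step alignments, which gives the attention matrix directly instead of fishing it out of archived RNN internals:

```python
import tensorflow as tf

batch_size, src_len, tgt_len, num_units = 4, 10, 6, 64

# Stand-ins for the real encoder outputs and decoder inputs.
encoder_outputs = tf.placeholder(tf.float32, [batch_size, src_len, num_units])
source_lengths = tf.placeholder(tf.int32, [batch_size])
decoder_inputs = tf.placeholder(tf.float32, [batch_size, tgt_len, num_units])
target_lengths = tf.placeholder(tf.int32, [batch_size])

attention = tf.contrib.seq2seq.LuongAttention(
    num_units, memory=encoder_outputs,
    memory_sequence_length=source_lengths)

decoder_cell = tf.contrib.seq2seq.AttentionWrapper(
    tf.contrib.rnn.LSTMCell(num_units), attention,
    alignment_history=True)  # keep each step's alignments in the state

helper = tf.contrib.seq2seq.TrainingHelper(decoder_inputs, target_lengths)
decoder = tf.contrib.seq2seq.BasicDecoder(
    decoder_cell, helper,
    initial_state=decoder_cell.zero_state(batch_size, tf.float32))

outputs, final_state, _ = tf.contrib.seq2seq.dynamic_decode(decoder)

# [tgt_len, batch, src_len]: one attention row per decoded token,
# exactly the matrix a heatmap needs.
alignments = final_state.alignment_history.stack()
```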

JuruoMP commented 7 years ago

@rockingdingo Thanks for replying. I have another question: many visualization methods have attracted attention recently, such as sensitivity analysis and layer-wise relevance propagation. Which method are you going to use to generate the heatmap?
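
By sensitivity analysis I mean something like the following gradient-based saliency sketch (the model here is a trivial stand-in and every name is illustrative, not a deepnlp API): the importance of each input token is estimated as the gradient norm of an output score with respect to that token's embedding.

```python
import tensorflow as tf

src_len, emb_dim = 10, 64
embeddings = tf.placeholder(tf.float32, [1, src_len, emb_dim])

# Stand-in for the real model: any differentiable score for one
# output token would take its place.
score = tf.reduce_sum(tf.layers.dense(tf.reduce_mean(embeddings, axis=1), 1))

# d(score)/d(embeddings): [1, src_len, emb_dim]
grads = tf.gradients(score, embeddings)[0]

# Per-token sensitivity: L2 norm over the embedding dimension,
# giving one saliency value per input token -> [1, src_len].
saliency = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=-1))
```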