shyamupa / snli-entailment

attention model for entailment on SNLI corpus implemented in Tensorflow and Keras

The alpha value is not as expected! #7

Open chungfu27 opened 7 years ago

chungfu27 commented 7 years ago

Hi @shyamupa, thanks for your attention model! I was able to get the alpha values to visualize the attention levels for my task, but I found a strange phenomenon in them.

The following picture is the heatmap of the "flat_alpha" layer's output:

[image: attention_flat_alpha_export]

It looks fine! But when I exported the output of the "alpha" layer (which goes through a softmax), I got the following result:

[image: attention_alpha_softmax_export]

I know a softmax will sharpen and normalize the values, but when I applied a softmax to the "flat_alpha" data locally, the result differed from the output of the "alpha" layer:

[image: attention_alpha_softmax_local]

The heatmap shape is (20, 200): there are 20 sentences, and every sentence has length 200. Do you have any suggestions?
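One possible explanation I'm considering (just a guess, since I haven't traced amodel.py fully): if the "alpha" layer is a Dense layer with a softmax activation rather than a bare Activation('softmax'), then its output is softmax(W·x + b), not softmax(x), so a local softmax of the "flat_alpha" data would not match it. A minimal sketch of the difference (the layer names and L = 200 here are just for illustration):

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense, Activation
from tensorflow.keras.models import Model

L = 200  # hypothetical sentence length, matching the 200-wide heatmap

flat_alpha_in = Input(shape=(L,), name='flat_alpha_input')

# A bare softmax layer reproduces softmax(flat_alpha) exactly.
bare_softmax = Activation('softmax')(flat_alpha_in)

# A Dense layer with a softmax activation first applies a learned affine
# map (W x + b) and only then the softmax, so it outputs softmax(W x + b).
dense_softmax = Dense(L, activation='softmax', name='alpha')(flat_alpha_in)

model = Model(flat_alpha_in, [bare_softmax, dense_softmax])

x = np.random.randn(1, L).astype('float32')
bare, dense = model.predict(x)
# `bare` matches a local softmax of x up to float error;
# `dense` generally differs because of the learned W and b.
```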

Maggione commented 7 years ago

Thanks again for your attention model. I agree with @chungfu27. The paper describes the model as using "word-by-word attention based on all output vectors of the hypothesis (h7, h8 and h9)", but in the code, I think the attention is computed based only on the last output vector.
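For reference, this is my reading of the two variants in the paper (Rocktäschel et al. 2015, "Reasoning about Entailment with Neural Attention"); here Y is the matrix of premise output vectors, h_N the last hypothesis state, and e_L a vector of ones used for broadcasting:

```latex
% Attention conditioned only on the last hypothesis state h_N:
M      = \tanh(W^y Y + W^h h_N \otimes e_L)
\alpha = \mathrm{softmax}(w^T M)
r      = Y \alpha^T

% Word-by-word attention: one attention distribution per hypothesis
% state h_t, with the weighted representation r_t accumulated over time:
M_t      = \tanh(W^y Y + (W^h h_t + W^r r_{t-1}) \otimes e_L)
\alpha_t = \mathrm{softmax}(w^T M_t)
r_t      = Y \alpha_t^T + \tanh(W^t r_{t-1})
```

The code here looks to me like the first variant, which computes a single alpha from the last hypothesis output vector, rather than one alpha per hypothesis time step.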