jiasenlu / AdaptiveAttention

Implementation of "Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning"
https://arxiv.org/abs/1612.01887

Some confusion about adaptive attention model #8

Closed a380922457 closed 6 years ago

a380922457 commented 7 years ago

According to the paper "Knowing When to Look", the LSTM receives only the word vector Xt and the previous hidden state Ht-1, not the image vector, but your code includes the image vector when building the LSTM. Could you please explain this? Thank you very much.

jiasenlu commented 6 years ago

In the implementation details section of the paper, we mention that we feed the image features in as input xt.
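
To make the reply concrete, here is a minimal sketch of what "feeding the image features as input xt" could look like. It assumes xt is formed by concatenating the word embedding with a global image feature vector, which is how I read the paper's implementation details; it is not taken from this repository's Torch code. The function names, dimensions, and random weights are all hypothetical and chosen only for illustration.

```python
import math
import random

def build_lstm_input(word_vec, global_image_vec):
    # Assumption: x_t is the concatenation of the word embedding w_t
    # and the global image feature v_g.
    return list(word_vec) + list(global_image_vec)

def lstm_step(x_t, h_prev, c_prev, W, b):
    # One plain LSTM step. W has 4*H rows, each of length len(x_t) + H,
    # ordered as input gate, forget gate, output gate, candidate.
    H = len(h_prev)
    z = x_t + h_prev
    pre = [sum(w * v for w, v in zip(row, z)) + bias for row, bias in zip(W, b)]
    sig = lambda a: 1.0 / (1.0 + math.exp(-a))
    i = [sig(a) for a in pre[0:H]]
    f = [sig(a) for a in pre[H:2 * H]]
    o = [sig(a) for a in pre[2 * H:3 * H]]
    g = [math.tanh(a) for a in pre[3 * H:4 * H]]
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

random.seed(0)
E, D, H = 4, 3, 5  # hypothetical sizes: word embedding, image feature, hidden state
w_t = [random.uniform(-1, 1) for _ in range(E)]
v_g = [random.uniform(-1, 1) for _ in range(D)]
x_t = build_lstm_input(w_t, v_g)

W = [[random.uniform(-0.1, 0.1) for _ in range(E + D + H)] for _ in range(4 * H)]
b = [0.0] * (4 * H)
h, c = lstm_step(x_t, [0.0] * H, [0.0] * H, W, b)
```

Under this reading the LSTM equations are unchanged: the cell still sees only xt and the previous hidden state, and the image vector enters because it is part of xt.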