Some recommendations from @pli1988:
- Good overview and references, plus it's short: http://ruder.io/deep-learning-nlp-best-practices/index.html#attention
- More of a survey of attention... a bit old at this point, but good: https://arxiv.org/pdf/1507.01053.pdf
- Next Level Attention: https://arxiv.org/abs/1706.03762
I think I have a couple more candidates lying around somewhere; I'll post them later today. Then it would be good for those familiar with this literature to help guide the choice of which papers/blogs to focus on.
I think for a first run, either a survey or an early paper would be best. We can cover the more recent stuff later on.
I vote for "Describing Multimedia Content using Attention-based Encoder–Decoder Networks".
There's also the "original" attention paper:
Neural Machine Translation by Jointly Learning to Align and Translate https://arxiv.org/abs/1409.0473
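For anyone skimming before we pick: here's a minimal NumPy sketch of the additive ("Bahdanau-style") attention scoring described in that paper. It's my own toy version with made-up variable names and shapes, not code from the authors, so treat it as illustrative only.

```python
import numpy as np

def additive_attention(query, keys, W_q, W_k, v):
    """Toy additive attention.
    query: (d_dec,) decoder state; keys: (T, d_enc) encoder states.
    W_q: (d_att, d_dec), W_k: (d_att, d_enc), v: (d_att,) are learned in practice.
    Returns (context, weights)."""
    # e_t = v^T tanh(W_q s + W_k h_t) for each encoder state h_t
    scores = np.tanh(keys @ W_k.T + W_q @ query) @ v   # (T,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                           # softmax over time steps
    context = weights @ keys                           # (d_enc,) weighted sum
    return context, weights

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, d_enc, d_dec, d_att = 5, 8, 6, 4
    ctx, w = additive_attention(rng.normal(size=d_dec),
                                rng.normal(size=(T, d_enc)),
                                rng.normal(size=(d_att, d_dec)),
                                rng.normal(size=(d_att, d_enc)),
                                rng.normal(size=d_att))
    print(w.round(3), ctx.shape)  # weights sum to 1; context has shape (d_enc,)
```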
Without having read the papers, I'm inclined to agree with @mcartwright that we should start with: "Describing Multimedia Content using Attention-based Encoder–Decoder Networks" https://arxiv.org/pdf/1507.01053.pdf
Unless there are any other contenders, shall we go with this paper?
Just saw there's a bunch of attention paper suggestions on #2. Pasting them here. @bmcfee @mcartwright @pli1988, shall we stick to this issue for attention papers?
@bmcfee:
- Chiu and Raffel 17: https://arxiv.org/abs/1712.05382
- Mnih 14: http://papers.nips.cc/paper/5542-recurrent-models-of-visual-attention
- Gregor 15: https://arxiv.org/abs/1502.04623 (aka DRAW)
- Xu 15: http://proceedings.mlr.press/v37/xuc15.html (show-attend-tell)
- Arandjelovic 17: https://arxiv.org/abs/1705.08168 (look-listen-learn)
@pli1988:
- I really like this survey. It starts with seq2seq, which I think is important: https://arxiv.org/pdf/1507.01053.pdf
- This blog covers a bunch of different flavors of attention and is easy to read: http://ruder.io/deep-learning-nlp-best-practices/index.html#attention
- Attention is all you need: https://arxiv.org/pdf/1706.03762.pdf
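Since "Attention is all you need" keeps coming up: the core operation there is scaled dot-product attention. Here's a one-function NumPy sketch for comparison with the additive version above; shapes and names are my own illustrative choices, not a reference implementation.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Toy scaled dot-product attention.
    Q: (n_q, d_k) queries, K: (n_k, d_k) keys, V: (n_k, d_v) values."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])          # (n_q, n_k)
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # (n_q, d_v)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    out = scaled_dot_product_attention(rng.normal(size=(3, 4)),
                                       rng.normal(size=(5, 4)),
                                       rng.normal(size=(5, 2)))
    print(out.shape)  # (3, 2): one attended value per query
```

The main difference from the additive form is that the score is a (scaled) dot product rather than a small learned MLP, which makes it cheap to batch as matrix multiplies.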
It still seems to me like Cho's survey paper ("Describing Multimedia Content using Attention-based Encoder–Decoder Networks") would be the best starting point. Any comments?
SGTM. Do you have an alternative link to the pdf? I'm getting a 403 error.
NM, looks like it was a temporary arXiv hiccup. @justinsalamon can you send the paper out to the group?
Attention papers? @justinsalamon