marl/group_meetings: Notes and ideas for MARL group meetings

2018-02-06 Plan #3

Closed by bmcfee 6 years ago

bmcfee commented 6 years ago

Attention papers? @justinsalamon

justinsalamon commented 6 years ago

Some recommendations from @pli1988:

Good overview and references, plus it's short: http://ruder.io/deep-learning-nlp-best-practices/index.html#attention

More of a survey of attention... a bit old at this point, but good: https://arxiv.org/pdf/1507.01053.pdf

Next-level attention: "Attention Is All You Need" https://arxiv.org/abs/1706.03762

I think I have a couple more candidates lying around somewhere; I'll post them later today. It would then be good for those familiar with this literature to help guide the choice of which papers/blogs to focus on.

bmcfee commented 6 years ago

I think for a first run, either a survey or an early paper would be best. We can cover the more recent stuff later on.

mcartwright commented 6 years ago

I vote for Describing Multimedia Content using Attention-based Encoder–Decoder Networks

justinsalamon commented 6 years ago

There's also the "original" attention paper:

Neural Machine Translation by Jointly Learning to Align and Translate https://arxiv.org/abs/1409.0473
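
For anyone who hasn't looked at it yet, the core mechanism is tiny. Here's a rough NumPy sketch of the additive ("Bahdanau") scoring from that paper; the dimensions and the random weights are arbitrary toy values standing in for learned parameters, not the paper's actual setup:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy sizes, chosen arbitrarily for illustration
T, d_enc, d_dec, d_att = 5, 8, 8, 16
rng = np.random.default_rng(0)

H = rng.standard_normal((T, d_enc))  # encoder states h_1..h_T
s = rng.standard_normal(d_dec)       # previous decoder state s_{i-1}

# These would be learned parameters in the real model
W_s = rng.standard_normal((d_att, d_dec))
W_h = rng.standard_normal((d_att, d_enc))
v = rng.standard_normal(d_att)

# e_ij = v^T tanh(W_s s_{i-1} + W_h h_j), one score per encoder step
e = np.tanh(s @ W_s.T + H @ W_h.T) @ v  # shape (T,)
alpha = softmax(e)                      # attention weights, sum to 1
c = alpha @ H                           # context: weighted sum of encoder states
```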

Without having read the papers, I'm inclined to agree with @mcartwright that we should start with: "Describing Multimedia Content using Attention-based Encoder–Decoder Networks" https://arxiv.org/pdf/1507.01053.pdf

Unless there are any other contenders, shall we go with this paper?

justinsalamon commented 6 years ago

Just saw there's a bunch of attention paper suggestions on #2. Pasting them here. @bmcfee @mcartwright @pli1988, shall we stick to this issue for attention papers?

@bmcfee:

- Chiu and Raffel 17: https://arxiv.org/abs/1712.05382
- Mnih 14: http://papers.nips.cc/paper/5542-recurrent-models-of-visual-attention
- Gregor 15: https://arxiv.org/abs/1502.04623 (aka DRAW)
- Xu 15: http://proceedings.mlr.press/v37/xuc15.html (show-attend-tell)
- Arandjelovic 17: https://arxiv.org/abs/1705.08168 (look-listen-learn)

@pli1988:

- I really like this survey; it starts with seq2seq, which I think is important. https://arxiv.org/pdf/1507.01053.pdf
- This blog covers a bunch of different flavors of attention and is easy to read. http://ruder.io/deep-learning-nlp-best-practices/index.html#attention
- Attention Is All You Need: https://arxiv.org/pdf/1706.03762.pdf
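
Side note for contrast with the additive sketch above: the scaled dot-product scoring from "Attention Is All You Need" looks like this (again a toy NumPy sketch with made-up shapes, not code from the paper):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Toy shapes: n queries, m keys/values, key dimension d_k
n, m, d_k = 3, 5, 4
rng = np.random.default_rng(0)
Q = rng.standard_normal((n, d_k))
K = rng.standard_normal((m, d_k))
V = rng.standard_normal((m, d_k))

# Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
weights = softmax(Q @ K.T / np.sqrt(d_k))  # (n, m) weights over keys
output = weights @ V                       # (n, d_k) attended values
```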

It still seems to me that Cho's survey paper ("Describing Multimedia Content using Attention-based Encoder–Decoder Networks") would be the best starting point. Any comments?

bmcfee commented 6 years ago

SGTM. Do you have an alternative link to the PDF? I'm getting a 403 error.

bmcfee commented 6 years ago

NM, looks like it was a temporary arXiv hiccup. @justinsalamon can you send the paper out to the group?