Swall0w / papers

This is a repository for summarizing papers, especially those related to machine learning.

You May Not Need Attention #705

Open · Swall0w opened 5 years ago

Swall0w commented 5 years ago

Ofir Press, Noah A. Smith

In NMT, how far can we get without attention and without separate encoding and decoding? To answer that question, we introduce a recurrent neural translation model that does not use attention and does not have a separate encoder and decoder. Our eager translation model is low-latency, writing target tokens as soon as it reads the first source token, and uses constant memory during decoding. It performs on par with the standard attention-based model of Bahdanau et al. (2014), and better on long sentences.

https://arxiv.org/abs/1810.13409
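
A minimal sketch of the eager, attention-free setup described in the abstract: a single recurrent model that reads one source token and writes one target token per step, with no separate encoder/decoder and constant memory during decoding. The class name, layer sizes, and input layout below are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

class EagerTranslator(nn.Module):
    """Single shared RNN: no attention, no separate encoder and decoder."""

    def __init__(self, src_vocab, tgt_vocab, emb_dim=256, hid_dim=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        # One recurrent cell processes source and target jointly, step by step.
        self.rnn = nn.GRUCell(2 * emb_dim, hid_dim)
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, src_tokens, prev_tgt_tokens):
        # src_tokens, prev_tgt_tokens: (batch, seq_len), aligned per step.
        batch, seq_len = src_tokens.shape
        h = src_tokens.new_zeros(batch, self.rnn.hidden_size, dtype=torch.float)
        logits = []
        for t in range(seq_len):
            # At each step the cell sees the current source token and the
            # previously emitted target token, then predicts the next target,
            # so output can begin as soon as the first source token is read.
            x = torch.cat([self.src_emb(src_tokens[:, t]),
                           self.tgt_emb(prev_tgt_tokens[:, t])], dim=-1)
            h = self.rnn(x, h)
            logits.append(self.out(h))
        return torch.stack(logits, dim=1)  # (batch, seq_len, tgt_vocab)

# Usage with toy shapes (vocabulary sizes are arbitrary):
model = EagerTranslator(src_vocab=8000, tgt_vocab=8000)
src = torch.randint(0, 8000, (2, 10))
prev_tgt = torch.randint(0, 8000, (2, 10))
print(model(src, prev_tgt).shape)  # torch.Size([2, 10, 8000])
```

The paper additionally preprocesses the training data so that source and target sequences are aligned step by step (padding the target to cope with reordering); that preprocessing is omitted in this sketch.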

Swall0w commented 5 years ago

https://github.com/ofirpress/YouMayNotNeedAttention