short summary

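Proposes the Transformer, a model built only on attention mechanisms, with no RNNs or CNNs. It achieves SOTA on translation at a lower computational cost. To evaluate generalization to other tasks, the authors also ran constituency parsing, where a configuration nearly identical to the translation base model outperformed most previously reported models. Adaptation to other tasks looks promising.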
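For reference, the core operation in the paper is scaled dot-product attention, softmax(QK^T / sqrt(d_k))V. Below is a minimal pure-Python sketch of that formula only. The function names and the toy matrices are my own assumptions for illustration, not the authors' implementation, and it leaves out multi-head projections, masking, and positional encoding.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V.

    Q, K, V are matrices given as lists of rows (hypothetical toy format).
    Returns (output, weights), where weights[i][j] is how much query i
    attends to key j.
    """
    d_k = len(K[0])
    output, weights = [], []
    for q in Q:
        # Dot product of the query with every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K
        ]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value rows.
        output.append(
            [sum(wj * v[c] for wj, v in zip(w, V)) for c in range(len(V[0]))]
        )
    return output, weights


# Toy example: 2 queries attending over 3 key/value pairs.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
out, attn = scaled_dot_product_attention(Q, K, V)
```

Each row of `attn` sums to 1, so every output row is a convex combination of the rows of `V`. Every query can look at every position in one step, which is what lets the model drop recurrence.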

references

http://deeplearning.hatenablog.com/entry/transformer
https://www.slideshare.net/DeepLearningJP2016/dl-hacks-attention-is-all-you-need

URL

https://arxiv.org/pdf/1706.03762.pdf

author

Ashish Vaswani (Google Brain, avaswani@google.com)
Noam Shazeer (Google Brain, noam@google.com)
Niki Parmar (Google Research, nikip@google.com)
Jakob Uszkoreit (Google Research, usz@google.com)
Llion Jones (Google Research, llion@google.com)
Aidan N. Gomez (University of Toronto, aidan@cs.toronto.edu)
Łukasz Kaiser (Google Brain, lukaszkaiser@google.com)
Illia Polosukhin (illia.polosukhin@gmail.com)

year

2017

Attention Is All You Need #17
