[1706.03762] Attention Is All You Need

ftnext / MLPaperSummary

Paper summaries collected as Issues (in the spirit of arXivTimes)


[1706.03762] Attention Is All You Need #12

Open ftnext opened 1 year ago

ftnext commented 1 year ago

https://scrapbox.io/nikkie-memos/Attention_Is_All_You_Need

Summary

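The paper extends the Attention mechanism previously used in machine translation tasks and proposes a model architecture called the Transformer. Because the Transformer uses no RNN or CNN layers, its computation can be parallelized (ref: 『ゼロから作るDeep Learning ❷』)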

Multi-Head Attention (Scaled dot-product attention, computed once per head)
Self Attention (Query, Key, and Value all come from the same input)
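
As a rough illustration of the two items above, here is a minimal sketch in plain Python (lists only, no external libraries). The function names `scaled_dot_product_attention` and `multi_head_self_attention` are made up for this note, not taken from the paper's code. One deliberate simplification: the paper applies learned projection matrices to Query, Key, and Value for each head and to the concatenated output; this sketch omits them and simply slices the feature dimension into heads, so it shows only the data flow.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def scaled_dot_product_attention(q, k, v):
    """softmax(Q K^T / sqrt(d_k)) V; returns (output, attention weights)."""
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


def multi_head_self_attention(x, num_heads):
    """Simplified multi-head self-attention.

    Assumption for illustration: no learned projections. Each head just
    sees its own slice of the feature dimension.
    """
    d_model = len(x[0])
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads

    head_outputs = []
    for h in range(num_heads):
        part = [row[h * d_head:(h + 1) * d_head] for row in x]
        # Self-attention: Query, Key, and Value are the same input.
        out, _ = scaled_dot_product_attention(part, part, part)
        head_outputs.append(out)

    # Concatenate the heads back along the feature dimension.
    result = []
    for i in range(len(x)):
        row = []
        for head in head_outputs:
            row.extend(head[i])
        result.append(row)
    return result


# Three tokens with d_model = 4, split into 2 heads of size 2.
x = [[1.0, 0.0, 1.0, 0.0],
     [0.0, 1.0, 0.0, 1.0],
     [1.0, 1.0, 0.0, 0.0]]
y = multi_head_self_attention(x, num_heads=2)
```

The output `y` has the same shape as `x` (3 tokens, 4 features). Every token's output is computed independently of the other tokens' outputs, which is what makes the computation easy to parallelize compared with an RNN.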

Figure 1: The Transformer, composed of an encoder and a decoder