-
-
## Paper link
https://arxiv.org/abs/1706.03762
## Publication date (yyyy/mm/dd)
2017/06/12
## Summary
When handling sequential data, the model uses only attention (and positional encoding) instead of recurrent or convolutional structures, which keeps training cost low while achieving high performance.
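To make the second ingredient concrete, here is a minimal NumPy sketch of the sinusoidal positional encoding described in the paper, which is added to the token embeddings before the first layer; the function name and arguments are illustrative, not the authors' code.

```python
import numpy as np

def sinusoidal_positional_encoding(max_len: int, d_model: int) -> np.ndarray:
    """Return a (max_len, d_model) matrix of sinusoidal encodings (d_model assumed even):
    PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
    PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))
    """
    positions = np.arange(max_len)[:, None]        # (max_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]       # (1, d_model / 2), the "2i" values
    angles = positions / np.power(10000.0, dims / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

# Usage (illustrative): the encoding is simply added to the token embeddings,
# x = token_embeddings + sinusoidal_positional_encoding(seq_len, d_model)
```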
-
**Abstract:**
> The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder. The best performing models also conn…
-
## 0. Paper
https://arxiv.org/abs/1706.03762
[Ashish Vaswani](https://arxiv.org/search/cs?searchtype=author&query=Vaswani%2C+A), [Noam Shazeer](https://arxiv.org/search/cs?searchtype=author&query=S…
-
## URL
https://arxiv.org/abs/1706.03762
## date
12/06/2017
## Abstract
* Achieves state-of-the-art results using only the attention mechanism, without RNNs or CNNs
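For reference, a minimal NumPy sketch of the scaled dot-product attention the mechanism is built on, Attention(Q, K, V) = softmax(QKᵀ / √d_k) V; the `mask` argument and function name are illustrative.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v, mask=None):
    """q: (..., seq_q, d_k), k: (..., seq_k, d_k), v: (..., seq_k, d_v).
    Returns softmax(q k^T / sqrt(d_k)) v; positions where mask is False are blocked."""
    d_k = q.shape[-1]
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d_k)   # (..., seq_q, seq_k)
    if mask is not None:
        scores = np.where(mask, scores, -1e9)            # blocked positions get ~ -inf
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v
```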
-
## In one sentence
A paper that achieves SOTA on translation without RNNs/CNNs. Propagation built on attention is the key: the input is created from word and position lookups; the encoder computes attention from the input plus the previous output and then propagates it position-wise; the decoder processes the encoder output plus its own previous output in the same way and produces the final output (a rough sketch of one encoder layer follows the figure below).
![image](https://user-images.githubusercont…
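To make the "compute attention, then propagate position-wise" flow concrete, below is a rough, self-contained NumPy sketch of a single encoder layer: multi-head self-attention followed by a position-wise feed-forward network, each with a residual connection and layer normalization. All weight names and shapes are illustrative assumptions, not the authors' code.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-6):
    return (x - x.mean(-1, keepdims=True)) / (x.std(-1, keepdims=True) + eps)

def encoder_layer(x, p, n_heads=8):
    """One Transformer encoder layer. x: (seq_len, d_model).
    p is an illustrative dict of weight matrices:
      w_q, w_k, w_v, w_o: (d_model, d_model); w_1: (d_model, d_ff); w_2: (d_ff, d_model)."""
    seq_len, d_model = x.shape
    d_head = d_model // n_heads

    # Multi-head self-attention: project, split into heads, attend, concatenate.
    def heads(t):  # (seq_len, d_model) -> (n_heads, seq_len, d_head)
        return t.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    q, k, v = heads(x @ p["w_q"]), heads(x @ p["w_k"]), heads(x @ p["w_v"])
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)       # (n_heads, seq_len, seq_len)
    attn = softmax(scores) @ v                                 # (n_heads, seq_len, d_head)
    attn = attn.transpose(1, 0, 2).reshape(seq_len, d_model) @ p["w_o"]
    x = layer_norm(x + attn)                                   # residual + layer norm

    # Position-wise feed-forward network, applied to each position independently.
    ffn = np.maximum(0, x @ p["w_1"]) @ p["w_2"]               # ReLU(x W1) W2
    return layer_norm(x + ffn)                                 # residual + layer norm
```

The decoder layer follows the same pattern, with a causal mask on its self-attention and an extra attention block over the encoder output.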
-
## Jiphyeonjeon Intermediate Study Group
- Sunday, June 19, 2022, 9:00
- Presented by 이영빈, 송이현, 원재성, 박세훈
- Paper link: https://arxiv.org/abs/1706.03762
> ### Abstract
> The dominant sequence transduction models are based on complex recurrent…
-
## 📝 Introduction
The Transformer, which currently forms the foundation of a wide range of NLP models.
## Why?
Because it is one of the most frequently covered models in NLP.
## Issue card
### 0. Tokenizer training
`NOTE`: In this repo, this step is assumed to have already been done.
### 1. D…
-
* https://arxiv.org/abs/1706.03762
* It is also worth reading together with the [BERT](https://github.com/chullhwan-song/Reading-Paper/issues/202) review (the two are nearly identical).
-
## What is it?
### Paper
[Attention Is All You Need](https://arxiv.org/pdf/1706.03762.pdf)
### Authors and affiliations
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kais…