Summary of Bahdanau Attention Characteristics

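| Dot Attention | Bahdanau Attention |
| --- | --- |
| $\mathrm{score}(s_{t},\ h_{i}) = s_{t}^{T} h_{i}$ | $\mathrm{score}(s_{t-1},\ H) = W_{a}^{T} \tanh(W_{b} s_{t-1} + W_{c} H)$ |
| The RNN cell's initial state is set to the encoder's last hidden state | 1) The context vector is concatenated with the input and fed to the RNN cell |
| 1) The context vector is concatenated with s_t, then the output layer is computed | 2) The resulting s_t is concatenated with the context vector again, then the output layer is computed |
| Attention score is computed from: s_t | Attention score is computed from: s_t-1 |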
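
The two score functions in the table can be sketched in a few lines. This is a minimal NumPy illustration rather than the PyTorch code the note refers to; the function names `dot_score` and `bahdanau_score` and the chosen array shapes are assumptions made for the example.

```python
import numpy as np

def dot_score(s_t, H):
    """Dot score s_t^T h_i for every encoder state h_i.

    s_t: (d,) current decoder state, H: (n, d) encoder states -> (n,)
    """
    return H @ s_t

def bahdanau_score(s_prev, H, W_a, W_b, W_c):
    """Additive score W_a^T tanh(W_b s_{t-1} + W_c h_i) for every h_i.

    s_prev: (d_s,) previous decoder state, H: (n, d_h) encoder states,
    W_b: (k, d_s), W_c: (k, d_h), W_a: (k,) -> (n,)
    """
    # W_b @ s_prev has shape (k,) and broadcasts over the n rows of H @ W_c.T
    projected = W_b @ s_prev + H @ W_c.T
    return np.tanh(projected) @ W_a
```

Note that the dot score needs the decoder and encoder states to share one dimension, whereas the additive form projects both into a common size `k` and so allows different dimensions.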

*) context vector == attention value == softmax(attention score) · Values
*) s_t == decoder output == decoder hidden state
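
To make the two notes concrete, here is a rough sketch of one decoder step for each variant, following the score formulas above: the dot variant updates the state first and scores with s_t, while the Bahdanau variant scores with s_t-1 and feeds the context vector into the cell. It assumes a plain tanh RNN cell as a stand-in for a GRU/LSTM, and every name in it (`attend`, `rnn_cell`, `dot_step`, `bahdanau_step`) is hypothetical, not taken from the referenced notebook.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()

def attend(scores, values):
    """context vector = softmax(attention score) · Values."""
    return softmax(scores) @ values

def rnn_cell(x, s_prev, W_x, W_s):
    # Assumption: a plain tanh cell standing in for a real GRU/LSTM cell.
    return np.tanh(W_x @ x + W_s @ s_prev)

def dot_step(x_t, s_prev, H, W_x, W_s):
    s_t = rnn_cell(x_t, s_prev, W_x, W_s)        # update the state first
    context = attend(H @ s_t, H)                 # score with s_t
    return s_t, np.concatenate([context, s_t])   # features for the output layer

def bahdanau_step(x_t, s_prev, H, W_a, W_b, W_c, W_x, W_s):
    scores = np.tanh(W_b @ s_prev + H @ W_c.T) @ W_a   # score with s_{t-1}
    context = attend(scores, H)
    # 1) the context vector is concatenated with the input before the cell
    s_t = rnn_cell(np.concatenate([context, x_t]), s_prev, W_x, W_s)
    # 2) the new s_t is concatenated with the context vector again
    return s_t, np.concatenate([context, s_t])
```

In both functions the second return value is what would be passed to the output layer; for the first step, `s_prev` would be the encoder's last hidden state, `H[-1]`, in the dot variant.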

For details, see "Pytorch 연습17".

vmtmxmf5 / Pytorch-