keon / seq2seq

Minimal Seq2Seq model with Attention for Neural Machine Translation in PyTorch

Why use relu to compute additive attention #28

Open yuboona opened 4 years ago

yuboona commented 4 years ago

1. Attention's formula

Standard additive attention: score = v * tanh(W * [hidden; encoder_outputs])
In this repo:                score = v * relu(W * [hidden; encoder_outputs])

2. Question

Is there some trick here, or is this the result of an experimental comparison?
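
For reference, a minimal sketch of the two variants in PyTorch (the class and parameter names here are illustrative, not taken from this repo's code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdditiveAttention(nn.Module):
    """Additive (Bahdanau-style) attention; use_relu swaps tanh for relu."""
    def __init__(self, hidden_dim, use_relu=False):
        super().__init__()
        self.attn = nn.Linear(hidden_dim * 2, hidden_dim)  # W
        self.v = nn.Linear(hidden_dim, 1, bias=False)       # v
        self.use_relu = use_relu

    def forward(self, hidden, encoder_outputs):
        # hidden:          [batch, hidden_dim]          (current decoder state)
        # encoder_outputs: [batch, src_len, hidden_dim]
        src_len = encoder_outputs.size(1)
        hidden = hidden.unsqueeze(1).expand(-1, src_len, -1)
        # W * [hidden; encoder_outputs]
        energy = self.attn(torch.cat((hidden, encoder_outputs), dim=2))
        # tanh in the original Bahdanau formulation, relu in this repo
        energy = F.relu(energy) if self.use_relu else torch.tanh(energy)
        score = self.v(energy).squeeze(2)                   # [batch, src_len]
        return F.softmax(score, dim=1)                      # attention weights
```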