a1da4 / paper-survey

Summary of machine learning papers

Reading: Sequence to Sequence Learning with Neural Networks #57

Open a1da4 opened 4 years ago

a1da4 commented 4 years ago

0. Paper

@incollection{NIPS2014_5346, title = {Sequence to Sequence Learning with Neural Networks}, author = {Sutskever, Ilya and Vinyals, Oriol and Le, Quoc V}, booktitle = {Advances in Neural Information Processing Systems 27}, editor = {Z. Ghahramani and M. Welling and C. Cortes and N. D. Lawrence and K. Q. Weinberger}, pages = {3104--3112}, year = {2014}, publisher = {Curran Associates, Inc.}, url = {http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf} } [link]

1. What is it?

They proposed an efficient way to apply sequence-to-sequence (seq2seq) learning to the machine translation task.

2. What is amazing compared to previous studies?

They used two LSTMs, one as the encoder and one as the decoder, stacked four layers deep, and found that reversing the word order of the source sentences markedly improved translation quality.

3. What are the key technologies and techniques?

They want to model the probability of generating a target sentence y from a source sentence x:

    p(y_1, ..., y_{T'} | x_1, ..., x_T) = ∏_{t=1}^{T'} p(y_t | v, y_1, ..., y_{t-1})

To compute this probability, they use a seq2seq model (which is also a Recurrent Neural Network, RNN). Here v is the hidden state of the encoder LSTM after it has read the last word of the source sentence x. The decoder LSTM then computes each hidden state h_t from h_{t-1} and the previously generated word y_{t-1}, and each term p(y_t | v, y_1, ..., y_{t-1}) is obtained by a softmax over the vocabulary.
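The factorization above can be sketched in plain Python. This is a hypothetical toy stand-in, not the paper's LSTM: `encode` and `decoder_prob` are placeholder functions (here a length summary and a uniform distribution) so the chain-rule computation stays runnable without a neural network library.

```python
import math

def encode(source_tokens):
    # Stand-in for the encoder LSTM: a real model would return the final
    # hidden state v; here we just return the source length as a dummy "v".
    return len(source_tokens)

def decoder_prob(word, v, prefix):
    # Stand-in for p(y_t | v, y_1..y_{t-1}): a real decoder would compute
    # a softmax over the vocabulary from the LSTM hidden state h_t.
    vocab = ["a", "b", "c", "</s>"]
    return 1.0 / len(vocab)  # toy uniform distribution

def sequence_log_prob(source_tokens, target_tokens):
    # Chain rule: log p(y | x) = sum_t log p(y_t | v, y_1..y_{t-1}).
    v = encode(source_tokens)
    total, prefix = 0.0, []
    for y_t in target_tokens:
        total += math.log(decoder_prob(y_t, v, prefix))
        prefix.append(y_t)
    return total

# Three uniform steps over a 4-word vocabulary: 3 * log(1/4)
print(sequence_log_prob(["x1", "x2"], ["a", "b", "</s>"]))
```

Generation stops when the decoder emits the end-of-sentence token `</s>`, which is why the target sequence includes it as its final symbol.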

To train this model, the objective is to maximize the average log-probability of the correct translation T given the source sentence S over the training set 𝒮:

    1/|𝒮| ∑_{(T,S)∈𝒮} log p(T | S)

T is a target sentence and S is a source sentence; the set of such (T, S) pairs is called a parallel corpus.
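The training objective can be sketched directly from this definition. Everything here is a toy assumption: `log_p` is a hypothetical stand-in for the model's sequence log-probability, and `corpus` is a made-up two-pair parallel corpus.

```python
import math

def log_p(target, source):
    # Hypothetical model: assigns each target token a fixed
    # probability of 1/4, so log p(T|S) = len(T) * log(1/4).
    return len(target) * math.log(0.25)

# Toy parallel corpus of (source, target) sentence pairs.
corpus = [(["le", "chat"], ["the", "cat"]),
          (["un", "chien"], ["a", "dog"])]

# Training objective: average log-probability of the correct
# translation over the parallel corpus, 1/|S| * sum log p(T|S).
objective = sum(log_p(t, s) for s, t in corpus) / len(corpus)
print(objective)
```

In practice this quantity is maximized with stochastic gradient descent over minibatches of sentence pairs rather than computed over the whole corpus at once.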

4. How did they validate it?

They evaluated translation quality on the WMT'14 English-to-French dataset. [Results table image in the original comment.]

5. Is there a discussion?

6. Which paper should be read next?

Seq2seq with Attention

a1da4 commented 4 years ago

#58 Seq2seq with attention