Kyubyong / transformer

A TensorFlow Implementation of the Transformer: Attention Is All You Need
Apache License 2.0
4.28k stars 1.3k forks source link

Question about model.py: is something wrong between the encoder and decoder? #163

Open glorymu opened 4 years ago

glorymu commented 4 years ago

In model.py, def train():

line 140: memory, sents1, src_masks = self.encode(xs)
line 141: logits, preds, y, sents2 = self.decode(ys, memory, src_masks)

We know that memory is the output of the encoder's last block, but the author sends this output directly into the decoder, so every block in the decoder uses that same final memory as K and V. Obviously this is wrong: we should collect every encoder block's output into a list, then feed each one to the corresponding decoder block as its memory. Friends, can anyone tell me, am I right?
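For context, the design the asker is questioning is the one in the original paper: every decoder block cross-attends to the same final encoder output. A minimal NumPy sketch of that structure (not the repo's actual code; single head, no masking or feed-forward sublayers, purely for illustration):

```python
import numpy as np

def attention(q, k, v):
    # Scaled dot-product attention (single head, no masking, illustration only).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def decode(ys, memory, num_blocks=6):
    # As in "Attention Is All You Need": EVERY decoder block cross-attends
    # to the same `memory` (the final encoder output), not to per-block
    # encoder outputs.
    dec = ys
    for _ in range(num_blocks):
        dec = attention(dec, dec, dec)        # self-attention over decoder states
        dec = attention(dec, memory, memory)  # cross-attention: K, V = final encoder memory
        # residual connections, layer norm, and feed-forward omitted for brevity
    return dec
```

So reusing the same memory in every decoder block is not a bug; it matches the paper. Feeding encoder block i's output to decoder block i would be a different (non-standard) architecture.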

GuoshenLi commented 3 years ago

............. omg you are totally wrong.... the author is right.

glorymu commented 3 years ago

[Translated from Chinese] Yes, I know. I later tried both approaches: the author's version gives somewhat better results, while my approach is faster.
