Closed sshleifer closed 4 years ago
I think I answered 1 and 2: https://github.com/marian-nmt/marian/blob/master/src/models/transformer.h#L770 suggests that the decoder has the same data flow as the encoder, apart from cross-attention.
That is correct.
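For anyone landing here with the same question, here is a minimal sketch of that data flow in PyTorch. It is an illustration, not Marian's exact implementation: layer-norm placement, masking, and the layer's internals are assumptions (Marian supports both pre- and post-norm variants, for instance).

```python
import torch
import torch.nn as nn

class DecoderLayer(nn.Module):
    """Sketch of one transformer decoder layer: identical to an encoder
    layer except for the extra cross-attention block in the middle."""

    def __init__(self, d_model: int, n_heads: int, d_ff: int):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads)
        # Queries come from the decoder, keys/values from the encoder output.
        # This is the block where weights such as "context_Wq" would live
        # (as its query projection).
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.norm3 = nn.LayerNorm(d_model)

    def forward(self, x, encoder_out, self_mask=None):
        # Same flow as an encoder layer: (masked) self-attention + residual...
        h, _ = self.self_attn(x, x, x, attn_mask=self_mask)
        x = self.norm1(x + h)
        # ...plus the one extra step the encoder does not have:
        # attend over the encoder output.
        h, _ = self.cross_attn(x, encoder_out, encoder_out)
        x = self.norm2(x + h)
        # Feed-forward + residual, again identical to the encoder.
        return self.norm3(x + self.ffn(x))
```

So each decoder layer is encoder-like (self-attention + FFN), with one cross-attention block inserted between the two.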
I am new to C++ and trying to port some of the trained translation models to Python. I ran into a few questions: 1) Where in the code are parameters like `context_Wq` used? 2) Where is the forward pass of the model when `decoder` is called? 3) Is the decoder of the seq2seq model also BERT-like?