PANXiao1994 / mRASP2


src tokens fed twice into the encoder? #14

Closed Mao-KU closed 2 years ago

Mao-KU commented 2 years ago

Hi,

In the criterion script of constructing the contrastive loss, one line is: https://github.com/PANXiao1994/mRASP2/blob/36c17003dcd642affbe8290c8f26231fec77794a/mcolt/criterions/label_smoothed_cross_entropy_with_contrastive.py#L50

For this line, the flow is: [src tokens -> encoder] -> decoder -> output

another line: https://github.com/PANXiao1994/mRASP2/blob/36c17003dcd642affbe8290c8f26231fec77794a/mcolt/criterions/label_smoothed_cross_entropy_with_contrastive.py#L52

For this line, the flow is: src tokens -> encoder -> encoder output, which repeats the bracketed encoder pass above.

It seems the src tokens are fed into the encoder twice. Although the loss computation will still be correct, won't this reduce training efficiency?

Or anything I missed? Thank you in advance.

PANXiao1994 commented 2 years ago

The src tokens are indeed fed twice. This is done to stay compatible with the original fairseq code. In Eager mode the same encoder computation is actually performed twice, but in Graph mode the redundant part is automatically optimized away.
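The redundancy can be sketched with a minimal toy model (hypothetical classes, not the actual fairseq/mCOLT API): calling the full model and then calling the encoder again runs the encoder twice, while running the encoder once and passing its output to the decoder avoids the repeat. A call counter on the encoder makes the difference visible.

```python
class ToyEncoder:
    """Stand-in for a real encoder; counts forward passes to expose redundancy."""

    def __init__(self):
        self.calls = 0

    def __call__(self, src_tokens):
        self.calls += 1
        return [t * 2 for t in src_tokens]  # placeholder for real encoding


class ToyDecoder:
    """Stand-in for a real decoder."""

    def __call__(self, encoder_out):
        return sum(encoder_out)  # placeholder for real decoding


class ToyModel:
    def __init__(self):
        self.encoder = ToyEncoder()
        self.decoder = ToyDecoder()

    def forward(self, src_tokens):
        # [src tokens -> encoder] -> decoder, as in the first line of the criterion
        return self.decoder(self.encoder(src_tokens))


src = [1, 2, 3]

# Pattern in the criterion: full forward, then a separate encoder call.
# In eager execution the encoder really runs twice.
model_a = ToyModel()
net_output = model_a.forward(src)   # encoder call #1 (inside the full forward)
encoder_out = model_a.encoder(src)  # encoder call #2, same computation
print(model_a.encoder.calls)        # -> 2

# Reuse pattern: run the encoder once and share its output with the decoder.
model_b = ToyModel()
enc_out = model_b.encoder(src)      # single encoder call
net_output_2 = model_b.decoder(enc_out)
print(model_b.encoder.calls)        # -> 1
```

This mirrors the answer above: the two-call pattern is harmless for correctness, and a graph compiler can deduplicate the identical subgraph, but in eager mode the one-call pattern is what saves the extra forward pass.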