jadore801120 / attention-is-all-you-need-pytorch

A PyTorch implementation of the Transformer model in "Attention is All You Need".
MIT License
8.78k stars, 1.97k forks

Why is decoding needed during inference? #160

Open rajeevbaalwan opened 4 years ago

rajeevbaalwan commented 4 years ago

I understand that masking is necessary in the decoder during training, to hide future tokens from it, but why do we need masking in the decoder during inference?
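
For context, here is a minimal sketch of what the causal mask changes when the decoder is re-run on a growing prefix, as happens at inference. It is a toy in plain Python, not this repository's code: `attention`, `decoder`, and `close` are hypothetical helpers, and the attention uses identity projections with a single head as a simplifying assumption. With the mask, the outputs for earlier positions stay the same when a new token is appended, which is the condition the model was trained under. Without it, earlier positions attend to the new token and their outputs shift.

```python
import math

def attention(x, causal):
    """Toy single-head self-attention with identity Q/K/V projections (assumption)."""
    n, d = len(x), len(x[0])
    out = []
    for i in range(n):
        # With a causal mask, position i may only attend to positions 0..i.
        last = i + 1 if causal else n
        scores = [sum(a * b for a, b in zip(x[i], x[j])) / math.sqrt(d)
                  for j in range(last)]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * x[j][k] for j, w in enumerate(weights))
                    for k in range(d)])
    return out

def decoder(x, causal, num_layers=2):
    """Stack of toy attention layers standing in for a decoder (hypothetical)."""
    for _ in range(num_layers):
        x = attention(x, causal)
    return x

def close(a, b, tol=1e-9):
    """Element-wise comparison of two lists of vectors."""
    return all(abs(p - q) < tol for r, s in zip(a, b) for p, q in zip(r, s))

# Made-up token embeddings: a 3-token prefix, then the same prefix plus one new token.
prefix = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
longer = prefix + [[2.0, -1.0]]

# Do the outputs for the first 3 positions survive appending a 4th token?
masked_stable = close(decoder(prefix, True), decoder(longer, True)[:3])
unmasked_stable = close(decoder(prefix, False), decoder(longer, False)[:3])
print(masked_stable, unmasked_stable)
```

In this sketch `masked_stable` is true and `unmasked_stable` is false. With more than one layer, the last position reads the hidden states of the earlier positions, so dropping the mask at inference would feed it inputs unlike those seen in training.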