marian-nmt / marian

Fast Neural Machine Translation in C++
https://marian-nmt.github.io

Padding batched inputs #327

Closed sshleifer closed 4 years ago

sshleifer commented 4 years ago

Do marian_decoder or marian_train use padding to give batched inputs a fixed shape? If so, which token/id is used to indicate padding?

frankseide commented 4 years ago

We use padding, but it is not represented by a specific token value; instead, a mask marks which positions are real.

emjotde commented 4 years ago

The masking generally ensures that padding symbols are not propagated. We do, however, fill the batch with the same id as the </s> (end-of-sentence) symbol, which is vocabulary dependent. </s> does not receive gradients at its padding positions during training.

sshleifer commented 4 years ago

Makes sense, thanks!