nmt/utils/iterator_utils.py:
# Create a tgt_input prefixed with <sos> and a tgt_output suffixed with <eos>.
Why do we do this? Why split the desired output?
I know that:
- we use `target_output` only when computing the loss, and
- we use `target_input` as the input to `BasicDecoder`, through the embedding and `TrainingHelper`.
As I understand it, we want the decoded output sequence to start with `<sos>` and end with `<eos>`, and there is only one case where decoding stops on `<eos>`: when we use `BeamSearchDecoder` (does it stop right before this token?), which we don't use during training.
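To make sure I'm reading it right, here is a toy sketch of what I think the split produces and how the two tensors are used. This is plain Python with a made-up sentence, not the repo's actual `tf.data` pipeline:

```python
# Toy illustration of my understanding, not the repo's iterator_utils code.
sos, eos = "<sos>", "<eos>"
tgt = ["I", "am", "here"]        # the raw target sentence

tgt_input = [sos] + tgt          # fed to the decoder via embedding + TrainingHelper
tgt_output = tgt + [eos]         # compared against the decoder logits in the loss

# The two sequences are the same sentence shifted by one step, so at step t the
# decoder reads tgt_input[t] and is trained to predict tgt_output[t]:
#   tgt_input : <sos>  I    am    here
#   tgt_output:  I     am   here  <eos>
for inp, out in zip(tgt_input, tgt_output):
    print(f"{inp:>6} -> {out}")
```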
Also:
During training I never see the generated sequence contain `<eos>`, but it has `<sos>` tokens generated after roughly every sentence. I use `BeamSearchDecoder` during inference, and it never stops on `<eos>` tokens, so I quick-fixed it to stop on `<sos>` tokens instead, which is weird.
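For reference, this is roughly how I build the inference decoder. It's a minimal, self-contained sketch with made-up sizes and variable names rather than my real graph, but I expected `end_token=eos_id` here to make the beams finish once they emit `<eos>`:

```python
import tensorflow as tf  # TF 1.x, with tf.contrib available

vocab_size, emb_dim, num_units = 10, 8, 16
batch_size, beam_width, sos_id, eos_id = 2, 3, 1, 2

embedding = tf.get_variable("tgt_embedding", [vocab_size, emb_dim])
cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)

# The encoder state has to be tiled across beams; a zero state of the
# already-tiled size keeps this sketch self-contained.
initial_state = cell.zero_state(batch_size * beam_width, tf.float32)

decoder = tf.contrib.seq2seq.BeamSearchDecoder(
    cell=cell,
    embedding=embedding,
    start_tokens=tf.fill([batch_size], sos_id),
    end_token=eos_id,  # I expected beams to stop once they emit this id
    initial_state=initial_state,
    beam_width=beam_width,
    output_layer=tf.layers.Dense(vocab_size, use_bias=False))

outputs, _, lengths = tf.contrib.seq2seq.dynamic_decode(
    decoder, maximum_iterations=20)
predicted_ids = outputs.predicted_ids  # shape [batch, time, beam_width]
```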