Closed. kobenaxie closed this issue 5 years ago.
We rely on the `tf.data.Dataset.padded_batch` method to do the padding. `padded_batch` is used here, in the function that creates the `Dataset` and the `Iterator`.
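For readers unfamiliar with padded batching, here is a minimal plain-Python sketch of the idea: each batch is right-padded to the length of its longest sequence. This is not the TensorFlow implementation and not the code from this repository; the function name `padded_batches` and the `pad_value` parameter are hypothetical, chosen only for illustration.

```python
def padded_batches(sequences, batch_size, pad_value=0):
    """Yield batches in which every sequence is right-padded
    to the length of the longest sequence in that batch."""
    for start in range(0, len(sequences), batch_size):
        batch = sequences[start:start + batch_size]
        # Padding length is decided per batch, not globally.
        max_len = max(len(seq) for seq in batch)
        yield [list(seq) + [pad_value] * (max_len - len(seq)) for seq in batch]


# Four variable-length sequences of ids, grouped two at a time.
example = [[1, 2, 3], [4], [5, 6], [7, 8, 9, 10]]
batches = list(padded_batches(example, batch_size=2))
```

Because the padding length is computed per batch, short sequences grouped together waste less space than they would if everything were padded to the global maximum.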
`MASK` and `END` are special tokens that I occasionally found in the predicted ids. I cannot remember exactly why they appeared there, but my best guess is the beam search decoder. We kept them in the unit dictionary just for convenience, so that the reverse lookup to graphemes would be error-free. In any case, these symbols are removed here before the prediction is written to a file.
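As a rough illustration of that clean-up step, the sketch below maps predicted ids back to units and drops `MASK` and `END` before joining the result. The dictionary contents, the ids, and the name `ids_to_text` are all assumptions made up for this example; the actual unit dictionary and removal code in the repository may look different.

```python
# Hypothetical unit dictionary: real ids and symbols will differ.
UNIT_DICT = {0: "MASK", 1: "END", 2: "a", 3: "b", 4: "c"}
SPECIAL_TOKENS = {"MASK", "END"}


def ids_to_text(predicted_ids, unit_dict=UNIT_DICT, special=SPECIAL_TOKENS):
    """Reverse-lookup ids to units, then drop the special tokens."""
    # Keeping MASK and END in the dictionary means this lookup never
    # raises KeyError, even if the decoder emits them.
    units = [unit_dict[i] for i in predicted_ids]
    return "".join(u for u in units if u not in special)


decoded = ids_to_text([2, 3, 0, 4, 1, 1])
```

Keeping the special tokens in the dictionary and filtering them afterwards keeps the lookup itself simple, at the cost of one extra pass over the units.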
Hi, I cannot find the code for padding. Could you point it out, please? Also, what is the meaning of [`MASK`, `END`] in the unit dict, and where are they used?