rzcwade closed this issue 5 years ago
For the input of the prediction network there is no need to keep `blank`, so the input vector should be sized `vocab_size - 1`. Meanwhile, the output does include `blank`.
Just like `<eos>` in sequence-to-sequence training, whether you keep it or not doesn't matter.
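To make the asymmetry concrete, here is a minimal NumPy sketch (hypothetical shapes and a hypothetical `label_to_onehot` helper, not the repository's actual MXNet code): the one-hot input to the prediction network omits the blank symbol, while the joint network's output distribution includes it.

```python
import numpy as np

vocab_size = 5  # assumed to include blank; blank index assumed to be 0 here
blank = 0

def label_to_onehot(label):
    """Map a non-blank label (1 .. vocab_size-1) to a one-hot of size vocab_size-1.

    Blank is never fed back into the prediction network, so the input
    vector has no slot for it.
    """
    v = np.zeros(vocab_size - 1)
    v[label - 1] = 1.0
    return v

# First step of greedy decoding: no label emitted yet, so feed an all-zero
# vector -- this mirrors `y = mx.nd.zeros((1, 1, self.vocab_size-1))`.
y0 = np.zeros(vocab_size - 1)

# The joint network's output, by contrast, scores every symbol incl. blank.
logits = np.random.randn(vocab_size)
probs = np.exp(logits) / np.exp(logits).sum()

print(y0.shape)     # (4,) -> vocab_size - 1, blank excluded
print(probs.shape)  # (5,) -> vocab_size, blank included
```

During decoding, whenever the argmax over `probs` is blank, the decoder advances in time without updating the prediction-network input; only non-blank emissions get converted back to a one-hot and fed in.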
Thank you for clarifying this! It's all clear now :)
Hi @HawkAaron ,
I don't quite understand why you have `vocab_size - 1` in your model.py `greedy_decode` code, line 70: `y = mx.nd.zeros((1, 1, self.vocab_size-1)) # first zero vector`. Could you tell me what vocab entry you're excluding here?
Thanks!