Open shamanez opened 7 years ago
This is indeed a missing feature and I'll try to add the TF dynamic padding feature. It should definitely help performance and speed up training. However, I don't think it will increase accuracy. The inputs are reversed, so the padding doesn't affect the encoder.
What do you mean by reversed inputs? Is that about the input sequence? I've read in a paper that by keeping fixed-size inputs we can lose some information on long sequences. If our original input is only a two-word sequence, here we have to add 8 padding symbols. Doesn't that highlight some unwanted information? Could the padding feed bad information into the model?
That may be true if you add the padding to the right of your original sequence (input - padding - decoder), but I don't see why it would be the case when the padding is added to the left (padding - input - decoder).
So it's like there will be no information passed through the padding when we arrange it as (In-Pd-De), because the 0 symbols don't pass anything? Yeah, then it's correct.
Will it increase accuracy, since padding for too long can reduce information?