Open athenspeterlong opened 8 years ago
Hello,
The mask variable is used because sentences in a minibatch have different lengths. Say the minibatch size is 3 and the sentence lengths (number of words) are 10, 6, and 3. The LSTM then has to run for 10 time steps because of the longest sentence. To handle the shorter sentences, I build a mask whose rows contain 10, 6, and 3 ones respectively and zeros elsewhere, and apply it to the result after LSTM encoding so the padded time steps are zeroed out. That is why it does not appear in the LSTM equations: it is a batching detail, not part of the cell itself. This masking technique is widely used in RNN applications that batch variable-length sequences.
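As a rough illustration of the idea (not the repository's actual code), here is a minimal NumPy sketch for the lengths 10, 6, and 3. The helper name `build_mask`, the hidden size of 4, and the random array standing in for the LSTM outputs are all assumptions made for this example.

```python
import numpy as np

def build_mask(lengths, max_len=None):
    """Return a (batch, max_len) float mask: 1.0 for real words, 0.0 for padding.

    `build_mask` is a hypothetical helper written for this example.
    """
    lengths = np.asarray(lengths)
    if max_len is None:
        max_len = int(lengths.max())
    # Compare each time-step index against each sentence length.
    steps = np.arange(max_len)[None, :]          # shape (1, max_len)
    return (steps < lengths[:, None]).astype(np.float32)

lengths = [10, 6, 3]
mask = build_mask(lengths)                       # shape (3, 10)

# Stand-in for the LSTM outputs, shape (batch, time, hidden).
# The hidden size of 4 is arbitrary and chosen only for illustration.
rng = np.random.default_rng(0)
hidden = rng.standard_normal((3, 10, 4)).astype(np.float32)

# Zero out the outputs at padded time steps by broadcasting the mask
# over the hidden dimension.
masked_hidden = hidden * mask[:, :, None]

print(mask)
```

Each row of `mask` starts with as many ones as the sentence has words, so multiplying by it leaves the real time steps untouched and zeroes the rest.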
Thank you. -Taeksoo
Thank you! That was so helpful.
Hello,
Great job implementing the paper, and thanks! However, I have a question about the 'mask' variable: what is it used for in the LSTM? I do not see any related variable in the LSTM equations.
Thanks for your help!