jazzsaxmafia / show_and_tell.tensorflow

BSD 2-Clause "Simplified" License

What is the mask variable used for? #6

Open athenspeterlong opened 8 years ago

athenspeterlong commented 8 years ago

Hello,

Great job implementing the paper, and thanks! However, I have a question about the 'mask' variable: what is it used for in the LSTM? I do not see any related variable in the LSTM equations.

Thanks for your help!

jazzsaxmafia commented 8 years ago

Hello,

The mask variable is needed because sentences differ in length. Say a minibatch holds 3 sentences whose lengths (numbers of words) are 10, 6, and 3. The LSTM then has to run for 10 time steps because of the longest sentence, so the two shorter sentences are processed past their last word. To handle that, I build a mask with one row per sentence, holding 10, 6, and 3 ones respectively and zeros elsewhere, and apply it to the result after LSTM encoding so the extra steps do not contribute. This masking technique is widely used in RNN applications.
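
A minimal sketch of the idea, using NumPy rather than the repository's actual TensorFlow code. The names `lengths`, `step_loss`, and `mean_loss` are hypothetical, and the random per-step losses only stand in for whatever the LSTM produces; applying the mask to a loss is one common choice, assumed here for illustration.

```python
import numpy as np

# Hypothetical minibatch: 3 sentences of lengths 10, 6 and 3.
lengths = np.array([10, 6, 3])
max_len = int(lengths.max())  # the LSTM is unrolled for this many steps

# mask[i, t] is 1.0 while step t falls inside sentence i, 0.0 on padding.
mask = (np.arange(max_len)[None, :] < lengths[:, None]).astype(np.float32)

# Stand-in for per-step losses from the LSTM, shape (batch, max_len).
rng = np.random.default_rng(0)
step_loss = rng.random((len(lengths), max_len)).astype(np.float32)

# Zero out the padded steps, then average over real words only.
masked_loss = step_loss * mask
mean_loss = masked_loss.sum() / mask.sum()

print(mask)
```

Dividing by `mask.sum()` instead of the full batch-by-steps size keeps the padded positions from diluting the average.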

Thank you. -Taeksoo

shaoxuan92 commented 7 years ago

Thank you! This was so helpful.