kracwarlock / action-recognition-visual-attention

Action recognition using soft attention based deep recurrent neural networks
http://www.cs.toronto.edu/~shikhar/projects/action-recognition-attention

Multi-layer LSTM #14

Open pmorerio opened 8 years ago

pmorerio commented 8 years ago

Hi @kracwarlock, sorry to bother you again. I'm opening another issue here related to your comment:

> I see that I did not release the multi-layer LSTM code. I will try to do that as soon as I have time. Till then this is how it is done: https://github.com/kelvinxu/arctic-captions/blob/master/capgen.py#L542-L548. In the paper the X means the feature of a single sample. In the code everything is done on a batch.

I tried the way you suggested, but soon realized that it cannot work: by simply replicating the lstm_cond_layer, you also replicate the theano.scan that iterates over n_steps. It seems to me that this prevents the upper LSTM layer from providing the location softmax to the lower one at each time step. I reckon the multiple layers should be implemented inside a single theano.scan instance, roughly as in the sketch below.
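For concreteness, here is a minimal sketch of what I mean (not your original code; names and shapes like `n_loc`, `n_feat`, `dim1`, `dim2`, `W_att` are just placeholders): both layers' recurrences live inside one scan step, so the location softmax computed from the upper layer's hidden state at step t can weight the features fed to the lower layer at step t+1.

```python
import numpy as np
import theano
import theano.tensor as T

floatX = theano.config.floatX
rng = np.random.RandomState(0)

def shared(shape, name):
    return theano.shared(0.01 * rng.randn(*shape).astype(floatX), name=name)

n_loc, n_feat = 49, 1024      # e.g. 7x7 spatial locations of a conv feature map
dim1, dim2 = 512, 512         # hidden sizes of the lower / upper LSTM layers

# one weight block per layer, gates concatenated as (i, f, o, g)
W1, U1, b1 = shared((n_feat, 4 * dim1), 'W1'), shared((dim1, 4 * dim1), 'U1'), shared((4 * dim1,), 'b1')
W2, U2, b2 = shared((dim1, 4 * dim2), 'W2'), shared((dim2, 4 * dim2), 'U2'), shared((4 * dim2,), 'b2')
W_att = shared((dim2, n_loc), 'W_att')   # location softmax comes from the *upper* layer

def lstm(x, h_prev, c_prev, W, U, b, dim):
    # plain LSTM step on a batch
    pre = T.dot(x, W) + T.dot(h_prev, U) + b
    i = T.nnet.sigmoid(pre[:, 0 * dim:1 * dim])
    f = T.nnet.sigmoid(pre[:, 1 * dim:2 * dim])
    o = T.nnet.sigmoid(pre[:, 2 * dim:3 * dim])
    g = T.tanh(pre[:, 3 * dim:4 * dim])
    c = f * c_prev + i * g
    return o * T.tanh(c), c

def step(feats_t, h1, c1, h2, c2, alpha):
    # feats_t: (batch, n_loc, n_feat); alpha: (batch, n_loc) from the previous step
    x_t = (feats_t * alpha.dimshuffle(0, 1, 'x')).sum(axis=1)   # attended feature
    h1, c1 = lstm(x_t, h1, c1, W1, U1, b1, dim1)                # lower layer
    h2, c2 = lstm(h1, h2, c2, W2, U2, b2, dim2)                 # upper layer
    alpha = T.nnet.softmax(T.dot(h2, W_att))                    # softmax for the next step
    return h1, c1, h2, c2, alpha

feats = T.tensor4('feats')   # (n_steps, batch, n_loc, n_feat)
batch = feats.shape[1]
zeros = lambda d: T.alloc(np.asarray(0., dtype=floatX), batch, d)
alpha0 = T.alloc(np.asarray(1. / n_loc, dtype=floatX), batch, n_loc)

# a single scan drives both layers, so the attention feedback loop closes
(h1s, c1s, h2s, c2s, alphas), _ = theano.scan(
    step, sequences=feats,
    outputs_info=[zeros(dim1), zeros(dim1), zeros(dim2), zeros(dim2), alpha0],
    name='two_layer_attention_lstm')
```

With two separate lstm_cond_layer calls, each layer gets its own scan, and I don't see how the second scan's alpha can reach the first one within the same time step.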

Could you please either comment on that or provide the original code of the multilayer LSTM? Thanks in advance!