Hi @kracwarlock, sorry to bother you again. I am opening another issue here, related to your comment:
> I see that I did not release the multi-layer LSTM code. I will try to do that as soon as I have time. Till then this is how it is done https://github.com/kelvinxu/arctic-captions/blob/master/capgen.py#L542-L548. In the paper the X means the feature of a single sample. In the code everything is done on a batch.
I tried the way you suggested, but soon realized that it cannot work: by simply replicating `lstm_cond_layer`, you also replicate the `theano.scan` that iterates over `n_steps`. Each layer then runs through all time steps before the next layer starts, so it seems to me that the upper LSTM layer cannot provide the location softmax to the lower one at each time step. I reckon the multiple layers should be implemented inside a single `theano.scan` instance.
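To make the idea concrete, here is a minimal NumPy sketch of what I mean, not the original code. A plain Python loop over time stands in for the single `theano.scan`, and all layers are updated inside it, so the top layer's softmax from step t can weight the features fed to the bottom layer at step t+1. All names (`stacked_attention_lstm`, `lstm_cell`, `W_att`, the layout of `params`) are made up for illustration, and the uniform initial attention and zero initial states are my assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell(x, h, c, W, U, b):
    # One LSTM update; gates are stacked as [input, forget, output, candidate].
    d = h.shape[1]
    z = x @ W + h @ U + b
    i = sigmoid(z[:, :d])
    f = sigmoid(z[:, d:2 * d])
    o = sigmoid(z[:, 2 * d:3 * d])
    g = np.tanh(z[:, 3 * d:])
    c_new = f * c + i * g
    return o * np.tanh(c_new), c_new

def stacked_attention_lstm(features, params, W_att, n_steps):
    # features: (batch, n_locations, feat_dim)
    # params:   one (W, U, b) tuple per layer (hypothetical layout)
    # W_att:    (dim, n_locations), maps the top hidden state to location scores
    batch, n_loc, _ = features.shape
    dim = params[0][1].shape[0]
    h = [np.zeros((batch, dim)) for _ in params]
    c = [np.zeros((batch, dim)) for _ in params]
    alpha = np.full((batch, n_loc), 1.0 / n_loc)  # assumption: uniform attention at t=0
    alphas = []
    for t in range(n_steps):  # plays the role of the single theano.scan
        # Bottom-layer input: features weighted by the previous step's softmax.
        x = (alpha[:, :, None] * features).sum(axis=1)
        for l, (W, U, b) in enumerate(params):
            h[l], c[l] = lstm_cell(x, h[l], c[l], W, U, b)
            x = h[l]  # each layer feeds the one above it
        # The top layer produces the location softmax used at the next step.
        alpha = softmax(h[-1] @ W_att)
        alphas.append(alpha)
    return np.stack(alphas), h[-1]
```

In Theano terms, I would expect the body of the `for t` loop to become the step function passed to `theano.scan`, with every layer's `h` and `c` plus `alpha` carried through `outputs_info`.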
Could you please either comment on that or provide the original code of the multilayer LSTM?
Thanks in advance!