I am new to Keras and deep learning in general. I am trying to implement visual-attention-based image caption generation based on Xu et al. I have created a new class, AttentionLSTM, based on the existing LSTM class. I want to retrieve the value of one of the states (alpha, the weights of the feature vectors); however, whenever I access it (at the end of each batch), it always comes up as an all-zero tensor. My model is as follows:
To get the alpha value, I have defined the following function:
alphaz = aLstm_Layer.states[3]
alpha_func = K.function([x_inp, z_inp, z_mean], alphaz)
al = alpha_func(x_train)
print(al)
I am setting alpha to zero in reset_states() and get_initial_states().
Am I doing something wrong (with the model or the way I retrieve alpha)? Is there a better way to get the value of layer.states? (I am doing this because I don't know whether there is a way to make a layer give multiple outputs.)
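For reference, this is what I expect alpha to hold. It is a minimal sketch, assuming the deterministic soft attention variant from Xu et al.; the names soft_attention, features and scores are hypothetical and are not taken from my AttentionLSTM class. Since alpha is a softmax over per-location scores, a correctly retrieved value should be strictly positive and sum to 1, so an all-zero tensor cannot be a computed alpha and is consistent with reading back the initial value.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(features, scores):
    """Return (alpha, context) for a single timestep.

    features: list of L feature vectors, each a list of D floats
    scores:   list of L unnormalised attention scores (hypothetical input;
              in the real layer these would come from the previous hidden state)
    """
    alpha = softmax(scores)  # attention weights, one per feature location
    dim = len(features[0])
    # Context vector: alpha-weighted sum of the feature vectors.
    context = [sum(a * f[d] for a, f in zip(alpha, features)) for d in range(dim)]
    return alpha, context


# Toy data, purely illustrative.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [0.0, 0.0, 0.0]  # equal scores give uniform weights
alpha, context = soft_attention(features, scores)
print(alpha)
print(context)
```

Even with all scores equal to zero the weights come out uniform rather than zero, which is why the all-zero result above looks wrong to me.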
[x] Check that you are up-to-date with the master branch of Keras. You can update with:
pip install git+git://github.com/fchollet/keras.git --upgrade --no-deps
[x] If running on Theano, check that you are up-to-date with the master branch of Theano. You can update with:
pip install git+git://github.com/Theano/Theano.git --upgrade --no-deps
[x] Provide a link to a GitHub Gist of a Python script that can reproduce your issue (or just copy the script here if it is short).
My attention layer has the following code in its step function:
The above print statement always returns
b'CudaNdarray([[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]])'