Closed zhangxiangnick closed 7 years ago
I see in the visualization that the q-value you are obtaining for each action is 'nan', and for this reason the action is always the same, the first one. The question is why you are obtaining nan... I can show you an example of visualization, you should see the estimated q-values related to each action, the final action selected, the mask of the region and the image region:
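As a side note, a quick numpy sketch of why all-NaN q-values always select the first action: np.argmax propagates NaN, so the index of the first NaN entry wins, and a fully-NaN vector yields action 0 every time (the six-action vector below is just an illustrative assumption):

```python
import numpy as np

# Hypothetical q-value vector in which every entry is NaN,
# as happens when the image descriptor itself contains NaNs.
qval = np.full(6, np.nan)

# np.argmax keeps the first NaN as the running maximum, because all
# subsequent comparisons against NaN are False. The chosen action is
# therefore always 0, i.e. the first action.
action = int(np.argmax(qval))
print(action)  # 0
```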
I see. You are right, this step gives nan
qval = model.predict(state.T, batch_size=1)
I found that np.sum(state) is nan for me, where state is the default initial state. Then I realized that in the get_state() function,
descriptor_image = get_conv_image_descriptor_for_image(image, model_vgg)
gives nan at some elements.
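In case it helps others debug the same thing, here is a small helper to locate the NaN entries in a descriptor array (the function name and the demo array are my own, not from the repo):

```python
import numpy as np

def find_nan_indices(descriptor):
    """Return the indices of all NaN entries in a feature array."""
    arr = np.asarray(descriptor, dtype=float)
    return np.argwhere(np.isnan(arr))

# Hypothetical 2x3 descriptor with a single bad entry.
demo = np.zeros((2, 3))
demo[1, 2] = np.nan
print(find_nan_indices(demo))  # [[1 2]]
```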
Maybe you should check whether you are loading the VGG16 weights properly. I downloaded them from the following source: https://gist.github.com/baraldilorenzo/07d7802847aaad0a35d3
Thanks @miriambellver ! After some investigation, I realized the problem is in this Theano function in get_conv_image_descriptor_for_image():
_convout1_f = K.function(inputs, [model.layers[31].output])
which produces nan.
To make it work, I switched to the VGG16 from keras.applications and selected the corresponding pool5 layer. The Theano function then no longer produces nan. This also requires some changes to requirements.txt.
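A minimal sketch of that switch, with some assumptions on my side: I use the tensorflow.keras import path, and weights=None just to keep the example light (the actual fix loads weights='imagenet'); 'block5_pool' is the pool5 layer name in keras.applications VGG16:

```python
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model

# Build VGG16 without the fully connected head. In the real fix,
# use weights='imagenet'; weights=None here avoids the download.
base = VGG16(weights=None, include_top=False, input_shape=(224, 224, 3))

# Extract the pool5 output, replacing the old model.layers[31] indexing.
pool5 = Model(inputs=base.input, outputs=base.get_layer('block5_pool').output)

# Dummy image just to check that the descriptor is finite.
image = np.random.rand(1, 224, 224, 3).astype('float32')
descriptor = pool5.predict(image)
print(descriptor.shape)            # (1, 7, 7, 512)
print(np.isnan(descriptor).any())  # False
```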
If other people have the same issues, I can post all the details.
I used the pre-trained image-zooms model and tested on the VOC2007 test set. It runs properly; however, it seems that the agent always takes the same action ...
example:
I used the default config in image_zooms_testing.py: