snipsco / ntm-lasagne

Neural Turing Machines library in Theano with Lasagne
https://medium.com/snips-ai/ntm-lasagne-a-library-for-neural-turing-machines-in-lasagne-2cdce6837315#.63t84s5r5
MIT License

Trouble with Updates in Theano Function #28

Closed: terryrabinowitz closed this issue 8 years ago

terryrabinowitz commented 8 years ago

Hello there! Thank you for the code. I was hoping you could assist me: I am getting an error that results from this line in my code:

train_model = th.function(
    inputs=[index],
    outputs=[train_loss],
    updates=train_updates,
    givens={
        input_var: trainInput[index * batch_size: (index + 1) * batch_size],
        target_var: trainLabels[index * batch_size: (index + 1) * batch_size]
    })

When I skip the NTM layer in my model, there are no problems and everything works fine. In addition, when I remove "updates = train_updates" from the above line, the errors go away.

I have used NTM successfully in the past, but not since I changed my data to Theano shared variables and started using the givens=... part of the Theano function.
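For readers following along, the slicing inside givens can be illustrated without Theano. This is only a minimal sketch in plain Python: `batch_slice` and `train_input` are hypothetical names, and a list stands in for the shared variable trainInput.

```python
def batch_slice(index, batch_size):
    """Return the slice that selects minibatch number `index`."""
    return slice(index * batch_size, (index + 1) * batch_size)


# Hypothetical stand-in for the shared variable trainInput: 10 samples.
train_input = list(range(10))
batch_size = 4

# Batches 0..2 cover the data; the last one is shorter than batch_size.
batches = [train_input[batch_slice(i, batch_size)] for i in range(3)]
print(batches)

# An index past the last batch gives an empty slice, not an error.
print(train_input[batch_slice(3, batch_size)])
```

With Python lists, an out-of-range batch index silently yields an empty batch, so it may be worth checking the range of index values and the sizes of the data arrays when a shape full of zeros shows up.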

Thank you. I can provide the error output if requested. Terry

tristandeleu commented 8 years ago

Hello Terry, I tried to reproduce this issue but it seems to work on my end (though it does take longer to compile, so that might be related to #24). In any case, I think it could be helpful if you could post the full error output.

terryrabinowitz commented 8 years ago

Good morning and thanks for the reply.

Here is the error output: error_out.txt

And if it makes things easier, here is my script: deep_FANTOM.txt

Terry

tristandeleu commented 8 years ago

Unfortunately, I was not able to reproduce this error with your code. As I don't have access to your data files, I used arbitrary data and tensor shapes, and it works fine. The error mentions a TensorType with shape (0, 0, 0), so the problem may come from the way you create trainInput and trainLabels.

...'Non-unit value on shape on a broadcastable dimension.', (0, 0, 0),
   (False, True, False), 'Container name "None"')
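As an illustration of what this message is complaining about: in Theano, a dimension flagged as broadcastable must have length exactly 1. The sketch below is not Theano's actual check, just a plain-Python approximation of the rule; `check_broadcast_pattern` is a hypothetical helper.

```python
def check_broadcast_pattern(shape, broadcastable):
    """Raise ValueError if a dimension flagged broadcastable is not length 1."""
    if len(shape) != len(broadcastable):
        raise ValueError("rank mismatch: %r vs %r" % (shape, broadcastable))
    for axis, (size, flag) in enumerate(zip(shape, broadcastable)):
        if flag and size != 1:
            raise ValueError(
                "axis %d is broadcastable but has length %d" % (axis, size))
    return True


# A value of shape (5, 1, 8) fits the pattern (False, True, False).
print(check_broadcast_pattern((5, 1, 8), (False, True, False)))

# The shape from the error output does not: axis 1 has length 0, not 1.
try:
    check_broadcast_pattern((0, 0, 0), (False, True, False))
except ValueError as err:
    print("rejected:", err)
```

Under this rule, a value of shape (0, 0, 0) is rejected by the pattern (False, True, False), which matches the pair reported in the error.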

However, I think there is an error in the way you define the model (though I am not sure the original issue is related to it). Once you apply the ReshapeLayer after the NTMLayer, the corresponding tensor will have shape (batch_size * seq_len, num_memory). After the multiple DenseLayer and DropoutLayer instances, your output will have shape (batch_size * seq_len, cell_len). That means your first dimension no longer matches batch_size. One way to solve this problem could be to add a ReshapeLayer after l_out:

l_out = layers.ReshapeLayer(l_out, (batch_size, seq_len, cell_len))
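The shape bookkeeping behind this suggestion can be traced by hand. The following is a rough sketch in plain Python that only manipulates shape tuples; it does not use Lasagne, the helper names are hypothetical, and the sizes (16, 20, 128, 10) are made up for illustration.

```python
def flatten_time(shape):
    """(batch_size, seq_len, features) -> (batch_size * seq_len, features)."""
    batch_size, seq_len, features = shape
    return (batch_size * seq_len, features)


def dense(shape, num_units):
    """A dense layer on a 2D input only changes the last dimension."""
    return (shape[0], num_units)


def unflatten_time(shape, batch_size, seq_len):
    """(batch_size * seq_len, features) -> (batch_size, seq_len, features)."""
    rows, features = shape
    if rows != batch_size * seq_len:
        raise ValueError("cannot split %d rows into %d x %d"
                         % (rows, batch_size, seq_len))
    return (batch_size, seq_len, features)


# Assumed sizes, for illustration only.
batch_size, seq_len, num_memory, cell_len = 16, 20, 128, 10

flat = flatten_time((batch_size, seq_len, num_memory))
out = dense(flat, cell_len)
restored = unflatten_time(out, batch_size, seq_len)
print(flat, out, restored)
```

The intermediate shape has batch_size * seq_len rows, so anything downstream that expects batch_size rows (for example the targets) needs the final reshape to line up again.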

This, of course, depends on the task you want to solve. The fix above is for a sequence-input, sequence-output task, like the copy-task.py example. If you want sequence input and scalar output (which may be what you were looking for), have a look at the only_return_final argument in NTMLayer. Here is a (very minimal) example that works on my end: https://gist.github.com/tristandeleu/ab4b815d184a8dca273e2f66e3f98687

terryrabinowitz commented 8 years ago

I did see that reshaping error and removed the line, but the original error was still there when I tried again.

I will keep looking and try it with different data as soon as I am able.

Thanks for your help and I will take a look at your sample code, Terry

terryrabinowitz commented 8 years ago

Hello again. This is embarrassing but I may have found the error and I wanted to run it by you before I start making major changes:

Does NTM require python 2.7.8 exactly or >= 2.7.8?

The computing cluster I am using recently upgraded to python 2.7.12 and I am getting the same error I describe above when I try using your examples.

tristandeleu commented 8 years ago

Oh that's right, I originally tested the above script in python 2.7.10. I tried to run it on python 2.7.12 with the latest versions of Theano and Lasagne, but it works as well.

Do you still get the error when you replace the NTMLayer by another recurrent layer (say LSTMLayer, with the same number of hidden units)?

terryrabinowitz commented 8 years ago

I just got NTM to work! I reinstalled Theano and Lasagne, but instead of using the bleeding edge versions, I simply typed 'pip install theano' and 'pip install lasagne' and now the original errors are all gone.
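For anyone comparing environments, a quick way to report which versions are installed is sketched below. This assumes Python 3.8 or later (for `importlib.metadata`), which is newer than the Python 2.7 setup discussed in this thread; `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Print whatever is installed in the current environment.
for name in ("theano", "lasagne"):
    print(name, installed_version(name))
```

Posting this output alongside an error report makes it easier to tell a release install from a development install.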

tristandeleu commented 8 years ago

It is surprising that the bleeding edge versions are not working, but all in all I'm glad you got it sorted! :)