terryrabinowitz closed this issue 8 years ago
Hello Terry, I tried to reproduce this issue but it seems to work on my end (though it does take longer to compile, so that might be related to #24). In any case, I think it would be helpful if you could post the full error output.
Good morning and thanks for the reply.
Here is the error output: error_out.txt
And if it makes things easier, here is my script: deep_FANTOM.txt
Terry
I was not able to reproduce this error with your code, unfortunately. As I don't have access to your data files, I used arbitrary data + tensor shapes, and it works fine. The error mentions a TensorType with shape (0, 0, 0):
...'Non-unit value on shape on a broadcastable dimension.', (0, 0, 0), (False, True, False), 'Container name "None"')

Maybe the problem comes from the way you create trainInput and trainLabels.
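A shape of (0, 0, 0) usually means the underlying numpy array is empty. As a sanity check, here is roughly what I mean; all shapes below are placeholders, not your actual dimensions:

import numpy as np
import theano

# Placeholder sizes; substitute the real dimensions of your data.
num_samples, seq_len, num_features = 100, 20, 8

# Build the data as non-empty float arrays before wrapping them in shared
# variables; an empty array here would produce a (0, 0, 0) shape like the
# one in the error.
train_input_np = np.random.rand(num_samples, seq_len, num_features).astype(theano.config.floatX)
train_labels_np = np.random.rand(num_samples, seq_len, num_features).astype(theano.config.floatX)
print(train_input_np.shape)  # should contain no zeros

trainInput = theano.shared(train_input_np, name='trainInput')
trainLabels = theano.shared(train_labels_np, name='trainLabels')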
However, I think there is an error in the way you define the model (though I am not sure the original issue is related to it). Once you apply the ReshapeLayer after the NTMLayer, the corresponding tensor will have shape (batch_size * seq_len, num_memory). After multiple DenseLayer & DropoutLayer, your output will have shape (batch_size * seq_len, cell_len). That means that your first dimension does not match batch_size anymore. One way to solve this problem could be to add a ReshapeLayer after l_out:

l_out = layers.ReshapeLayer(l_out, (batch_size, seq_len, cell_len))
This, of course, depends on the task you want to solve. This fix is for a sequence input-sequence output task, like the copy-task.py example. If you want to do sequence input-scalar output (which may be what you were looking for), have a look at the only_return_final argument in NTMLayer. Here is a (very minimal) example that works on my end: https://gist.github.com/tristandeleu/ab4b815d184a8dca273e2f66e3f98687
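To make the shapes explicit, here is a rough sketch of the output side of the model for the sequence-to-sequence case. All sizes are placeholders, and the InputLayer stands in for the output of your NTMLayer:

import lasagne.layers as layers
from lasagne.nonlinearities import sigmoid

# Placeholder sizes; substitute your real values.
batch_size, seq_len, num_units, cell_len = 16, 20, 100, 8

# Stand-in for the output of the NTMLayer: shape (batch_size, seq_len, num_units).
l_ntm = layers.InputLayer((batch_size, seq_len, num_units))

# Flatten time into the batch dimension before the dense layers.
l_flat = layers.ReshapeLayer(l_ntm, (-1, num_units))        # (batch_size * seq_len, num_units)
l_dense = layers.DenseLayer(l_flat, num_units=cell_len,
                            nonlinearity=sigmoid)            # (batch_size * seq_len, cell_len)

# Reshape back so the first dimension is batch_size again.
l_out = layers.ReshapeLayer(l_dense, (batch_size, seq_len, cell_len))
print(layers.get_output_shape(l_out))                        # (16, 20, 8)

# For sequence input -> single output, create the NTMLayer with
# only_return_final=True instead: its output is then (batch_size, num_units)
# and no final ReshapeLayer is needed.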
I did see that error with the reshaping and removed the line, but that had no effect on the original error when I tried again.
I will keep looking and try it with different data as soon as I am able.
Thanks for your help; I will take a look at your sample code. Terry
Hello again. This is embarrassing but I may have found the error and I wanted to run it by you before I start making major changes:
Does NTM require Python 2.7.8 exactly, or >= 2.7.8?
The computing cluster I am using recently upgraded to Python 2.7.12, and I am getting the same error I describe above when I try using your examples.
Oh that's right, I originally tested the above script on Python 2.7.10. I tried to run it on Python 2.7.12 with the latest versions of Theano and Lasagne, and it works there as well.
Do you still get the error when you replace the NTMLayer by another recurrent layer (say, LSTMLayer with the same number of hidden units)?
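For example, something along these lines, keeping everything else in the model identical (sizes below are placeholders):

import lasagne.layers as layers

# Placeholder sizes; use the same values as in your NTM setup.
batch_size, seq_len, num_features, num_units = 16, 20, 8, 100

l_input = layers.InputLayer((batch_size, seq_len, num_features))

# Drop this in where the NTMLayer currently sits, with the same number of
# hidden units, to check whether the error is specific to the NTM code.
l_recurrent = layers.LSTMLayer(l_input, num_units=num_units)
print(layers.get_output_shape(l_recurrent))  # (16, 20, 100)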
I just got NTM to work! I reinstalled Theano and Lasagne, but instead of using the bleeding edge versions, I simply typed 'pip install theano' and 'pip install lasagne' and now the original errors are all gone.
It is surprising that the bleeding-edge versions are not working, but all in all I'm glad you got it sorted! :)
Hello there! Thank you for the code; I was hoping you could assist me. I am getting an error resulting from this line in my code (the line below):
train_model = th.function(
    inputs=[index],
    outputs=[train_loss],
    updates=train_updates,
    givens={
        input_var: trainInput[index * batch_size: (index + 1) * batch_size],
        target_var: trainLabels[index * batch_size: (index + 1) * batch_size],
    },
)
When I skip the NTM layer in my model, there are no problems and everything works fine. In addition, when I remove "updates=train_updates" from the line above, the errors go away.
I have successfully used NTM in the past, but not since I changed my data to Theano shared variables and started using the givens=... part of the Theano function.
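Roughly, the setup looks like this (heavily simplified; the dummy model and shapes below are placeholders, not my actual script):

import numpy as np
import theano
import theano.tensor as T
import lasagne

# Data now lives in shared variables (placeholder shapes, not my real data).
trainInput = theano.shared(
    np.random.rand(160, 20, 8).astype(theano.config.floatX), name='trainInput')
trainLabels = theano.shared(
    np.random.rand(160, 20, 8).astype(theano.config.floatX), name='trainLabels')

index = T.iscalar('index')       # minibatch index used in the givens slicing above
input_var = T.tensor3('input')
target_var = T.tensor3('target')

# Dummy one-parameter "model" standing in for the real NTM network, just to
# show where train_loss and train_updates come from.
w = theano.shared(np.float32(1.0), name='w')
train_loss = T.mean((w * input_var - target_var) ** 2)
train_updates = lasagne.updates.adam(train_loss, [w])

# train_loss, train_updates, trainInput and trainLabels are then passed to
# th.function exactly as in the call quoted above.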
Thank you. I can provide the error output if requested. Terry