snipsco / ntm-lasagne

Neural Turing Machines library in Theano with Lasagne
https://medium.com/snips-ai/ntm-lasagne-a-library-for-neural-turing-machines-in-lasagne-2cdce6837315#.63t84s5r5
MIT License

Theano endless compilation #24

Closed opocaj92 closed 8 years ago

opocaj92 commented 8 years ago

Hi, I'm trying to use your library to test some simple NTMs. I'd like to start with your copy task, just to see how the library works, so I ran your code without modification on my Ubuntu 16.04 64-bit virtual machine (I'm using Python 2.7.11 and the latest versions of NumPy, Theano and Lasagne). The problem is that when Theano compiles the three functions defined before the training process, it takes forever. I started compiling them at 5:00pm, and at 4:00am the process still hadn't finished. Is it just a problem on my end, or am I doing something wrong?
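
For reference, here is a rough, self-contained sketch of the three-function compilation pattern I mean (this is not your code; the network, shapes and variable names are made up for illustration, only the function names match): a prediction function, a layer-output function, and a training function whose gradient graph is what takes so long to compile.

```python
import theano
import theano.tensor as T
import lasagne

# Toy stand-in network (NOT the NTM) just to show the three compilations.
input_var = T.matrix('input')
target_var = T.matrix('target')

l_in = lasagne.layers.InputLayer((None, 8), input_var=input_var)
l_hid = lasagne.layers.DenseLayer(l_in, num_units=16)
l_out = lasagne.layers.DenseLayer(l_hid, num_units=8,
                                  nonlinearity=lasagne.nonlinearities.sigmoid)

pred_var = lasagne.layers.get_output(l_out)
hid_var = lasagne.layers.get_output(l_hid)
loss = lasagne.objectives.squared_error(pred_var, target_var).mean()
params = lasagne.layers.get_all_params(l_out, trainable=True)
updates = lasagne.updates.adam(loss, params)

# The three compilations; train_fn is the slow one because of the gradient graph.
ntm_fn = theano.function([input_var], pred_var)
ntm_layer_fn = theano.function([input_var], hid_var)
train_fn = theano.function([input_var, target_var], loss, updates=updates)
```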

tristandeleu commented 8 years ago

It does take a long time to compile; on my laptop it takes a few minutes (which is already quite long compared to other models). However, I have never experienced anything like this before, unfortunately. I'm not sure if it is related to this particular issue, but I will try to replace the non-optimal operations (like nested scans) with their equivalents introduced in Theano 0.8. That should shave off a bit of compilation time.
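
To illustrate what I mean by a nested scan (a toy example, not the actual code from the library): an outer theano.scan whose step function itself calls theano.scan. Graphs built this way grow quickly once you differentiate through them, which is a big part of why compilation is slow.

```python
import numpy as np
import theano
import theano.tensor as T

X = T.matrix('X')

def inner_step(x_elem, acc):
    # Inner accumulator: running sum over the elements of one row.
    return acc + x_elem

def outer_step(row):
    # Inner scan nested inside the outer scan's step function.
    sums, _ = theano.scan(inner_step,
                          sequences=row,
                          outputs_info=np.asarray(0.0, dtype=theano.config.floatX))
    return sums[-1]

# Outer scan over the rows of X.
row_sums, _ = theano.scan(outer_step, sequences=X)
f = theano.function([X], row_sums)
```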

opocaj92 commented 8 years ago

I've just tried compiling some Theano code of my own to check whether there was a problem with my Theano setup, and it works normally. To run your example, I copied the file task-copy.py from the examples folder into the root ntm-lasagne folder of the cloned repo (so it can use the utils.* functions) and started it without any modification. But my VM slowed down a lot, the terminal window appeared to freeze, and nothing happened. I then tried breaking your example into parts and running them separately. I have no problem creating the Task object or building the NTM with your library + Lasagne (so it's not an installation problem), but as soon as I add one of the three lines that compile a Theano function, it stops working. Strange...

tristandeleu commented 8 years ago

What makes it particularly long to compile is the gradient computation. Maybe you could try compiling the prediction function alone (called ntm_fn in the example) to see if it works:

ntm_fn = theano.function([input_var], pred_var)

opocaj92 commented 8 years ago

Yes, I'm able to compile both ntm_fn and ntm_layer_fn, even in the same run (ntm_fn takes a few minutes, while ntm_layer_fn is quicker), so the problem must be the gradient computation in train_fn. I already used theano.tensor.grad() in the code I ran to test Theano this morning and it takes only about a minute to compile, so I know it works in simple cases (I was testing a small 3-layer autoencoder); maybe there are just too many parameters for my virtual machine to handle. But it's strange that you only need minutes while for me 8 hours weren't enough...
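
For reference, this is roughly the kind of sanity check I ran this morning (a simplified sketch, not my exact autoencoder code): compile a function containing a theano.tensor.grad() and time how long the compilation takes.

```python
import time
import numpy as np
import theano
import theano.tensor as T

# Simplified check: time the compilation of a function that includes a gradient,
# to confirm theano.tensor.grad() itself compiles in a reasonable time here.
x = T.matrix('x')
W = theano.shared(np.random.randn(64, 32).astype(theano.config.floatX), name='W')
reconstruction = T.tanh(T.tanh(x.dot(W)).dot(W.T))  # tiny tied-weight autoencoder
loss = T.mean((reconstruction - x) ** 2)
grad_W = T.grad(loss, W)

start = time.time()
train = theano.function([x], loss, updates=[(W, W - 0.01 * grad_W)])
print('Compilation took %.1f seconds' % (time.time() - start))
```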

opocaj92 commented 8 years ago

Ok, I solved the issue: it was simply memory usage. I left the compilation running for about 12 hours yesterday, and at the end I got an OSError: [Errno 12] Cannot allocate memory, so this morning I increased the RAM of my VM and now everything seems to work correctly. It only takes a few minutes, as you said. Thanks for the help!
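
In case it helps anyone else hitting this: before increasing the RAM, a quick way to confirm that memory is the bottleneck is to watch the peak resident set size of the compiling process (a small snippet of my own, not part of the repository).

```python
import resource

def report_peak_memory(tag=''):
    # ru_maxrss is reported in kilobytes on Linux; print the peak RSS so far.
    peak_mb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0
    print('%s peak RSS: %.1f MB' % (tag, peak_mb))

# Call it around the expensive compilation, e.g.:
# report_peak_memory('before train_fn')
# train_fn = theano.function([input_var, target_var], loss, updates=updates)
# report_peak_memory('after train_fn')
```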

tristandeleu commented 8 years ago

Indeed, it is quite memory intensive. I'm glad you found the cause! I will add a note about this to the README, thank you for pointing it out!