Open · abramhindle opened this issue 9 years ago
Yes, I wouldn't be surprised. theanets doesn't try to do any memory management at all, so it's up to Python/Theano to clean up things that have disappeared from the active set. There's probably a bunch that could be done within theanets to help with this, though.
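A minimal sketch (not part of theanets) of what helping the cleanup along could look like from the caller's side: drop every reference to a network you are done with and force a garbage collection, since Theano only releases a shared variable's GPU buffer once its Python-side object is collected. The `net.params` attribute used here is an assumption about how the network exposes its shared parameters; adjust to whatever the installed version actually provides.

```python
import gc

import numpy as np


def release_network(net):
    """Best-effort release of a network's GPU-backed parameters.

    Assumes `net.params` is a list of Theano shared variables.
    """
    for param in getattr(net, 'params', []):
        # Replace the large GPU buffer with an empty array so it can be
        # reclaimed even if something else still references the variable.
        param.set_value(np.zeros((0,) * param.ndim, dtype=param.dtype))
    gc.collect()
```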
Hi, I have a feeling that the layerwise optimizer, by creating numerous networks, is not freeing past networks and is using more GPU memory than it should. I'm having a heck of a time doing layerwise training.
With this network:
With the following pretraining:
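(The original network and pretraining snippets are not reproduced here. Purely for orientation, below is a hypothetical stand-in for that kind of setup, with invented layer sizes and random data, using the Experiment-style theanets API; theanets auto-names hidden layers hid1, hid2, hid3, which matches the layers mentioned in the error.)

```python
import numpy as np
import theanets

# Fake training/validation data standing in for the real examples.
train_data = np.random.randn(1000, 1024).astype('float32')
valid_data = np.random.randn(100, 1024).astype('float32')

exp = theanets.Experiment(
    theanets.Autoencoder,
    # Input layer, three hidden layers (auto-named hid1, hid2, hid3),
    # and an output layer matching the input size.
    layers=(1024, 512, 256, 512, 1024),
)

# Layerwise pretraining builds and trains a sequence of progressively
# deeper partial networks. Depending on the theanets version, the trainer
# is selected with algo='layerwise' or optimize='layerwise'.
exp.train(train_data, valid_data, algo='layerwise')
```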
I get the following error after training on layers hid1 and hid2; once it tries to train hid3, it borks at validation.
Yet if I just do regular training it works fine. It does use a lot of GPU memory; it's a big network and I have a lot of training examples.
My theory is that shared variables and the like are not being freed appropriately. I was looking at the code: new layers are being created, but I cannot tell how much sharing or copying is being done.
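One way to test this theory (a sketch, not anything from the theanets codebase): snapshot GPU memory with nvidia-smi before each pretraining stage and see whether usage grows by roughly one network's worth per layer instead of plateauing. This assumes an NVIDIA GPU with nvidia-smi on the PATH.

```python
import subprocess


def gpu_memory_used_mib(device=0):
    """Return the GPU memory currently in use, in MiB, via nvidia-smi."""
    out = subprocess.check_output([
        'nvidia-smi',
        '--query-gpu=memory.used',
        '--format=csv,noheader,nounits',
        '-i', str(device),
    ])
    return int(out.decode().strip().splitlines()[0])


# Hypothetical usage around each layerwise stage:
#   before = gpu_memory_used_mib()
#   ... train one stage ...
#   print('stage grew GPU usage by', gpu_memory_used_mib() - before, 'MiB')
```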