tethys0221 opened this issue 6 years ago
I'm trying to train this network on 227×227 images with 196 classes. I wrote my own load_data(), but during training I get the error "Resource exhausted: OOM when allocating tensor with shape [128,3872,196,16,1]" (128 is the batch size).

This implementation of CapsNet (if not every other one as well) is quite heavy in terms of memory overhead for weights and optimizer gradients.

You're allocating a 128 × 3872 × 196 × 16 element tensor per batch; stored as float32 values, that single tensor takes about 6.2 GB of memory, and with optimizer overhead the total could easily be some multiple of that.

Does it run with a batch size of 1? Try 10 and see what difference it makes in memory usage, then work out your highest usable batch size that way (see the sketches below).
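For reference, here is the arithmetic behind that 6.2 GB figure as a minimal sketch; the shape is taken verbatim from the error message, and 4 bytes per element is simply the size of a float32:

```python
import numpy as np

# Shape from the OOM message: [batch, primary_caps, num_classes, caps_dim, 1]
shape = (128, 3872, 196, 16, 1)

elements = np.prod(shape, dtype=np.int64)  # 1,554,251,776 elements
gigabytes = elements * 4 / 1e9             # float32 = 4 bytes -> ~6.2 GB

print(f"{elements:,} elements -> {gigabytes:.1f} GB for a single tensor")
```

That is one intermediate tensor alone; backpropagation also keeps its gradient (and other intermediates) alive, which is where the "some multiple of that" comes from.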
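And a minimal sketch of the batch-size probe, assuming tf.keras; build_model() is a hypothetical stand-in for however you construct the network, and x_train/y_train are placeholder names for your own data (if your training model takes multiple inputs, pass them as a list in the same way):

```python
import tensorflow as tf

def max_usable_batch_size(build_model, x_train, y_train,
                          candidates=(1, 10, 32, 64, 128)):
    """Try increasing batch sizes; return the largest that completes a step without OOM."""
    best = None
    for bs in candidates:
        try:
            tf.keras.backend.clear_session()  # drop graph/memory from the last attempt
            model = build_model()             # hypothetical: rebuild the model fresh
            model.fit(x_train[:bs], y_train[:bs],
                      batch_size=bs, epochs=1, verbose=0)
            best = bs                         # this batch size fit in GPU memory
        except tf.errors.ResourceExhaustedError:
            break                             # OOM here means larger sizes won't fit
    return best
```

In practice it can be more reliable to run each probe in a separate process, since GPU memory is not always fully reclaimed after an OOM error.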