Closed · andmax closed this issue 5 years ago
It seems that you're trying to use a batch size of 16 -- did you modify the data loading code to make your volume shape [1, 16, 160, 192, 224]? It should be [1, 160, 192, 224, 1] for each batch.
Or, is this an error from an intermediate tensorflow computation? If so, even a batch size of 1 might be too large to fit on a 1080 Ti -- it barely fits on a Titan X, which has 12 GB of memory. I would recommend using a larger GPU if possible. If not, you might need to work with smaller (e.g. downsampled) volumes.
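Both points above can be sketched in plain NumPy (the variable names here are illustrative, not from the repo):

```python
import numpy as np

# Hypothetical MRI volume; the spatial shape matches the one discussed above.
vol = np.zeros((160, 192, 224), dtype=np.float32)

# The network expects channels-last batches of shape [1, 160, 192, 224, 1]:
# add a batch axis in front and a channel axis at the end.
batch = vol[np.newaxis, ..., np.newaxis]
assert batch.shape == (1, 160, 192, 224, 1)

# If memory is still too tight, a naive factor-2 downsample by striding
# halves every spatial dimension, shrinking the volume (and the
# intermediate activations derived from it) by roughly 8x.
small = vol[::2, ::2, ::2]
assert small.shape == (80, 96, 112)
```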
Hi Amy, thank you for your answer. I am running your main.py script with default parameters, which I believe sets the batch size to 1. The network architectures were printed out, so I guess the error comes from an intermediate tensorflow computation. I was wondering which GPU you use, but it seems it was a Titan X with 12GB. That is good to know. :) My GPU (1080 Ti) also has 12GB, so there must be something in the code to change for it to run.
My understanding is that the 1080 Ti has slightly less memory available than the Titan X -- what does nvidia-smi say? Mine shows me using 11557MiB of 12196MiB.
Mine has a total of 11178MiB and goes out of memory. That additional ~400MiB may be the problem. Thanks Amy!
Glad we sorted it out! If you don't have another GPU available, it might be worth trying to change your default float precision in your keras.json to float16: https://keras.io/backend/#kerasjson-details.
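For reference, a keras.json switched to half precision might look like the sketch below (it normally lives at ~/.keras/keras.json; the "backend" and "image_data_format" values here are assumptions matching a typical tensorflow setup). Note that float16 training can be numerically unstable, so treat this as a memory workaround rather than a general recommendation:

```json
{
    "floatx": "float16",
    "epsilon": 1e-07,
    "backend": "tensorflow",
    "image_data_format": "channels_last"
}
```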
Thanks for pointing this issue out! I'll add a note about GPU memory to the readme.
My GPU is a "GeForce GTX 1080 Ti", and I got the out-of-memory (OOM) error running:
$ python3 main.py trans --gpu 0 --data mri-100unlabeled --model flow-fwd
Error:
OOM when allocating tensor with shape[1,16,160,192,224] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc