faustomilletari / VNet

GNU General Public License v3.0

ValueError while training with batch size 4 #42

Open devareddy opened 7 years ago

devareddy commented 7 years ago

Hi, I am facing the following errors while training. Any help is appreciated.

1. For batch sizes 1 and 2, with cuDNN installed:

    F0612 08:31:44.348748 13429 syncedmem.cpp:51] Check failed: error == cudaSuccess (2 vs. 0) out of memory
    Check failure stack trace:
    Aborted (core dumped)

2. For batch size 4:

    File "main.py", line 37, in <module>
      model.train()
    File "/home/user/3dprostrate/VNet/VNet.py", line 163, in train
      self.trainThread(dataQueue, solver)
    File "/home/user/3dprostrate/VNet/VNet.py", line 80, in trainThread
      solver.net.blobs['data'].data[...] = batchData.astype(dtype=np.float32)
    ValueError: could not broadcast input array from shape (4,1,128,128,64) into shape (2,1,128,128,64)

Thanks in advance -D

elitap commented 7 years ago
  1. It seems you don't have enough GPU memory. Run nvidia-smi on the command line to check; a batch size of 2 needs ~8 GB.
  2. To run the network with a batch size of 4, you also have to adapt the input data blob in Caffe. In your train_noPooling_ResNet_cinque.prototxt, change the input dim on lines 2 and 5 to fit 4 volumes. However, if you don't have enough memory to run the net with a batch size of 2, you won't be able to run it with 4.
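
To illustrate point 2, here is a minimal NumPy-only sketch of why the assignment fails. It is not taken from VNet.py: `blob` is only a stand-in for `solver.net.blobs['data'].data`, `batch` is a stand-in for `batchData`, the spatial size is shrunk to keep the demo light, and `check_batch_fits` is a hypothetical helper name.

```python
import numpy as np


def check_batch_fits(blob_shape, batch):
    """Hypothetical helper: fail early with a readable message on a shape mismatch."""
    if tuple(batch.shape) != tuple(blob_shape):
        raise ValueError(
            "batch shape %s does not match input blob shape %s; the batch "
            "dimension declared in the prototxt must equal the batch size "
            "produced by the training script"
            % (tuple(batch.shape), tuple(blob_shape))
        )


# Stand-in for the Caffe input blob, declared for 2 volumes (assumed shape).
blob = np.zeros((2, 1, 16, 16, 8), dtype=np.float32)
# Stand-in for a training batch of 4 volumes.
batch = np.ones((4, 1, 16, 16, 8))

assigned = True
message = ""
try:
    # Same kind of in-place assignment as in the traceback above.
    blob[...] = batch.astype(np.float32)
except ValueError as err:
    assigned = False
    message = str(err)

print(assigned)
print(message)
```

The in-place assignment can only succeed when the array being copied has the shape the blob was declared with, which is why the batch dimension in the prototxt and the batch size in the training script have to agree.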

hth