faustomilletari / VNet

GNU General Public License v3.0

Check failed: error == cudaSuccess (2 vs. 0) out of memory #2

Closed: tianzq closed this issue 8 years ago

tianzq commented 8 years ago

My GPU has 8 GB of memory. I already reduced batch_size from 2 to 1, but I still get the error below. Could you help me fix it? Thanks.

I0802 17:56:31.832522 6075 net.cpp:482] Collecting Learning Rate and Weight Decay.
I0802 17:56:31.832535 6075 net.cpp:247] Network initialization done.
I0802 17:56:31.832540 6075 net.cpp:248] Memory required for data: 2010906628
I0802 17:56:31.832893 6075 solver.cpp:42] Solver scaffolding done.
F0802 17:56:33.158094 6075 syncedmem.cpp:51] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
Aborted (core dumped)
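For scale, the "Memory required for data" figure in the log can be compared with the card's capacity. The short sketch below only does that arithmetic. It assumes the 8 GB card is 8 GiB, and it assumes (as far as I understand Caffe's accounting) that the reported number covers layer output data only, so gradients, weights, and convolution buffers would come on top of it. Treat the result as a lower bound, not as the real footprint.

```python
# Figures taken from the log above and from the issue text.
reported_bytes = 2010906628   # "Memory required for data" in the log
gpu_bytes = 8 * 1024**3       # 8 GB card, read here as 8 GiB (assumption)


def to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / 1024**3


reported_gib = to_gib(reported_bytes)
fraction_of_gpu = reported_bytes / gpu_bytes

print(f"reported: {reported_gib:.2f} GiB")
print(f"share of an 8 GiB card: {fraction_of_gpu:.0%}")
```

The reported figure is well under the card's capacity, which is consistent with the failure coming from memory that this number does not count.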

faustomilletari commented 8 years ago

Hello, are you using 3D Caffe from the other repository? I have been running this on a GTX 1080 and a Titan X. I will try to reproduce the problem nevertheless. Which GPU are you using? Which versions of CUDA and CuDNN?

tianzq commented 8 years ago

I used your 3D Caffe from "https://github.com/faustomilletari/3D-Caffe". My GPU is NVIDIA Quadro M5000. CUDA version is 7.5. CuDNN is not used.

faustomilletari commented 8 years ago

Hello, I'm afraid you need to use CuDNN v5 to run it. I think the non-CuDNN implementation is not reliable, and it will take more memory anyway. Once 3D-Caffe is merged with Caffe this will be solved. For now the best you can do is to get CuDNN (since it actually helps with many other things). Refer to the Nvidia website for that. Sorry for the inconvenience. Please let me know if CuDNN solves the problem for you.

tianzq commented 8 years ago

Thanks so much. I will try CuDNN and let you know the result.

tianzq commented 8 years ago

After installation of CuDNN, it works! I am training the model right now. Thanks so much for your help.

I have one question about your VNet. Does batch size affect the performance of prostate segmentation? Your batch size is 2, while my batch size is 1.

faustomilletari commented 8 years ago

I'm looking at the info now. I will let you know...

Fausto Milletarì Sent from my iPhone

On 03.08.2016, at 17:29, tianzq notifications@github.com wrote:

I installed CuDNN v5 for Linux. Then I uncommented "USE_CUDNN:=1" and ran "make clean", "make all", and "make pycaffe". It still fails. I attached two log files (with and without CuDNN) and the nvidia-smi info: nvidia-smi.txt, info_withoutCuDNN.txt, info_withCuDNN.txt

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or mute the thread.
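One quick thing to rule out after editing the build configuration is that the USE_CUDNN line is still commented out. The sketch below is only an illustration: `cudnn_enabled` is a hypothetical helper, not part of Caffe, and the sample strings stand in for the contents of a Makefile.config.

```python
import re


def cudnn_enabled(makefile_config_text):
    """Return True if an uncommented 'USE_CUDNN := 1' line is present."""
    pattern = re.compile(r"^\s*USE_CUDNN\s*:=\s*1\s*(#.*)?$")
    return any(pattern.match(line) for line in makefile_config_text.splitlines())


# Hypothetical file contents, for illustration only.
commented = "# USE_CUDNN := 1\nCPU_ONLY := 0\n"
uncommented = "USE_CUDNN := 1\n"

print(cudnn_enabled(commented))
print(cudnn_enabled(uncommented))
```

To check a real file, read it with `open("Makefile.config").read()` and pass the text in. Note that the flag only takes effect after a clean rebuild.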

faustomilletari commented 8 years ago

Another thing: are you sure you are running the correct Caffe? Do you maybe have another Caffe installation on the machine? Can you try adding the path of the 3D Caffe pycaffe to PYTHONPATH?
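A minimal way to see which installation Python would pick up is to ask the import system where the module lives, without importing it. The helper names below (`locate_module`, `prepend_to_path`) are made up for this sketch, and the checkout path in the comment is a placeholder, not a path from this thread.

```python
import importlib.util
import os
import sys


def locate_module(name):
    """Return the file path Python would load for `name`, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


def prepend_to_path(directory):
    """Put `directory` first on sys.path so its packages win over other installs."""
    directory = os.path.abspath(directory)
    if directory in sys.path:
        sys.path.remove(directory)
    sys.path.insert(0, directory)
    importlib.invalidate_caches()
    return directory


# Hypothetical location of the 3D-Caffe checkout; adjust to your machine.
# prepend_to_path(os.path.expanduser("~/3D-Caffe/python"))
print(locate_module("caffe"))
```

If the printed path points into a different Caffe tree than the one just built, that would explain why rebuilding seemed to change nothing. Setting PYTHONPATH in the shell before starting Python has the same effect as `prepend_to_path`.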

faustomilletari commented 8 years ago

Hello, it would be best to use cmake. Clone the repository, create a directory called build, go there and run "cmake ..". Then run "make all", then "make pycaffe", and then "make install". That should compile through.
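The steps above can be written down as a small script. This is only a sketch: the repository URL is the one quoted earlier in this thread, while the local directory name and the `build_steps` / `run_steps` helpers are assumptions made for illustration. By default it only prints the commands; pass `dry_run=False` to actually execute them.

```python
import os
import subprocess

REPO_URL = "https://github.com/faustomilletari/3D-Caffe"  # quoted in this thread


def build_steps(src_dir="3D-Caffe"):
    """Return (command, working directory) pairs for the cmake-based build."""
    build_dir = os.path.join(src_dir, "build")
    return [
        (["git", "clone", REPO_URL, src_dir], "."),
        (["cmake", ".."], build_dir),
        (["make", "all"], build_dir),
        (["make", "pycaffe"], build_dir),
        (["make", "install"], build_dir),
    ]


def run_steps(steps, dry_run=True):
    """Print each step; execute it as well when dry_run is False."""
    for cmd, cwd in steps:
        print(f"[{cwd}] {' '.join(cmd)}")
        if not dry_run:
            os.makedirs(cwd, exist_ok=True)  # creates build/ once the clone exists
            subprocess.run(cmd, cwd=cwd, check=True)


run_steps(build_steps())
```

Building in a separate build directory keeps the generated files out of the source tree, so a failed attempt can be discarded by deleting that directory.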

faustomilletari commented 8 years ago

I think you can have batch size 2 as well. I don't actually know right now. I guess it could affect the performance. At some point I should make the pre-trained caffemodel available, BTW.

MiaPrice commented 6 years ago

@faustomilletari do you still have plans on making the pre-trained caffemodel available?

faustomilletari commented 6 years ago

No plans. Most probably there will be a brand new version of V-Net in PyTorch that will be easier to train and use out of the box.

On Jul 9, 2018, at 2:24 PM, MiaPrice notifications@github.com wrote:

@faustomilletari do you still have plans on making the pre-trained caffemodel available?
