CUDA Error: GPU out of memory with batch_size = 1.

alexgkendall / SegNet-Tutorial

Files for a tutorial to train SegNet for road scenes using the CamVid dataset

http://mi.eng.cam.ac.uk/projects/segnet/tutorial.html


CUDA Error: GPU out of memory with batch_size = 1. #25

Closed by dongleecsu 8 years ago

dongleecsu commented 8 years ago

Hi @alexgkendall, thank you so much for your guide. I followed the instructions at http://mi.eng.cam.ac.uk/projects/segnet/tutorial.html to train SegNet. But when I opened a terminal and ran the command:

    ./SegNet/caffe-segnet/build/tools/caffe train -gpu 0 -solver /SegNet/Models/segnet_basic_solver.prototxt

an error occurred: [screenshot: batch_size_1_error]

It seems that the GPU is out of memory (with batch_size = 1, the log reports "Memory required for data: 410926132"). So I checked the GPU with the command:

    nvidia-smi

Result: [screenshot: smi]

My GPU is a GT 720 with 1 GB of memory. Though the memory is small, it is still much bigger than 245 MB + 410 MB (the data memory reported above) = 655 MB.

So I would like to ask your advice for this issue :-) Thank you!

alexgkendall commented 8 years ago

I believe the data required (410MB) is just for the weights/data. Caffe requires additional memory for the solver and gradient parameters. You might struggle on a 1GB GPU I'm afraid!
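The arithmetic behind this reply can be sketched as follows. This is a rough illustration, not a measurement: it assumes that training needs a gradient (diff) buffer about as large as each data blob, which is why the factor of 2 appears, and it takes 245 MB from the post above as memory already in use. The names `training_estimate_mib` and `diff_factor` are made up for this example.

```python
# Figures quoted in the thread.
DATA_BYTES = 410926132    # "Memory required for data" from the Caffe log
ALREADY_USED_MIB = 245    # the 245 MB figure from the post (presumably the nvidia-smi reading)
GPU_TOTAL_MIB = 1024      # nominal size of a 1 GB card

MIB = 1024 * 1024


def training_estimate_mib(data_bytes, diff_factor=2.0):
    """Rough lower bound on training memory in MiB.

    diff_factor=2.0 is an assumption (data blobs plus same-sized
    gradient blobs), not a measured value. Weights and solver
    history would come on top of this.
    """
    return data_bytes * diff_factor / MIB


data_mib = DATA_BYTES / MIB
needed_mib = training_estimate_mib(DATA_BYTES) + ALREADY_USED_MIB

print(f"data blobs alone:       {data_mib:.1f} MiB")
print(f"rough training need:    {needed_mib:.1f} MiB")
print(f"fits in {GPU_TOTAL_MIB} MiB card: {needed_mib <= GPU_TOTAL_MIB}")
```

Under these assumptions the 655 MB estimate in the original post undercounts, because it treats the data blobs as the only allocation.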

dongleecsu commented 8 years ago

Thank you @alexgkendall . I tried on another GPU and figured it out.

rshanor commented 8 years ago

@dongleecsu was the GPU your problem? I am trying to run the examples on a 4GB GTX980 with no luck yet, also setting the batch size to 1.

dongleecsu commented 8 years ago

@rshanor were there any other programs using the GPU at the same time? You can check the GPU memory usage with nvidia-smi. I tried running SegNet basic (batch_size = 1) on a 12 GB Titan, and everything was OK.
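For anyone who wants to script the nvidia-smi check mentioned here, a minimal sketch is shown below. It relies on the `--query-gpu` and `--format=csv,noheader,nounits` options of nvidia-smi; the helper names `parse_memory_csv` and `gpu_memory` are hypothetical and exist only in this example.

```python
import subprocess

# nvidia-smi query that prints one "used, total" line (in MiB) per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(text):
    """Turn lines such as '245, 1023' into a list of (used, total) tuples."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def gpu_memory():
    """Run nvidia-smi and return (used_mib, total_mib) for each GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_csv(result.stdout)
```

Calling `gpu_memory()` just before starting training shows whether another process is already holding GPU memory. It raises an exception if nvidia-smi is not installed or fails.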

rshanor commented 8 years ago

@dongleecsu thanks. I was finally able to get some results by downsizing the images. Curiously, with 240x180 images, I can run with a batch size of 7...
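To put the resize in numbers: activation memory scales roughly with batch size times pixel count. The sketch below compares a run against a reference resolution. The 480x360 reference is an assumption about the tutorial's default image size, and `relative_activation_memory` is a name invented for this example.

```python
def relative_activation_memory(width, height, batch_size,
                               ref_width=480, ref_height=360):
    """Activation footprint relative to ONE image at the reference size.

    Assumes memory grows linearly with batch size and pixel count,
    which ignores weights and any fixed overhead.
    """
    return batch_size * (width * height) / (ref_width * ref_height)


# Batch of 7 at 240x180 versus a single reference-size image.
ratio = relative_activation_memory(240, 180, batch_size=7)
print(f"relative footprint: {ratio:.2f}x one reference-size image")
```

Halving both dimensions cuts the pixel count by four, so under this simple model a batch of 7 small images still costs more than a single reference-size image, which may be why the result looked curious.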

tsingjinyun commented 8 years ago

@rshanor, if you are using a 4 GB GTX 980 with a batch size of 1, try restarting the computer and running only this code; it may then work. I ran into this problem too.

mrgloom commented 8 years ago

Same issue: https://github.com/alexgkendall/caffe-segnet/issues/21

salehiac commented 8 years ago

@rshanor Did you make additional modifications after resizing the images? With 240x180 images I get this error: upsample_layer.cpp:63] Check failed: bottom[0]->height() == bottom[1]->height() (23 vs. 12)

mitalbert commented 7 years ago

@LordOfAshes you should change upsample_w and upsample_h in your prototxt training file.
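A small sketch of how the new upsample_h and upsample_w values could be worked out for a resized input. It assumes every pooling layer is 2x2 with stride 2 and rounds up (Caffe's pooling behaviour), that each upsample layer uses scale 2, and that the network has five pooling stages; adjust `num_pools` to match your prototxt. The helper names are hypothetical.

```python
import math


def pooled_sizes(size, num_pools):
    """Sizes along one dimension after each 2x2, stride-2 pooling (rounded up)."""
    sizes = [size]
    for _ in range(num_pools):
        sizes.append(math.ceil(sizes[-1] / 2))
    return sizes


def upsample_targets(size, num_pools, scale=2):
    """For each decoder stage, deepest first: (input, required output, needs_override).

    needs_override is True when input * scale does not equal the size of
    the matching encoder stage, so an explicit upsample_h / upsample_w
    would have to be set in the prototxt.
    """
    sizes = pooled_sizes(size, num_pools)
    targets = []
    for level in range(num_pools, 0, -1):
        src, dst = sizes[level], sizes[level - 1]
        targets.append((src, dst, src * scale != dst))
    return targets


for name, size in (("upsample_h", 180), ("upsample_w", 240)):
    for src, dst, override in upsample_targets(size, num_pools=5):
        note = f"set {name}: {dst}" if override else "scale alone is enough"
        print(f"{name}: {src} -> {dst}  ({note})")
```

With a height of 180 the encoder sizes are 180, 90, 45, 23, 12 and 6, so a stage left with a hard-coded output of 23 no longer matches the 12-row pooling mask it is paired with, consistent with the "23 vs. 12" check failure quoted above.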