RuntimeError: CUDA out of memory.

prashanth31 commented 3 years ago

Can someone help me solve the "CUDA out of memory" error? I think it has something to do with reducing the batch size, but I am not sure where in the code I can do that. Here is the full error message:

Traceback (most recent call last):
  File "main_train.py", line 29, in <module>
    train(opt, Gs, Zs, reals, NoiseAmp)
  File "c:\Projects\PK\Phd\Paper4_GAN\SinGAN-master\SinGAN\training.py", line 39, in train
    z_curr,in_s,G_curr = train_single_scale(D_curr,G_curr,reals,Gs,Zs,in_s,NoiseAmp,opt)
  File "c:\Projects\PK\Phd\Paper4_GAN\SinGAN-master\SinGAN\training.py", line 162, in train_single_scale
    gradient_penalty.backward()
  File "c:\ProgramData\Anaconda3\envs\torch\lib\site-packages\torch\tensor.py", line 195, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "c:\ProgramData\Anaconda3\envs\torch\lib\site-packages\torch\autograd\__init__.py", line 99, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: CUDA out of memory. Tried to allocate 30.00 MiB (GPU 0; 2.00 GiB total capacity; 1.16 GiB already allocated; 18.86 MiB free; 1.28 GiB reserved in total by PyTorch)
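For anyone reading the numbers in that last line, here is a minimal sketch that pulls the sizes out of the message and compares the requested allocation with the free memory. The helper name `parse_oom` and the regular expressions are hypothetical, written only for this message format, and the "short by" figure is a rough indication, since allocator fragmentation can also play a role.

```python
import re

# Message text copied from the error reported in this issue.
MESSAGE = (
    "CUDA out of memory. Tried to allocate 30.00 MiB (GPU 0; 2.00 GiB total capacity; "
    "1.16 GiB already allocated; 18.86 MiB free; 1.28 GiB reserved in total by PyTorch)"
)

UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}


def parse_oom(message):
    """Return the sizes named in the message, all converted to MiB (hypothetical helper)."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "total": r"([\d.]+) (MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (MiB|GiB) already allocated",
        "free": r"([\d.]+) (MiB|GiB) free",
        "reserved": r"([\d.]+) (MiB|GiB) reserved",
    }
    sizes = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:
            sizes[name] = float(match.group(1)) * UNIT_TO_MIB[match.group(2)]
    return sizes


sizes = parse_oom(MESSAGE)
shortfall = sizes["requested"] - sizes["free"]
print(f"requested {sizes['requested']:.2f} MiB, free {sizes['free']:.2f} MiB, "
      f"short by {shortfall:.2f} MiB on a {sizes['total']:.0f} MiB card")
```

Read this way, the message describes a 2 GiB card that is already nearly full, so even a 30 MiB request cannot be satisfied.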

prashanth31 commented 3 years ago

I was able to train my network by using the CPU instead of the GPU. It took a lot longer but at least it got the job done.

ankuroo commented 3 years ago

Hey @prashanth31, I was wondering, how did you get it to run on the GPU? What's the command I should use?

prashanth31 commented 3 years ago

I ended up running the CPU-only version. It takes a lot of time, but at least it works.
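For readers with the same question, a CPU fallback can be sketched roughly as follows. This is a generic illustration and not SinGAN's own code: `pick_device` is a hypothetical helper, and it only relies on `torch.cuda.is_available()`. The repository's `config.py` may also offer a command-line switch to disable CUDA (I believe it is `--not_cuda`, but treat that as an assumption and check your checkout).

```python
import importlib.util


def pick_device(force_cpu=False):
    """Return a device string: "cuda:0" when usable, otherwise "cpu" (hypothetical helper)."""
    # Fall back to the CPU if it is requested or if PyTorch is not installed.
    if force_cpu or importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    # torch.cuda.is_available() reports whether a usable CUDA device exists.
    return "cuda:0" if torch.cuda.is_available() else "cpu"


print(pick_device(force_cpu=True))  # always "cpu"
print(pick_device())                # "cuda:0" only when a CUDA GPU is usable
```

The returned string can be passed to `torch.device(...)` when moving models and tensors. Training on the CPU avoids the GPU memory limit at the cost of much longer run times, as described above.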

metaphorz commented 3 years ago

I had a similar issue. I was processing a 1024-pixel image (--max_size 1024), and at about scale 11 it crashed with the CUDA out-of-memory error. I have gone back to 512. The GPU in the compute node is a GeForce GTX 1080 Ti: https://www.nvidia.com/en-gb/geforce/graphics-cards/geforce-gtx-1080-ti/specifications/
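A rough way to see why dropping from 1024 to 512 helps: assuming a square image and assuming that activation memory in a convolutional network grows roughly in proportion to the number of pixels (both simplifications, not measurements from SinGAN), the finest scale at 1024 needs about four times the memory of the finest scale at 512. The function `pixel_ratio` is a hypothetical helper for this arithmetic.

```python
def pixel_ratio(new_size, old_size):
    """Ratio of pixel counts for two square images with the given side lengths."""
    return (new_size * new_size) / (old_size * old_size)


ratio = pixel_ratio(1024, 512)
print(f"1024 vs 512: about {ratio:.0f}x the pixels at the finest scale")
```

Under those assumptions, halving the maximum image size cuts the per-scale memory at the top of the pyramid to roughly a quarter, which is consistent with the crash appearing only at the later, larger scales.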

vuhungtvt2018 commented 1 year ago

@metaphorz How did you go back to 512, and where is the code for the fix? Please! Thank you.

metaphorz commented 1 year ago

This is so long ago I've forgotten. Been using Stable Diffusion through A1111 for most software runs.

tamarott / SinGAN

RuntimeError: CUDA out of memory. #144