Open prashanth31 opened 3 years ago
I was able to train my network by using the CPU instead of the GPU. It took a lot longer but at least it got the job done.
Hey @prashanth31, I was wondering how you got it to run on the GPU. What command should I use?
I ended up running the cpu only version. Takes a lot of time but at least works.
I had a similar issue. I was processing a 1024-pixel image (--max_size 1024), and at about scale 11 it crashed with the CUDA out-of-memory error. I have gone back to 512. The compute node uses a GeForce GTX 1080 Ti: https://www.nvidia.com/en-gb/geforce/graphics-cards/geforce-gtx-1080-ti/specifications/
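For a rough sense of why dropping from 1024 to 512 helps, here is a small sketch of the arithmetic. It assumes the aspect ratio is preserved when the image is resized, and that activation memory grows roughly in proportion to the pixel count at the finest scale. That is typical of convolutional networks but was not measured here. The function name `relative_pixel_count` is made up for illustration and is not part of SinGAN.

```python
def relative_pixel_count(old_max_size, new_max_size):
    """Ratio of pixel counts when the longer image side is rescaled.

    Assumes the aspect ratio is preserved, so both sides scale by the
    same factor and the area scales by that factor squared.
    """
    scale = new_max_size / old_max_size
    return scale ** 2


# Going from a 1024-pixel to a 512-pixel longer side.
ratio = relative_pixel_count(1024, 512)
print(f"finest scale has {ratio:.2f}x the pixels")
```

Halving the side length leaves a quarter of the pixels, so under the proportionality assumption the finest scales need about a quarter of the activation memory.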
@metaphorz How did you go back to 512, and where is the code for the fix? Please! Thank you.
This is so long ago I've forgotten. Been using Stable Diffusion through A1111 for most software runs.
Can someone help me solve the "CUDA out of memory" error? I think it has something to do with reducing the batch size, but I am not sure where in the code I can do that. Here is the full error message:
Traceback (most recent call last):
  File "main_train.py", line 29, in <module>
    train(opt, Gs, Zs, reals, NoiseAmp)
  File "c:\Projects\PK\Phd\Paper4_GAN\SinGAN-master\SinGAN\training.py", line 39, in train
    z_curr,in_s,G_curr = train_single_scale(D_curr,G_curr,reals,Gs,Zs,in_s,NoiseAmp,opt)
  File "c:\Projects\PK\Phd\Paper4_GAN\SinGAN-master\SinGAN\training.py", line 162, in train_single_scale
    gradient_penalty.backward()
  File "c:\ProgramData\Anaconda3\envs\torch\lib\site-packages\torch\tensor.py", line 195, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "c:\ProgramData\Anaconda3\envs\torch\lib\site-packages\torch\autograd\__init__.py", line 99, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: CUDA out of memory. Tried to allocate 30.00 MiB (GPU 0; 2.00 GiB total capacity; 1.16 GiB already allocated; 18.86 MiB free; 1.28 GiB reserved in total by PyTorch)
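The numbers in that last line already say a lot: the GPU has 2 GiB in total, and the failed request was larger than what was still free. Below is a minimal sketch that pulls the figures out of the message and computes the shortfall. The regular expressions only match the wording of this particular message, which may differ in other PyTorch versions, and the helper `parse_oom_message` is a made-up name, not a PyTorch function.

```python
import re

# The final line of the traceback above, copied verbatim.
OOM_MESSAGE = (
    "RuntimeError: CUDA out of memory. Tried to allocate 30.00 MiB "
    "(GPU 0; 2.00 GiB total capacity; 1.16 GiB already allocated; "
    "18.86 MiB free; 1.28 GiB reserved in total by PyTorch)"
)

# Conversion factors so every size ends up in MiB.
UNIT_TO_MIB = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}

# One pattern per figure quoted in the message.
PATTERNS = {
    "requested": r"Tried to allocate ([\d.]+) (KiB|MiB|GiB)",
    "total": r"([\d.]+) (KiB|MiB|GiB) total capacity",
    "allocated": r"([\d.]+) (KiB|MiB|GiB) already allocated",
    "free": r"([\d.]+) (KiB|MiB|GiB) free",
    "reserved": r"([\d.]+) (KiB|MiB|GiB) reserved",
}


def parse_oom_message(message):
    """Return the sizes quoted in the message, converted to MiB."""
    sizes = {}
    for label, pattern in PATTERNS.items():
        match = re.search(pattern, message)
        if match:
            value, unit = match.groups()
            sizes[label] = float(value) * UNIT_TO_MIB[unit]
    return sizes


sizes = parse_oom_message(OOM_MESSAGE)
# How much more memory the failed allocation needed than was free.
shortfall_mib = sizes["requested"] - sizes["free"]
print(sizes)
print(f"shortfall: {shortfall_mib:.2f} MiB")
```

With only 2 GiB on the card and most of it already allocated, the options mentioned in this thread (a smaller image size or the CPU-only run) are the realistic ways around it.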