Hi, I just ran the following command from the README and got "RuntimeError: CUDA error: out of memory". The GPU is an NVIDIA Quadro RTX 8000 with 48 GB.
python main.py --ni --config imagenet_256.yml --doc imagenet --timesteps 20 --eta 0.85 --etaB 1 --deg sr4 --sigma_0 0.05
ERROR - main.py - 2023-01-04 15:05:00,845 - Traceback (most recent call last):
File "E:/leisen-workspace/codelife/super-resolution/ddrm-master/main.py", line 164, in main
runner.sample()
File "E:\leisen-workspace\codelife\super-resolution\ddrm-master\runners\diffusion.py", line 163, in sample
self.sample_sequence(model, cls_fn)
File "E:\leisen-workspace\codelife\super-resolution\ddrm-master\runners\diffusion.py", line 310, in sample_sequence
x, _ = self.sample_image(x, model, H_funcs, y_0, sigma_0, last=False, cls_fn=cls_fn, classes=classes)
File "E:\leisen-workspace\codelife\super-resolution\ddrm-master\runners\diffusion.py", line 338, in sample_image
etaB=self.args.etaB, etaA=self.args.eta, etaC=self.args.eta, cls_fn=cls_fn, classes=classes)
File "E:\leisen-workspace\codelife\super-resolution\ddrm-master\functions\denoising.py", line 53, in efficient_generalized_steps
et = model(xt, t)
File "C:\ProgramData\Anaconda3\envs\transenet\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "C:\ProgramData\Anaconda3\envs\transenet\lib\site-packages\torch\nn\parallel\data_parallel.py", line 166, in forward
return self.module(*inputs[0], **kwargs[0])
File "C:\ProgramData\Anaconda3\envs\transenet\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "E:\leisen-workspace\codelife\super-resolution\ddrm-master\guided_diffusion\unet.py", line 657, in forward
h = module(h, emb)
File "C:\ProgramData\Anaconda3\envs\transenet\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "E:\leisen-workspace\codelife\super-resolution\ddrm-master\guided_diffusion\unet.py", line 75, in forward
x = layer(x, emb)
File "C:\ProgramData\Anaconda3\envs\transenet\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "E:\leisen-workspace\codelife\super-resolution\ddrm-master\guided_diffusion\unet.py", line 233, in forward
self._forward, (x, emb), self.parameters(), self.use_checkpoint
File "E:\leisen-workspace\codelife\super-resolution\ddrm-master\guided_diffusion\nn.py", line 139, in checkpoint
return func(*inputs)
File "E:\leisen-workspace\codelife\super-resolution\ddrm-master\guided_diffusion\unet.py", line 242, in _forward
h = in_conv(h)
File "C:\ProgramData\Anaconda3\envs\transenet\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "C:\ProgramData\Anaconda3\envs\transenet\lib\site-packages\torch\nn\modules\conv.py", line 446, in forward
return self._conv_forward(input, self.weight, self.bias)
File "C:\ProgramData\Anaconda3\envs\transenet\lib\site-packages\torch\nn\modules\conv.py", line 443, in _conv_forward
self.padding, self.dilation, self.groups)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
My CUDA out of memory error (a slightly different one) was solved by going into the config YAML file and reducing batch_size. I'm running a 68 GB GPU.
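For reference, a minimal sketch of that change, assuming your config follows the same layout as the other DDRM configs (a `sampling` section with a `batch_size` key); the exact keys and default value in your copy of `imagenet_256.yml` may differ:

```yaml
# configs/imagenet_256.yml (assumed layout; check your local file)
sampling:
  batch_size: 1   # lower this from the default until sampling fits in GPU memory
```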
Do you know why this error happens?