jacquelinelala / GFN

Gated Fusion Network for Joint Image Deblurring and Super-Resolution (BMVC 2018 Oral)
http://xinyizhang.tech/bmvc2018/

RuntimeError: CUDA error: out of memory #17

Open akhilvinvent opened 5 years ago

akhilvinvent commented 5 years ago

I am facing a RuntimeError: CUDA error: out of memory. I followed the instructions.

Traceback (most recent call last):
  File "test_GFN_4x.py", line 130, in <module>
    model_test(model)
  File "test_GFN_4x.py", line 100, in model_test
    test(testloader, model, criterion, SR_dir)
  File "test_GFN_4x.py", line 77, in test
    [lr_deblur, sr] = model(LR_Blur, gated_Tensor, test_Tensor)
  File "D:\InstalledApps\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\Work\VInventTechWork\Biop.ai\Deblurry\GFN\networks\GFN_4x.py", line 216, in forward
    recon_out = self.reconstructMoudle(fusion_feature)
  File "D:\InstalledApps\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\InstalledApps\Anaconda3\lib\site-packages\torch\nn\modules\container.py", line 91, in forward
    input = module(input)
  File "D:\InstalledApps\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\Work\VInventTechWork\Biop.ai\Deblurry\GFN\networks\GFN_4x.py", line 186, in forward
    pixelshuffle1 = self.relu1(self.pixelShuffle1(con1))
  File "D:\InstalledApps\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\InstalledApps\Anaconda3\lib\site-packages\torch\nn\modules\pixelshuffle.py", line 40, in forward
    return F.pixel_shuffle(input, self.upscale_factor)
  File "D:\InstalledApps\Anaconda3\lib\site-packages\torch\nn\functional.py", line 1844, in pixel_shuffle
    shuffle_out = input_view.permute(0, 1, 4, 2, 5, 3).contiguous()
RuntimeError: CUDA error: out of memory

Booooooooooo commented 3 years ago

I have found that in

https://github.com/jacquelinelala/GFN/blob/3b80f530d9a04964fb80300b08267ae2d4c78753/train_GFN_4x.py#L102 the line `epoch_loss += mse` keeps the computation graph of every iteration referenced on the GPU, which eventually causes CUDA out of memory. Using `epoch_loss += mse.item()` instead solves this problem; a sketch of the pattern follows.
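A minimal sketch of the difference, using a toy model and synthetic data rather than the actual GFN training loop; only the `epoch_loss` and `mse` names follow train_GFN_4x.py:

```python
import torch
import torch.nn as nn

# Toy setup standing in for the GFN model, criterion and optimizer.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(8, 8).to(device)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

epoch_loss = 0
for _ in range(100):
    x = torch.randn(4, 8, device=device)
    y = torch.randn(4, 8, device=device)

    optimizer.zero_grad()
    mse = criterion(model(x), y)
    mse.backward()
    optimizer.step()

    # Problematic: `mse` is a tensor still attached to its autograd graph,
    # so summing tensors keeps every iteration's graph referenced and the
    # GPU memory it holds cannot be freed.
    # epoch_loss += mse

    # Fix: .item() converts the loss to a plain Python float, so each
    # iteration's graph can be released as soon as the step is done.
    epoch_loss += mse.item()

print(epoch_loss / 100)
```

The same pattern applies to any running statistic accumulated across iterations: detach it (via `.item()` or `.detach()`) before storing it, so only the numbers, not the graphs, survive the loop.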