ymli39 / DeepSEED-3D-ConvNets-for-Pulmonary-Nodule-Detection

DeepSEED: 3D Squeeze-and-Excitation Encoder-Decoder ConvNets for Pulmonary Nodule Detection
MIT License

out of memory error #47

Closed · fastfishing closed this issue 1 year ago

fastfishing commented 1 year ago

Hello, thank you for your selfless sharing. I ran into an out-of-memory error while running the code on an RTX 3090 (24 GB). The command I entered was:

set CUDA_VISIBLE_DEVICES=0 python train_detector_se.py -b 2 --save-dir /train_result/ --epochs 100

The run reports the following error:

torch.Size([18, 1, 208, 208, 208])
Traceback (most recent call last):
  File "D:\FISH\DeepSEED_1\luna_detector\train_detector_se.py", line 410, in <module>
    main()
  File "D:\FISH\DeepSEED_1\luna_detector\train_detector_se.py", line 158, in main
    test(test_loader, net, get_pbb, save_dir, config)
  File "D:\FISH\DeepSEED_1\luna_detector\train_detector_se.py", line 355, in test
    output = net(input, inputcoord)
  File "C:\Users\yiwen\anaconda3\envs\LiuJH\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\yiwen\anaconda3\envs\LiuJH\lib\site-packages\torch\nn\parallel\data_parallel.py", line 169, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "C:\Users\yiwen\anaconda3\envs\LiuJH\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\FISH\DeepSEED_1\luna_detector\res18_se.py", line 107, in forward
    out = self.preBlock(x)#16
  File "C:\Users\yiwen\anaconda3\envs\LiuJH\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\yiwen\anaconda3\envs\LiuJH\lib\site-packages\torch\nn\modules\container.py", line 217, in forward
    input = module(input)
  File "C:\Users\yiwen\anaconda3\envs\LiuJH\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\yiwen\anaconda3\envs\LiuJH\lib\site-packages\torch\nn\modules\batchnorm.py", line 171, in forward
    return F.batch_norm(
  File "C:\Users\yiwen\anaconda3\envs\LiuJH\lib\site-packages\torch\nn\functional.py", line 2470, in batch_norm
    return torch.batch_norm(
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 6.44 GiB (GPU 0; 24.00 GiB total capacity; 19.61 GiB already allocated; 2.94 GiB free; 19.63 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

This problem has troubled me for a long time. I tried modifying the input data and adjusting the network structure, but due to my limited ability I could not solve it. Could you help me with this problem? Thanks a lot.

ymli39 commented 1 year ago

Hi, this error is caused by running out of GPU memory. Your input size is torch.Size([18, 1, 208, 208, 208]), which means 18 patches of size 208x208x208 are being pushed through the network at once. You can try reducing that batch size from 18 to 2.
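Since the traceback shows the failure inside test() while the training batch was already set to 2 with -b 2, the 18 comes from the number of split patches fed to the network in one forward pass at test time. Below is a minimal sketch of the idea described above: run the patches through the detector a few at a time. The function name run_in_chunks, the chunk size of 2, and the tensor names data and coord are illustrative assumptions, not code from this repository.

    import torch

    def run_in_chunks(net, data, coord, chunk_size=2):
        # Feed the split patches through the network a few at a time instead of
        # all 18 at once, so peak GPU memory stays within the 24 GB budget.
        outputs = []
        with torch.no_grad():  # inference only; no activations kept for backward
            for i in range(0, data.size(0), chunk_size):
                inp = data[i:i + chunk_size].cuda()
                crd = coord[i:i + chunk_size].cuda()
                out = net(inp, crd)          # same call signature as in the traceback
                outputs.append(out.cpu())    # move results off the GPU right away
                del inp, crd, out
                torch.cuda.empty_cache()     # optional: release cached blocks between chunks
        return torch.cat(outputs, dim=0)

The trade-off is speed: smaller chunks mean more forward passes per scan, but each pass fits comfortably in memory.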