tbepler / topaz

Pipeline for particle picking in cryo-electron microscopy images using convolutional neural networks trained from positive and unlabeled examples. Also featuring micrograph and tomogram denoising with DNNs.
GNU General Public License v3.0

topaz denoise crashes with RuntimeError: CUDA out of memory #145

Closed · maxxrenner closed this 2 years ago

maxxrenner commented 2 years ago

Hi all,

I am trying to denoise a micrograph stack with the following command (using an RTX 3060 with 12 GB of memory):

topaz denoise CM_15.mrc --model unet --device 0 --format mrc --stack --normalize --output denoised.mrc

Which produces the following error:

using device=0 with cuda=True
Loading model: unet
denoising stack with shape: (27, 4096, 4096)
UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /opt/conda/conda-bld/pytorch_1640811805959/work/torch/csrc/utils/tensor_numpy.cpp:189.)
Traceback (most recent call last):
  File "/home/max/Software/eman2-sphire-sparx/envs/topaz/bin/topaz", line 11, in <module>
    load_entry_point('topaz-em==0.2.4', 'console_scripts', 'topaz')()
  File "/home/max/Software/eman2-sphire-sparx/envs/topaz/lib/python3.6/site-packages/topaz/main.py", line 148, in main
    args.func(args)
  File "/home/max/Software/eman2-sphire-sparx/envs/topaz/lib/python3.6/site-packages/topaz/commands/denoise.py", line 516, in main
    , use_cuda=use_cuda
  File "/home/max/Software/eman2-sphire-sparx/envs/topaz/lib/python3.6/site-packages/topaz/commands/denoise.py", line 292, in denoise_image
    mic += dn.denoise(model, x, patch_size=patch_size, padding=padding)
  File "/home/max/Software/eman2-sphire-sparx/envs/topaz/lib/python3.6/site-packages/topaz/denoise.py", line 72, in denoise
    y = model(x).squeeze()
  File "/home/max/Software/eman2-sphire-sparx/envs/topaz/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/max/Software/eman2-sphire-sparx/envs/topaz/lib/python3.6/site-packages/topaz/denoise.py", line 514, in forward
    h = F.interpolate(h, size=(n,m), mode='nearest')
  File "/home/max/Software/eman2-sphire-sparx/envs/topaz/lib/python3.6/site-packages/torch/nn/functional.py", line 3712, in interpolate
    return torch._C._nn.upsample_nearest2d(input, output_size, scale_factors)
RuntimeError: CUDA out of memory. Tried to allocate 6.00 GiB (GPU 0; 11.74 GiB total capacity; 2.57 GiB already allocated; 3.49 GiB free; 6.13 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
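(For reference, the max_split_size_mb suggestion in the last line refers to PyTorch's PYTORCH_CUDA_ALLOC_CONF environment variable, which would be set before running the command; the value below is only illustrative:

export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128

)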

Is there any workaround? Help would be greatly appreciated :)

Thanks, Max

tbepler commented 2 years ago

Hi Max, it sounds like you have multiple things running on your GPU, so there is not enough GPU RAM available for topaz denoise.

I recommend making sure nothing else is running on the GPU when you run topaz denoise. You can also process the micrographs in patches to reduce the GPU RAM required.

maxxrenner commented 2 years ago

Hi Tristan,

Thanks for the quick reply! Apologies, I should have specified... to the best of my knowledge there is nothing (significant) running on my GPU. Here is the output of nvidia-smi:


+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.54       Driver Version: 510.54       CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:09:00.0  On |                  N/A |
|  0%   32C    P8    15W / 170W |    267MiB / 12288MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1098      G   /usr/lib/xorg/Xorg                101MiB |
|    0   N/A  N/A      1717      G   /usr/lib/xorg/Xorg                109MiB |
|    0   N/A  N/A      1874      G   /usr/bin/gnome-shell               28MiB |
|    0   N/A  N/A      2231      G   ...mviewer/tv_bin/TeamViewer       14MiB |
+-----------------------------------------------------------------------------+

How do I process in patches?

Thanks!

tbepler commented 2 years ago

You can use the -s/--patch-size flag. It's set to 1024 by default, so I recommend trying something smaller like 512.
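For example, applied to your command from above (512 is just a starting point; pick whatever fits in your GPU memory):

topaz denoise CM_15.mrc --model unet --device 0 --format mrc --stack --normalize --patch-size 512 --output denoised.mrc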

maxxrenner commented 2 years ago

Thanks!

