BiaPyX / BiaPy

Open source Python library for building bioimage analysis pipelines
https://BiaPyX.github.io

Memory error in UNET prediction with patches #81

Closed: mcblache closed this issue 1 month ago

mcblache commented 2 months ago

Hello,

I used the BiaPy GUI to train a UNET (Semantic Segmentation template) on 2D RGB images (slide scanner images). I would like to run prediction on a large image (1.5 GB). However, BiaPy raises a memory error:

  File "/installations/miniconda3/envs/BiaPy_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/installations/BiaPy/biapy/models/blocks.py", line 53, in forward
    out = self.block(x)
  File "/installations/miniconda3/envs/BiaPy_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/installations/miniconda3/envs/BiaPy_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/installations/miniconda3/envs/BiaPy_env/lib/python3.10/site-packages/torch/nn/modules/container.py", line 217, in forward
    input = module(input)
  File "/installations/miniconda3/envs/BiaPy_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/installations/miniconda3/envs/BiaPy_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/installations/miniconda3/envs/BiaPy_env/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 460, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/installations/miniconda3/envs/BiaPy_env/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 456, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 15.36 GiB. GPU 0 has a total capacity of 23.67 GiB of which 10.27 GiB is free. Process 62791 has 12.94 GiB memory in use. Of the allocated memory 11.53 GiB is allocated by PyTorch, and 16.53 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
ERROR conda.cli.main_run:execute(124): `conda run python3 -u /installations/BiaPy/main.py --config /BiaPy_files/input.yaml --result_dir /home/mcblache/prj/assises/output --name my_experiment_2D_20240424-094401 --run_id 1 --dist_backend nccl --gpu 0` failed. (See above for error)

Note that process 62791 is BiaPy itself.

I don't understand. This image is smaller than the training images.

I use 512x512 patches. When I set FULL_IMG = False it works, but I do not get the complete prediction of the image. Can you help me, please?

Thanks

Marie-Claire

PS: Here are my settings:

AUGMENTOR:
  ENABLE: false
DATA:
  EXTRACT_RANDOM_PATCH: false
  FORCE_RGB: true
  PATCH_SIZE: (512,512,3)
  REFLECT_TO_COMPLETE_SHAPE: true
  TEST:
    ARGMAX_TO_OUTPUT: true
    CHECK_DATA: true
    GT_PATH: /home/mcblache/prj/assises/data/masks/test
    IN_MEMORY: true
    LOAD_GT: true
    OVERLAP: (0,0)
    PADDING: (32,32)
    PATH: /home/mcblache/prj/assises/data/img/test
    RESOLUTION: (1,1)
MODEL:
  ARCHITECTURE: unet
  DROPOUT_VALUES:
  - 0.0
  - 0.0
  - 0.0
  - 0.0
  - 0.0
  FEATURE_MAPS:
  - 16
  - 32
  - 64
  - 128
  - 256
PROBLEM:
  NDIM: 2D
  SEMANTIC_SEG:
    IGNORE_CLASS_ID: '0'
  TYPE: SEMANTIC_SEG
SYSTEM:
  NUM_CPUS: -1
  NUM_WORKERS: 0
  SEED: 0
TEST:
  ENABLE: true
  EVALUATE: true
  FULL_IMG: true
  VERBOSE: true
TRAIN:
  ENABLE: false
danifranco commented 2 months ago

Hello,

When FULL_IMG is enabled, BiaPy tries to feed the entire image into the GPU, which is why it is crashing. Disable it to predict the image patch by patch (I'd also recommend increasing TEST.PADDING, e.g. to (100,100), to get a smoother output). If it finishes without errors, the images should be in a folder called "per_image" (please check the section in our semantic segmentation documentation that lists the folders created in that workflow).
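A minimal sketch of those two changes, reusing the key layout from the settings posted above (FULL_IMG under TEST and PADDING under DATA.TEST; only the keys to edit are shown, everything else stays as it is):

TEST:
  FULL_IMG: false       # predict patch by patch instead of feeding the whole image to the GPU
DATA:
  TEST:
    PADDING: (100,100)  # larger padding gives smoother blending at the patch borders

With FULL_IMG disabled, the 512x512 patches are processed one at a time, so GPU memory use no longer depends on the size of the whole slide.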

danifranco commented 1 month ago

Did you try it again with my recommendations?

mcblache commented 1 month ago

Hello,

Thank you very much for your answer. Sorry, I expressed myself incorrectly. In the "per_image" folder, the image is not complete: for example, an initial image of 17879 x 28292 pixels gets a prediction image of 17879 x 17879 pixels. Am I perhaps making another configuration mistake?

Thank you

mcblache commented 1 month ago

Yes, I tried and there are no errors when using your settings. Thank you.

danifranco commented 1 month ago

The prediction should be the same size as the input image. Can you please pull the latest version of the code and run the inference again?

mcblache commented 1 month ago

I used the BiaPy GUI (BiaPy v3.3.12, GUI v1.0.6); with this configuration the prediction image is not complete.
When I used a notebook with a fresh conda environment (BiaPy version 3.4.2), the prediction is OK.

Thank you very much for your answer

danifranco commented 1 month ago

We need to update the GUI to the new version 3.4.3 of BiaPy, where some bugs are already fixed. If you can use the notebook for the moment (everything was recently updated to 3.4.3), that's great. I will write to you through issue #4 so you can check the new version of the GUI when we finish it. I'm closing this.