mlcommons / GaNDLF

A generalizable application framework for segmentation, regression, and classification using PyTorch
https://gandlf.org
Apache License 2.0

[BUG] Classification/regression does not seem to work when `patch_size` is smaller than image size #924

Closed: sarthakpati closed this issue 1 month ago

sarthakpati commented 3 months ago

Describe the bug

When the image size is larger than the `patch_size`, GaNDLF tries to extract multiple patches, classifies each one, and then averages the per-patch outputs to generate the final prediction [ref].
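For readers unfamiliar with this behavior, the following is a minimal sketch of patch-wise prediction followed by averaging. It is not GaNDLF's implementation: the names `extract_patches`, `predict_image`, and `toy_classifier` are hypothetical, and the non-overlapping grid tiling is an assumption made for illustration (the traceback below shows that GaNDLF obtains its patches from a torchio sampler).

```python
from typing import Callable, List, Sequence

Image = List[List[float]]  # a 2D image stored as rows of pixel values


def extract_patches(image: Image, patch_size: Sequence[int]) -> List[Image]:
    """Tile a 2D image with non-overlapping patches (edge remainders are dropped)."""
    patch_h, patch_w = patch_size
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch_h + 1, patch_h):
        for left in range(0, width - patch_w + 1, patch_w):
            patches.append([row[left:left + patch_w] for row in image[top:top + patch_h]])
    return patches


def predict_image(
    image: Image,
    patch_size: Sequence[int],
    classify_patch: Callable[[Image], List[float]],
) -> List[float]:
    """Classify every patch, then average the per-class outputs over all patches."""
    patches = extract_patches(image, patch_size)
    if not patches:
        raise ValueError("patch_size is larger than the image")
    per_patch = [classify_patch(patch) for patch in patches]
    num_classes = len(per_patch[0])
    return [
        sum(probs[c] for probs in per_patch) / len(per_patch)
        for c in range(num_classes)
    ]


def toy_classifier(patch: Image) -> List[float]:
    """Hypothetical stand-in for a trained model: class 0 if the patch is mostly bright."""
    mean = sum(sum(row) for row in patch) / (len(patch) * len(patch[0]))
    return [1.0, 0.0] if mean > 0.5 else [0.0, 1.0]


# A 4x4 image whose top-left 2x2 quadrant is bright; the rest is dark.
image = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
print(predict_image(image, (2, 2), toy_classifier))  # [0.25, 0.75]
```

One of the four patches votes for class 0 and three vote for class 1, so the averaged prediction is 0.25 and 0.75.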

To Reproduce

Steps to reproduce the behavior:

  1. Use the BraTS data and try to predict any 2 modalities (the CSV should have a single `Channel_0` header in addition to the `SubjectID` and `ValueToPredict` headers).
  2. Try to perform classification using the base testing config [ref]. Update `patch_size` to `[128,128,64]`.
  3. See error (copied from @Linardos):
    Traceback (most recent call last):
    File "FeTS_Challenge_RecEng.py", line 1161, in <module>
    restore_from_checkpoint_folder = restore_from_checkpoint_folder)
    File "/home/locolinux2/FETS2024/Challenge/Task_1/fets_challenge/experiment.py", line 457, in run_challenge_experiment
    collaborators[col].run_simulation()
    File "/home/locolinux2/FETS2024/Challenge/Task_1/venv/lib/python3.7/site-packages/openfl/component/collaborator/collaborator.py", line 170, in run_simulation
    self.do_task(task, round_number)
    File "/home/locolinux2/FETS2024/Challenge/Task_1/venv/lib/python3.7/site-packages/openfl/component/collaborator/collaborator.py", line 259, in do_task
    **kwargs)
    File "/home/locolinux2/.local/workspace/src/fets_challenge_model.py", line 48, in validate
    mode="validation")
    File "/home/locolinux2/FETS2024/Challenge/Task_1/venv/lib/python3.7/site-packages/GANDLF/compute/forward_pass.py", line 154, in validate_network
    for patch in generator:
    File "/home/locolinux2/FETS2024/Challenge/Task_1/venv/lib/python3.7/site-packages/torchio/data/sampler/weighted.py", line 65, in _generate_patches
    probability_map = self.get_probability_map(subject)
    File "/home/locolinux2/FETS2024/Challenge/Task_1/venv/lib/python3.7/site-packages/torchio/data/sampler/label.py", line 93, in get_probability_map
    label_map_tensor = self.get_probability_map_image(subject).data.float()
    File "/home/locolinux2/FETS2024/Challenge/Task_1/venv/lib/python3.7/site-packages/torchio/data/sampler/label.py", line 81, in get_probability_map_image
    raise RuntimeError(message)
    RuntimeError: No label maps found in subject Subject(Keys: ('value_value_0', '1'); images: 1) with image paths [[PosixPath('/home/locolinux2/datasets/MICCAI_FeTS2022_TrainingData/FeTS2022_01283/FeTS2022_01283_t1.nii.gz')]]

Expected behavior

The classification should work.

Media

N.A.

Environment information

This is using an older version of GaNDLF [ref], but we want to ensure that the error is either reproducible in the current master or completely resolved.

Additional context

Related to the FeTS Challenge.

Linardos commented 3 months ago

The relevant part of the YAML config file (`plan.yaml` in the FeTS Challenge) needed to reproduce this bug is:

task_runner :
  template : src.fets_challenge_model.FeTSChallengeModel
  settings :
    train_csv           : cla_test_train.csv
    val_csv             : cla_test_val.csv
    device              : cpu
    fets_config_dict  :
      problem_type: classification
      batch_size: 1
      data_augmentation: {}
      data_postprocessing: {}
      data_preprocessing:
        resize_image: [128,128,64]
        normalize: null
      in_memory: false
      learning_rate: 0.001
      loss_function: cel
      metrics:
        - cel
        - classification_accuracy
        - f1: {
            average: weighted,
          }
        - accuracy
        - balanced_accuracy
        - precision: {
            average: weighted,
          }
        - recall
        - iou: {
            reduction: sum,
          }

      model:
        amp: false
        architecture: vgg16
        base_filters: 32
        norm_type: batch
        class_list:
        - 0
        - 1
        - 2
        - 3
        dimension: 2
        final_layer: softmax
        type: torch
      nested_training:
        testing: -5
        validation: -5
      num_epochs: 1
      optimizer: 
        type: adam
      parallel_compute_command: ''
      patch_sampler: uniform
      enable_padding: True
      verbose: False
      patch_size: 
      - 128
      - 128
      - 64
      patience: 1
      q_max_length: 1
      q_num_workers: 0
      q_samples_per_volume: 1
      q_verbose: false
      save_output: false
      scaling_factor: 1
      scheduler:
        {
          type: triangle,
          min_lr: 0.00001,
          max_lr: 1,
        }
      version:
        maximum: 0.0.14
        minimum: 0.0.13
      weighted_loss: False
      which_model: resunet
      pin_memory_dataloader: false
      print_rgb_label_warning: true
      save_training: false
      scaling_factor: 1
      output_dir: "/home/locolinux2/FETS2024/Challenge/Task_1/temp"
      medcam_enabled: False

scap3yvt commented 3 months ago

Please assign to me.

scap3yvt commented 3 months ago

A few notes:

sarthakpati commented 3 months ago

I used the following csv:

SubjectID,Channel_0,ValueToPredict
1,C:/Projects/GaNDLF/testing/data/2d_rad_segmentation/001/image.png,0
2,C:/Projects/GaNDLF/testing/data/2d_rad_segmentation/002/image.png,1
3,C:/Projects/GaNDLF/testing/data/2d_rad_segmentation/003/image.png,2
4,C:/Projects/GaNDLF/testing/data/2d_rad_segmentation/004/image.png,2
5,C:/Projects/GaNDLF/testing/data/2d_rad_segmentation/005/image.png,0
6,C:/Projects/GaNDLF/testing/data/2d_rad_segmentation/006/image.png,0
7,C:/Projects/GaNDLF/testing/data/2d_rad_segmentation/007/image.png,1
8,C:/Projects/GaNDLF/testing/data/2d_rad_segmentation/008/image.png,0
9,C:/Projects/GaNDLF/testing/data/2d_rad_segmentation/009/image.png,1
10,C:/Projects/GaNDLF/testing/data/2d_rad_segmentation/010/image.png,1

I passed that in place of the CSV that the 2D classification test expects [ref], and the current code seems to work as expected: multiple predictions were made for each image that needed a final result, and they were averaged appropriately.
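To double-check that a CSV has the classification layout used in this thread (a `SubjectID` column, at least one `Channel_*` column, a `ValueToPredict` column, and no label column), something like the sketch below may help. The helper names `summarize_columns` and `is_classification_layout` are hypothetical, and the case-insensitive matching and the `Label` prefix for segmentation columns are assumptions, not GaNDLF's actual parsing logic.

```python
import csv
import io
from typing import Dict, List


def summarize_columns(csv_text: str) -> Dict[str, List[str]]:
    """Group the header names of a data CSV by their apparent role."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    roles: Dict[str, List[str]] = {
        "subject": [], "channels": [], "targets": [], "labels": []
    }
    for name in header:
        key = name.strip().lower()  # assumption: header matching ignores case
        if key == "subjectid":
            roles["subject"].append(name)
        elif key.startswith("channel"):
            roles["channels"].append(name)
        elif key.startswith("valuetopredict"):
            roles["targets"].append(name)
        elif key.startswith("label"):  # assumption: label-map column prefix
            roles["labels"].append(name)
    return roles


def is_classification_layout(csv_text: str) -> bool:
    """True if the header has subject, channel, and target columns but no label column."""
    roles = summarize_columns(csv_text)
    has_required = roles["subject"] and roles["channels"] and roles["targets"]
    return bool(has_required) and not roles["labels"]


sample = "SubjectID,Channel_0,ValueToPredict\n1,001/image.png,0\n2,002/image.png,1\n"
print(summarize_columns(sample))
print(is_classification_layout(sample))
```

A CSV without a label column gives a label-based patch sampler nothing to work with, which is consistent with the "No label maps found" message in the traceback above.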

github-actions[bot] commented 1 month ago

Stale issue message