AImageLab-zip / alveolar_canal

This repository contains the material from the paper "Improving Segmentation of the Inferior Alveolar Nerve through Deep Label Propagation"

Error trying to run inference #12

Closed · refrantz closed this issue 11 months ago

refrantz commented 11 months ago

Hello, I'm trying to run inference by executing main.py with a configuration file based on gen-inference-unet in the configs folder. I'm pointing my dataset at a folder containing the images I want to segment in .npy format, but I get the following error:

INFO:root:loading preprocessing
Traceback (most recent call last):
  File "/home/renan/anaconda3/envs/alveolar_canal/lib/python3.9/site-packages/munch/__init__.py", line 103, in __getattr__
    return object.__getattribute__(self, k)
AttributeError: 'Munch' object has no attribute 'preprocessing'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/renan/anaconda3/envs/alveolar_canal/lib/python3.9/site-packages/munch/__init__.py", line 106, in __getattr__
    return self[k]
KeyError: 'preprocessing'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/renan/alveolar_canal/main.py", line 92, in <module>
    if config.data_loader.preprocessing is None:
  File "/home/renan/anaconda3/envs/alveolar_canal/lib/python3.9/site-packages/munch/__init__.py", line 108, in __getattr__
    raise AttributeError(k)
AttributeError: preprocessing

Looking at the code, there seems to be a section that should handle a missing/None preprocessing field, but it doesn't appear to be working. Here is my yml:

# title of the experiment
title: canal_generator_train
# Where to output everything, in this path a folder with
# the same name as the title is created containing checkpoints,
# logs and a copy of the config used
project_dir: './results'
seed: 47

# which experiment to execute: Segmentation or Generation
experiment:
  name: Segmentation

data_loader:
  dataset: ./data/MG_scan_test.nii.gz
  # null to use training_set, generated to use the generated dataset
  training_set: null
  # which augmentations to use, see: augmentations.yaml
  augmentations: configs/augmentations.yaml
  background_suppression: 0
  batch_size: 2
  labels:
    BACKGROUND: 0
    INSIDE: 1
  mean: 0.08435
  num_workers: 8
  # shape of a single patch
  patch_shape:
  - 120
  - 120
  - 120
  # reshape of the whole volume before extracting the patches
  resize_shape:
  - 168
  - 280
  - 360
  sampler_type: grid
  grid_overlap: 0
  std: 0.17885
  volumes_max: 2100
  volumes_min: 0
  weights:
  - 0.000703
  - 0.999

# which network to use
model:
  name: PosPadUNet3D

loss:
  name: Jaccard

lr_scheduler:
  name: Plateau

optimizer:
  learning_rate: 0.1
  name: SGD

trainer:
  # Reload the last checkpoints?
  reload: False
  checkpoint: ./checkpoints/seg-checkpoint.pth
  # train the network
  do_train: False
  # do a single test of the network with the loaded checkpoints
  do_test: False
  # generate the synthetic dense dataset
  do_inference: True
  epochs: 100

Any help is appreciated, thanks.

LucaLumetti commented 11 months ago

Hi Renan, thank you for pointing that out. The error is raised because you removed the preprocessing key entirely: the code supports setting preprocessing: None, but it does not expect the key to be missing. Please take a look at how I handle this at this line.

I suggest you set preprocessing: None (or, better, employ a basic preprocessing like the one provided, as it greatly improves the network's performance).
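One defensive way to make that check tolerate both an explicit None and a missing key is getattr with a default. A minimal sketch (illustrative only: SimpleNamespace stands in for the Munch config object, and get_preprocessing is a hypothetical helper, not the repository's code):

```python
from types import SimpleNamespace

def get_preprocessing(data_loader):
    # getattr with a default tolerates both an explicit
    # `preprocessing: null` and a config missing the key entirely,
    # whereas a bare attribute access raises AttributeError on Munch
    return getattr(data_loader, "preprocessing", None)

# config where the key was removed (the case that crashed)
cfg_missing = SimpleNamespace(dataset="./data/maxillo")
# config with preprocessing explicitly set to null -> None
cfg_null = SimpleNamespace(dataset="./data/maxillo", preprocessing=None)

print(get_preprocessing(cfg_missing) is None)  # True
print(get_preprocessing(cfg_null) is None)     # True
```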

Moreover, I see that you set dataset: ./data/MG_scan_test.nii.gz. This will not work: the code does not expect a .nii.gz file as a dataset. Instead, it expects a folder containing a splits.json file and at least one patient folder with data.npy and gt_alpha.npy files. The supported folder structure is the one you can see at https://ditto.ing.unimore.it/maxillo. For more in-depth information, you can look at how I load the dataset here. If you would like to use a custom dataset (which can be a single file), you have to write a dataset class yourself that can be iterated and provides a numpy array. The code I have written for the Maxillo dataset is a good example, and you can have a look at it to implement your own. Also take a look at the inference method to see how I used TorchIO to load the data; there you will have to substitute Maxillo() with your custom dataset class.
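For reference, a single-file dataset along those lines could look like the sketch below. This is a rough illustration, not the repository's API: SingleVolumeDataset and the .npy path are hypothetical, and real code would load a .nii.gz with nibabel or TorchIO instead of numpy.

```python
import numpy as np

class SingleVolumeDataset:
    """Hypothetical dataset wrapping a single volume file.

    It is indexable/iterable and yields numpy arrays, which is the
    contract the inference code expects from the dataset class.
    """

    def __init__(self, path):
        # for a .nii.gz you would use nibabel or TorchIO here instead
        self.volume = np.load(path).astype(np.float32)

    def __len__(self):
        return 1

    def __getitem__(self, idx):
        if idx >= len(self):
            raise IndexError(idx)
        return self.volume
```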

Of course, if you have further questions, feel free to ask.

Luca

refrantz commented 11 months ago

Thanks for the quick answer. I had realised the mistake with the data part and will probably just redo the inference code so it can take a single file and receive its parameters through the command line. As for the preprocessing, thanks for the tip, but I can't seem to find the preprocessing field at all in the recommended experiment.yml template:

# title of the experiment
title: canal_generator_train
# Where to output everything, in this path a folder with
# the same name as the title is created containing checkpoints,
# logs and a copy of the config used
project_dir: '/path/to/results'
seed: 47

# which experiment to execute: Segmentation or Generation
experiment:
  name: Generation

data_loader:
  dataset: /path/to/maxillo
  # null to use training_set, generated to use the generated dataset
  training_set: null
  # which augmentations to use, see: augmentations.yaml
  augmentations: configs/augmentations.yaml
  background_suppression: 0
  batch_size: 2
  labels:
    BACKGROUND: 0
    INSIDE: 1
  mean: 0.08435
  num_workers: 8
  # shape of a single patch
  patch_shape:
  - 120
  - 120
  - 120
  # reshape of the whole volume before extracting the patches
  resize_shape:
  - 168
  - 280
  - 360
  sampler_type: grid
  grid_overlap: 0
  std: 0.17885
  volumes_max: 2100
  volumes_min: 0
  weights:
  - 0.000703
  - 0.999

# which network to use
model:
  name: PosPadUNet3D

loss:
  name: Jaccard

lr_scheduler:
  name: Plateau

optimizer:
  learning_rate: 0.1
  name: SGD

trainer:
  # Reload the last checkpoints?
  reload: True
  checkpoint: /path/to/checkpoints/last.pth
  # train the network
  do_train: True
  # do a single test of the network with the loaded checkpoints
  do_test: False
  # generate the synthetic dense dataset
  do_inference: False
  epochs: 100

That is the yml posted on the GitHub page; is it supposed to have a preprocessing field? It is not included in gen-inference-unet.yaml either:

title: canal_generator_train
project_dir: '/homes/llumetti/results'
seed: 47

experiment:
  name: Generation

data_loader:
  dataset: /nas/softechict-nas-1/llumetti/maxillo
  training_set: null
  augmentations: configs/augmentations.yaml
  background_suppression: 0
  batch_size: 2
  labels:
    BACKGROUND: 0
    INSIDE: 1
  mean: 0.08435
  num_workers: 8
  patch_shape:
  - 120
  - 120
  - 120
  resize_shape:
  - 168
  - 280
  - 360
  sampler_type: grid
  grid_overlap: 0
  std: 0.17885
  volumes_max: 2100
  volumes_min: 0
  weights:
  - 0.000703
  - 0.999

model:
  name: PosPadUNet3D

loss:
  name: Jaccard

lr_scheduler:
  name: Plateau

optimizer:
  learning_rate: 0.1
  name: SGD

trainer:
  reload: False
  checkpoint: '/homes/llumetti/results/canal_generator_train_4E4E6BAC96/checkpoints/best.pth'
  do_train: False
  do_test: False
  do_inference: True
  epochs: 100

LucaLumetti commented 11 months ago

That is my bad; I missed it while creating the template for some reason. You can find it in gen-training-unet.yaml. Basically, you have to add this line to the config file: preprocessing: configs/preprocessing.yaml. In preprocessing.yaml you can see that it clips values between 0 and 2100 and then rescales everything into the 0-1 float range. I have updated the template in the README.md.
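In plain numpy, the transform described there amounts to something like the following (an illustrative sketch of the clip-and-rescale step, not the repository's actual implementation; clip_and_rescale and its defaults are assumptions based on the description above):

```python
import numpy as np

def clip_and_rescale(volume, vmin=0.0, vmax=2100.0):
    # clip intensities to [vmin, vmax], then map them
    # linearly into the 0-1 float range
    v = np.clip(volume.astype(np.float32), vmin, vmax)
    return (v - vmin) / (vmax - vmin)

scan = np.array([-500.0, 0.0, 1050.0, 2100.0, 4000.0])
out = clip_and_rescale(scan)  # 0, 0, 0.5, 1, 1
```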

As always, if you have any further questions, feel free to ask. Otherwise, I'll close the issue.

refrantz commented 11 months ago

That was it, thanks for the help.