Could you propose some specific API changes? If it's not too much work, you could draft a pull request that implements what you would need.
I just looked into it. When looking at the model configuration saved in the `model.pt` of the *Prostate - Multisequence* model:

```
...
'normalize_mode': 'meanstd',
...
```

Only one value is used for `normalize_mode`. I assume this is a global setting applied to all inputs. This is similar to the model I trained with the first image as type `mri` and the second image input (label) as type `none` in the `.yaml` file for training with Auto3dSeg:
```
modality: mri
extra_modalities: {image2 : none} # a second modality
```
The information should come directly from the config saved in the `model.pt`, and the number of values of `normalize_mode` should correspond to the number of input images used for training/inference, since we don't have any additional information about the training configurations used.
In a different scenario, this could become a big problem (for example image1=MRI, image2=CT).
On the other hand, if it's too much work to change Auto3dSeg itself (which would probably be the more future-proof way), we could consider adding additional tags to the `inputs` tag of the model configuration. Something like this:
```
{
  "title": "Prostate - Multisequence",
  "description": "Segments peripheral (PZ) and transition zone (TZ) on T2 and ADC sequences. Trained on Medical Segmentation Decathlon dataset",
  "subject": "human",
  "imagingModality": "MR",
  "inputs": [
    {"title": "Input T2 volume", "modality": "mri"},
    {"title": "Input ADC volume", "modality": "mri"},
    {"title": "Input volume 3", "modality": "none"}
  ],
  "versions": [
    { "url": "https://github.com/lassoan/SlicerMONAIAuto3DSeg/releases/download/Models/prostate-v1.0.0.zip" }
  ]
},
```
If a tag is not provided, the global config setting from the model could be used.
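To illustrate how the extension could consume such tags, here is a rough sketch (hypothetical field names, not an agreed-upon API): resolve a normalization mode per input, falling back to the model's global setting when an input has no tag of its own.

```python
# Hypothetical sketch: per-input normalization lookup with a global fallback.
# "inputs", "modality", and "normalize_mode" are assumed field names here,
# not an existing API of the extension.
def resolve_normalize_modes(model_description, global_mode):
    modes = []
    for input_desc in model_description.get("inputs", []):
        if "normalize_mode" in input_desc:
            modes.append(input_desc["normalize_mode"])
        elif input_desc.get("modality") == "none":
            # e.g. a label input that should not be normalized
            modes.append("none")
        else:
            modes.append(global_mode)
    return modes
```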
@lassoan Any thoughts?
It's correctly done in MONAI's Auto3DSeg algorithm template, which depends on the `.yaml` configuration file.
Adding more metadata to the input description is not a problem at all (that's why it is already a dict), and it is useful to specify the imaging modality anyway (so that we can show only compatible images in the GUI). The advantage of having all of this data in the model description is that it is easy to show in the GUI; you could filter for all the models that operate on MRI input, etc.
However, lower-level details might be better kept with the model. We already have a `model.pt` weights file and a `labels.csv` label definition; we could have an additional json or yaml file that provides detailed information on all the inputs and outputs.
Do MONAI Auto3DSeg models usually come with a yaml config file that specifies the modality for each input? Do you mean that we could/should include that config file (or perhaps a simplified version) in the model packages?
Correct; as currently implemented in the tutorials section, Auto3dSeg requires the `.yaml` file for configuring the AutoRunner.
We should probably confirm with @diazandr3s if that's always the case or what the future plans are.
At the beginning we tried to use the AutoRunner, but it was pretty bad. We had to keep modifying the yaml file because the inference inputs were expected to be in the file (not just the configuration). It also had several bugs, inference on CPU was extremely slow (about 10x slower than the current optimized script in the Slicer extension), the label definition is inadequate (no terminology), etc.
Overall, the AutoRunner script was just not good enough, and it was not possible to fix it while keeping it backward compatible. Probably Andres' plan is to develop a good script and, when it is mature and proven, go back to the MONAI folks to fix the AutoRunner script based on that.
I think a good long-term solution would be to add a minimal yaml/json file to the model .zip file that describes the input names. If we pack the zip file so that this yaml file is added first, then it should be enough to download the first few hundred KB of the zip file to get the config file, so we could quickly retrieve these additional details and show them in the GUI without having to download 300MB (or we could have a more sophisticated model store than just a GitHub release, which would allow retrieving metadata without downloading a full model).
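For example, such a file could be as small as this (a sketch; the file name and field names are made up, not an existing convention):

```yaml
# inputs.yaml (hypothetical) - packed first into the model .zip
inputs:
  - name: image
    title: Input T2 volume
    modality: mri
    normalize_mode: meanstd
  - name: image2
    title: Input ADC volume
    modality: mri
    normalize_mode: meanstd
```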
In the short term, we could add the extra input metadata to the models.json file in the extension and generate a yaml configuration file on-the-fly.
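Generating that file on-the-fly could look roughly like this (a sketch assuming PyYAML and the per-input metadata proposed above; the field names are assumptions):

```python
# Sketch: derive an Auto3dSeg-style modality configuration from per-input
# metadata (as proposed for models.json) and write it as yaml.
import yaml  # PyYAML

def write_modality_config(inputs, path):
    """inputs: list of dicts like {"title": ..., "modality": ...}."""
    config = {"modality": inputs[0]["modality"]}
    if len(inputs) > 1:
        config["extra_modalities"] = {
            f"image{i}": desc["modality"]
            for i, desc in enumerate(inputs[1:], start=2)
        }
    with open(path, "w") as f:
        yaml.safe_dump(config, f)
```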
Before implementing anything I would like to hear from @diazandr3s, too.
Most of the settings for training were saved in a dictionary in the `model.pt`, and I would assume that all the necessary information for running inference could be saved there, including individual channel normalization strategies, etc.:
```
{'_meta_': {},
 'acc': None,
 'amp': True,
 'anisotropic_scales': True,
 'auto_scale_allowed': True,
 'auto_scale_batch': True,
 'auto_scale_filters': False,
 'auto_scale_roi': False,
 'batch_size': 2,
 'bundle_root': '/workspace/outputs/segresnet_0',
 'cache_class_indices': True,
 'cache_rate': None,
 'calc_val_loss': False,
 'channels_last': True,
 'ckpt_path': "$@bundle_root + '/model'",
 'ckpt_save': True,
 'class_index': None,
 'class_names': ['val_acc_pz', 'val_acc_tz'],
 'crop_mode': 'rand',
 'crop_ratios': None,
 'cuda': True,
 'data_file_base_dir': '/workspace',
 'data_list_file_path': '/workspace/outputs/dataset_pz_tz.json',
 'debug': False,
 'determ': False,
 'early_stopping_fraction': 0.001,
 'extra_modalities': {'image2': 'mri'},
 'finetune': {'ckpt_name': "$@bundle_root + '/model/model.pt'", 'enabled': False},
 'fold': 0,
 'fork': True,
 'global_rank': 0,
 'image_size': [320, 320, 20],
 'image_size_mm_90': [200.0, 200.0, 72.00019836425781],
 'image_size_mm_median': [200.0, 200.0, 71.99999809265137],
 'infer': {'ckpt_name': "$@bundle_root + '/model/model.pt'", 'data_list_key': 'testing', 'enabled': False, 'output_path': "$@bundle_root + '/prediction_testing'"},
 'input_channels': 2,
 'intensity_bounds': [92.86586182692955, 731.8263739224138],
 'learning_rate': 0.0002,
 'log_output_file': "$@bundle_root + '/model/training.log'",
 'loss': {'_target_': 'DiceCELoss', 'include_background': True, 'sigmoid': False, 'smooth_dr': 1e-05, 'smooth_nr': 0, 'softmax': True, 'squared_pred': True, 'to_onehot_y': True},
 'max_samples_per_class': 12500,
 'mgpu': {'global_rank': 0, 'rank': 0, 'world_size': 4},
 'modality': 'mri',
 'name': 'prostate',
 'network': {'_target_': 'SegResNetDS', 'act': ['relu', {'inplace': False}], 'blocks_down': [1, 2, 2, 4, 4], 'dsdepth': 4, 'in_channels': 2, 'init_filters': 32, 'norm': 'INSTANCE_NVFUSER', 'out_channels': 3, 'resolution': [0.625, 0.625, 3.5999999046325684]},
 'normalize_mode': 'meanstd',
 'num_crops_per_image': 1,
 'num_epochs': 1250,
 'num_epochs_per_saving': 1,
 'num_epochs_per_validation': None,
 'num_images_per_batch': 1,
 'num_steps_per_image': None,
 'num_warmup_epochs': 3,
 'num_workers': 4,
 'optimizer': {'_target_': 'torch.optim.AdamW', 'lr': 0.0002, 'weight_decay': 1e-05},
 'output_classes': 3,
 'pretrained_ckpt_name': None,
 'quick': False,
 'rank': 0,
 'resample': False,
 'resample_resolution': [0.625, 0.625, 3.5999999046325684],
 'roi_size': [320, 320, 20],
 'sigmoid': False,
 'spacing_lower': [0.6000000238418579, 0.5999997456087116, 2.999998608938887],
 'spacing_median': [0.625, 0.625, 3.5999999046325684],
 'spacing_upper': [0.75, 0.7500001254625659, 4.0000004302510295],
 'start_epoch': 0,
 'stop_on_lowacc': True,
 'task': 'segmentation',
 'validate': {'ckpt_name': "$@bundle_root + '/model/model.pt'", 'enabled': False, 'invert': True, 'output_path': "$@bundle_root + '/prediction_validation'", 'save_mask': False},
 'validate_final_original_res': True}
```
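For reference, reading that dictionary back at inference time could be as simple as the following sketch (the `"config"` key is an assumption about how the checkpoint is organized and may differ between Auto3DSeg versions):

```python
# Sketch: read the training configuration back from model.pt.
import torch

checkpoint = torch.load("model.pt", map_location="cpu")
config = checkpoint.get("config", checkpoint)  # key name is an assumption

print(config.get("normalize_mode"))    # 'meanstd'
print(config.get("modality"))          # 'mri'
print(config.get("extra_modalities"))  # {'image2': 'mri'}
print(config.get("input_channels"))    # 2
```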
@diazandr3s Please let us know what you think short-term and long-term.
Hi @che85 and @lassoan,
Just came back from GTC and I'm now catching up on emails/messages. This is a great discussion. Thanks for commenting on this.
@che85: The normalization mode `meanstd` (`'normalize_mode': 'meanstd'`) uses this transform and these arguments: `NormalizeIntensityd(keys="image", nonzero=True, channel_wise=True)`. The argument `channel_wise=True` means normalization happens on each channel separately: https://docs.monai.io/en/stable/transforms.html#normalizeintensityd
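A minimal, self-contained demonstration of what `channel_wise=True` does (illustrative shapes and values, not taken from the actual inference script):

```python
import numpy as np
from monai.transforms import NormalizeIntensityd

# Two channels with very different intensity ranges (think T2 vs. ADC).
image = np.stack([
    np.random.uniform(0, 1000, size=(8, 8, 4)),  # channel 0
    np.random.uniform(0, 3, size=(8, 8, 4)),     # channel 1
]).astype(np.float32)

norm = NormalizeIntensityd(keys="image", nonzero=True, channel_wise=True)
out = norm({"image": image})["image"]

# Each channel is normalized with its own mean/std, so both end up
# roughly zero-mean and unit-std despite the different input ranges.
for c in range(out.shape[0]):
    print(c, float(out[c].mean()), float(out[c].std()))
```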
Regarding this:
> Most of the settings for training were saved in a dictionary in the `model.pt` and I would assume that all the necessary information for running inference could be saved there including individual channel normalization strategies, etc.
Agree! All the information is there. We could use that to normalize accordingly. Especially for the cases you've mentioned:
> (for example image1=MRI, image2=CT)
Currently, I don't see a use case with these two modalities (CT and MR) being used at the same time. We could easily change the inference script to manage those cases once we have a model for this scenario.
BTW, usually, the following argument in the YAML file:

```
extra_modalities: {image2 : none} # a second modality
```

is used when the second modality is different from the first one, i.e. first modality CT and second modality PET.
If all the modalities are MR, you should follow this way of specifying multimodality in the JSON file (BRATS example) - No need to modify the YAML:
```
{
    "training": [
        {
            "fold": 0,
            "image": [
                "GLI/TrainingData/BraTS-GLI-01146-000/BraTS-GLI-01146-000-t2f.nii.gz",
                "GLI/TrainingData/BraTS-GLI-01146-000/BraTS-GLI-01146-000-t1c.nii.gz",
                "GLI/TrainingData/BraTS-GLI-01146-000/BraTS-GLI-01146-000-t1n.nii.gz",
                "GLI/TrainingData/BraTS-GLI-01146-000/BraTS-GLI-01146-000-t2w.nii.gz"
            ],
            "label": "GLI/TrainingData/BraTS-GLI-01146-000/BraTS-GLI-01146-000-seg.nii.gz"
        },
        {
            "fold": 0,
            "image": [
                "GLI/TrainingData/BraTS-GLI-01419-000/BraTS-GLI-01419-000-t2f.nii.gz",
                "GLI/TrainingData/BraTS-GLI-01419-000/BraTS-GLI-01419-000-t1c.nii.gz",
                "GLI/TrainingData/BraTS-GLI-01419-000/BraTS-GLI-01419-000-t1n.nii.gz",
                "GLI/TrainingData/BraTS-GLI-01419-000/BraTS-GLI-01419-000-t2w.nii.gz"
            ],
            "label": "GLI/TrainingData/BraTS-GLI-01419-000/BraTS-GLI-01419-000-seg.nii.gz"
        },
```
I hope this helps,
@diazandr3s The combination of CT and MR was just an example. It could also be MRI in combination with a label input to guide the training/inference process. If that's the case, the label input (image2) should not be normalized.
Nonetheless, the normalization information could probably be saved directly in the model configuration. If it's one value but multiple input images, we could use that single value for normalizing all of them. If the inputs have different modalities or input types, it should be a list corresponding to the number of input channels so that the appropriate normalization takes place for each.
My use case: image=MRI and image2 is a label (normalization:none).
I hardcoded the normalization locally and it works.
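For the record, the kind of per-channel dispatch I mean looks roughly like this (a sketch, assuming `normalize_mode` could become a string or a per-channel list; `normalize_channels` is a hypothetical helper, not the extension's actual code):

```python
import numpy as np
from monai.transforms import NormalizeIntensity

def normalize_channels(image, normalize_mode):
    """image: channel-first array; normalize_mode: str or list of str."""
    modes = ([normalize_mode] * len(image)
             if isinstance(normalize_mode, str) else normalize_mode)
    assert len(modes) == len(image), "expected one mode per input channel"
    out = []
    for channel, mode in zip(image, modes):
        if mode == "none":  # e.g. a label input that guides inference
            out.append(np.asarray(channel))
        elif mode == "meanstd":
            out.append(np.asarray(NormalizeIntensity(nonzero=True)(channel)))
        else:
            raise ValueError(f"unsupported normalize_mode: {mode}")
    return np.stack(out)
```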
@lassoan @diazandr3s What should we do going forward?
Hi @che85,
> The combination of CT and MR was just an example. It could also be MRI in combination with a label input to guide the training/inference process. If that's the case, the label input (image2) should not be normalized.
Thanks for clarifying.
> Nonetheless, the normalization information could probably be saved directly in the model configuration.
I agree! But this should be solved/added in the Auto3DSeg system so we can consume the config file for inference. This MONAIAuto3DSeg plugin is for inference only. I'd suggest commenting on this directly in the MONAI repo.
> My use case: image=MRI and image2 is a label (normalization:none).
> I hardcoded the normalization locally and it works.
When running inference, how do you see the user providing the label? Is there a way to access that model?
> When running inference, how do you see the user providing the label? Is there a way to access that model?
The user will select the input volumes from the module's UI. The input image and input label were loaded as scalar volumes into Slicer and can then be selected from the MONAIAuto3dSeg UI.
The model is not public.
It is hard to develop and maintain a feature without test data. Do you think you could provide a simple small test model (maybe lower resolution, lower accuracy, but good enough to run reasonably on some specific test data sets)?
I just checked the output model's config again and the `extra_modalities` key is listed there (https://github.com/lassoan/SlicerMONAIAuto3DSeg/issues/32#issuecomment-2018321802). We might as well use that. The *Prostate - Multisequence* model can be used for testing.

My only worry with using `extra_modalities` would be that arbitrary names could be used for the additional modalities, and we are looking at a standard Python unordered dictionary.
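If we did rely on it, we would have to reconstruct the channel order from the key names, something like this sketch (assuming the `image`, `image2`, `image3`, ... naming convention holds; arbitrary key names would break it):

```python
# Sketch: recover an ordered list of per-channel modalities from the saved
# config. Relies on the imageN naming convention; arbitrary keys would
# require extra metadata.
def channel_modalities(config):
    modes = [config["modality"]]
    extra = config.get("extra_modalities", {})
    for key in sorted(extra):  # lexicographic order works for image2..image9
        modes.append(extra[key])
    return modes
```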
The current implementation only supports one type of normalization for a whole input series. In some cases, you might want different normalization applied to individual input images or no normalization at all.
https://github.com/lassoan/SlicerMONAIAuto3DSeg/blob/497ae0b94518620712a7e5ed4ba618585cd34e2a/MONAIAuto3DSeg/Scripts/auto3dseg_segresnet_inference.py#L108
and
https://github.com/lassoan/SlicerMONAIAuto3DSeg/blob/497ae0b94518620712a7e5ed4ba618585cd34e2a/MONAIAuto3DSeg/Scripts/auto3dseg_segresnet_inference.py#L138-L151