ivadomed / ivadomed

Repository on the collaborative IVADO medical imaging project between the Mila and NeuroPoly labs.
https://ivadomed.org
MIT License

Discrepancy between pytorch model and onnx model #882

Open hermancollin opened 3 years ago

hermancollin commented 3 years ago

Documenting this weird bug I ran into while working on microscopy files.

During training, validation, and testing, I get good results and the predictions look fine visually. However, when I segment the same images with the --segment command, the results are consistently bad. To confirm that --segment itself was at fault, I segmented an image using the --test command and a dummy ground truth (gt). The results were fine, so the problem is definitely coming from --segment.

To illustrate this, here is an output myelin segmentation using --test:

And here is the output of the same sample using --segment:

This behavior is strange because, as we can see, the --segment command does not produce complete rubbish. The model seems to identify the right elements and produce a segmentation, but then outputs only the edges of that segmentation.
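One way to check the "edges only" impression numerically, rather than by eye, is to compare the --segment output against the boundary of the --test output. The sketch below is only an illustration under stated assumptions: the masks are toy nested lists of 0/1 values, not the images from this issue, and `boundary` and `dice` are hypothetical helpers, not ivadomed functions.

```python
from typing import List

Mask = List[List[int]]  # binary mask, 1 = foreground, 0 = background


def boundary(mask: Mask) -> Mask:
    """Keep foreground pixels that touch background (4-neighbourhood) or the image border."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            neighbours = [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
            for ny, nx in neighbours:
                outside = not (0 <= ny < h and 0 <= nx < w)
                if outside or not mask[ny][nx]:
                    out[y][x] = 1
                    break
    return out


def dice(a: Mask, b: Mask) -> float:
    """Dice overlap between two binary masks of the same shape."""
    inter = sum(pa and pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    total = sum(map(sum, a)) + sum(map(sum, b))
    return 2 * inter / total if total else 1.0


# Toy stand-ins (assumption): a filled 5x5 square plays the --test output,
# and its outline plays the --segment output.
test_mask = [[1 if 1 <= y <= 5 and 1 <= x <= 5 else 0 for x in range(7)] for y in range(7)]
segment_mask = boundary(test_mask)

if __name__ == "__main__":
    print("Dice(segment, test)           =", round(dice(segment_mask, test_mask), 3))
    print("Dice(segment, boundary(test)) =", round(dice(segment_mask, boundary(test_mask)), 3))
```

If the second score is much higher than the first on real outputs, that would support the reading that --segment returns the outline of an otherwise correct segmentation.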

Moreover, I have not been able to pinpoint how to reproduce this. @mariehbourget trained models on other microscopy datasets and she doesn't encounter anything like this. When I use her recently trained models, everything works as expected. This would suggest that something went wrong when I trained my models, but my config files look almost identical to the ones she used.

Yesterday, I trained 2 new models on an up-to-date master branch (some PRs were merged in the meantime) and nothing changed.

hermancollin commented 3 years ago

Here is a comparison of a configuration file that does trigger the problem vs one that doesn't:

config_microscopy_ok.json
```json
{
    "command": "train",
    "gpu_ids": [0],
    "path_output": "log_microscopy_sem",
    "model_name": "model_seg_rat_axon-myelin_sem",
    "debugging": true,
    "object_detection_params": {
        "object_detection_path": null,
        "safety_factor": [1.0, 1.0, 1.0]
    },
    "loader_parameters": {
        "path_data": ["../data_example_microscopy_sem"],
        "bids_config": "ivadomed/config/config_bids.json",
        "subject_selection": {"n": [], "metadata": [], "value": []},
        "target_suffix": ["_seg-axon-manual", "_seg-myelin-manual"],
        "extensions": [".png"],
        "roi_params": {
            "suffix": null,
            "slice_filter_roi": null
        },
        "contrast_params": {
            "training_validation": ["SEM"],
            "testing": ["SEM"],
            "balance": {}
        },
        "slice_filter_params": {
            "filter_empty_mask": false,
            "filter_empty_input": true
        },
        "slice_axis": "axial",
        "multichannel": false,
        "soft_gt": false
    },
    "split_dataset": {
        "fname_split": "ivadomed/config/20210730_tests/split_datasets.joblib",
        "random_seed": 6,
        "split_method" : "sample_id",
        "data_testing": {"data_type": "sample_id", "data_value":["sample-data15"]},
        "balance": null,
        "train_fraction": 0.7,
        "test_fraction": 0.1
    },
    "training_parameters": {
        "batch_size": 4,
        "loss": {
            "name": "DiceLoss"
        },
        "training_time": {
            "num_epochs": 200,
            "early_stopping_patience": 50,
            "early_stopping_epsilon": 0.001
        },
        "scheduler": {
            "initial_lr": 0.001,
            "lr_scheduler": {
                "name": "CosineAnnealingLR",
                "base_lr": 1e-5,
                "max_lr": 1e-2
            }
        },
        "balance_samples": {
            "applied": false,
            "type": "gt"
        },
        "mixup_alpha": null,
        "transfer_learning": {
            "retrain_model": null,
            "retrain_fraction": 1.0,
            "reset": true
        }
    },
    "default_model": {
        "name": "Unet",
        "dropout_rate": 0.3,
        "bn_momentum": 0.1,
        "final_activation": "sigmoid",
        "depth": 4,
        "length_2D": [512, 512],
        "stride_2D": [500, 500]
    },
    "postprocessing": {
        "binarize_maxpooling": {}
    },
    "transformation": {
        "Resample":
        {
            "wspace": 0.0001,
            "hspace": 0.0001
        },
        "RandomAffine": {
            "degrees": 5,
            "scale": [0.1, 0.1],
            "translate": [0.03, 0.03],
            "applied_to": ["im", "gt"],
            "dataset_type": ["training"]
        },
        "ElasticTransform": {
            "alpha_range": [28.0, 30.0],
            "sigma_range": [3.5, 4.5],
            "p": 0.1,
            "applied_to": ["im", "gt"],
            "dataset_type": ["training"]
        },
        "NormalizeInstance": {"applied_to": ["im"]}
    }
}
```
config_microscopy_not_ok.json
```json
{
    "command": "train",
    "gpu_ids": [5],
    "path_output": "output/",
    "model_name": "model_seg_human_axon-myelin_bf",
    "debugging": true,
    "object_detection_params": {
        "object_detection_path": null,
        "safety_factor": [1.0, 1.0, 1.0]
    },
    "loader_parameters": {
        "path_data": ["../../wakehealth_cropping_pipeline/"],
        "bids_config": "../../config_bids.json",
        "subject_selection": {"n": [], "metadata": [], "value": []},
        "target_suffix": ["_seg-axon-manual", "_seg-myelin-manual"],
        "extensions": [".png", ".tif"],
        "roi_params": {
            "suffix": null,
            "slice_filter_roi": null
        },
        "contrast_params": {
            "training_validation": ["BF"],
            "testing": ["BF"],
            "balance": {}
        },
        "slice_filter_params": {
            "filter_empty_mask": false,
            "filter_empty_input": true
        },
        "slice_axis": "axial",
        "multichannel": false,
        "soft_gt": false
    },
    "split_dataset": {
        "fname_split": "../../split_datasets.joblib",
        "random_seed": 6,
        "split_method" : "sample_id",
        "data_testing": {"data_type": null, "data_value":[]},
        "balance": null,
        "train_fraction": 0.7,
        "test_fraction": 0.1
    },
    "training_parameters": {
        "batch_size": 2,
        "loss": {
            "name": "MultiClassDiceLoss"
        },
        "training_time": {
            "num_epochs": 400,
            "early_stopping_patience": 200,
            "early_stopping_epsilon": 0.0001
        },
        "scheduler": {
            "initial_lr": 0.002,
            "lr_scheduler": {
                "name": "CyclicLR"
            }
        },
        "balance_samples": {
            "applied": false,
            "type": "gt"
        },
        "mixup_alpha": null,
        "transfer_learning": {
            "retrain_model": null,
            "retrain_fraction": 1.0,
            "reset": true
        }
    },
    "default_model": {
        "name": "Unet",
        "dropout_rate": 0.25,
        "bn_momentum": 0.3,
        "final_activation": "softmax",
        "depth": 3,
        "length_2D": [512, 512],
        "stride_2D": [480, 480]
    },
    "postprocessing": {
        "binarize_maxpooling": {}
    },
    "transformation": {
        "RandomAffine": {
            "degrees": 5,
            "scale": [0.1, 0.1],
            "translate": [0.03, 0.03],
            "applied_to": ["im", "gt"],
            "dataset_type": ["training"]
        },
        "ElasticTransform": {
            "alpha_range": [28.0, 30.0],
            "sigma_range": [3.5, 4.5],
            "p": 0.1,
            "applied_to": ["im", "gt"],
            "dataset_type": ["training"]
        },
        "NormalizeInstance": {"applied_to": ["im"]}
    }
}
```

Here is the output of diff. It seems the problem comes from one or more of these parameters.

diff
```bash
» diff config_microscopy_ok.json config_microscopy_not_ok.json
3,5c3,5
< "gpu_ids": [0],
< "path_output": "log_microscopy_sem",
< "model_name": "model_seg_rat_axon-myelin_sem",
---
> "gpu_ids": [5],
> "path_output": "output/",
> "model_name": "model_seg_human_axon-myelin_bf",
12,13c12,13
< "path_data": ["../data_example_microscopy_sem"],
< "bids_config": "ivadomed/config/config_bids.json",
---
> "path_data": ["../../wakehealth_cropping_pipeline/"],
> "bids_config": "../../config_bids.json",
16c16
< "extensions": [".png"],
---
> "extensions": [".png", ".tif"],
22,23c22,23
< "training_validation": ["SEM"],
< "testing": ["SEM"],
---
> "training_validation": ["BF"],
> "testing": ["BF"],
35c35
< "fname_split": "ivadomed/config/20210730_tests/split_datasets.joblib",
---
> "fname_split": "../../split_datasets.joblib",
38c38
< "data_testing": {"data_type": "sample_id", "data_value":["sample-data15"]},
---
> "data_testing": {"data_type": null, "data_value":[]},
44c44
< "batch_size": 4,
---
> "batch_size": 2,
46c46
< "name": "DiceLoss"
---
> "name": "MultiClassDiceLoss"
49,51c49,51
< "num_epochs": 200,
< "early_stopping_patience": 50,
< "early_stopping_epsilon": 0.001
---
> "num_epochs": 400,
> "early_stopping_patience": 200,
> "early_stopping_epsilon": 0.0001
54c54
< "initial_lr": 0.001,
---
> "initial_lr": 0.002,
56,58c56
< "name": "CosineAnnealingLR",
< "base_lr": 1e-5,
< "max_lr": 1e-2
---
> "name": "CyclicLR"
74,77c72,75
< "dropout_rate": 0.3,
< "bn_momentum": 0.1,
< "final_activation": "sigmoid",
< "depth": 4,
---
> "dropout_rate": 0.25,
> "bn_momentum": 0.3,
> "final_activation": "softmax",
> "depth": 3,
79c77
< "stride_2D": [480, 480]
85,89d82
< "Resample":
< {
< "wspace": 0.0001,
< "hspace": 0.0001
< },
106a100
>
```
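A line-based diff depends on formatting and key order, so a key-path comparison can make the candidate parameters easier to list and tick off one by one. The following is a minimal sketch, not part of ivadomed: `flatten` and `config_diff` are hypothetical helpers, and the two dictionaries are short excerpts of the configs above rather than the full files.

```python
import json
from typing import Any, Dict, Tuple

_MISSING = "<missing>"  # placeholder for a key present in only one config


def flatten(node: Any, prefix: str = "") -> Dict[str, Any]:
    """Map every leaf of a nested dict to a dotted key path (lists count as leaves)."""
    if isinstance(node, dict) and node:
        flat: Dict[str, Any] = {}
        for key, value in node.items():
            flat.update(flatten(value, f"{prefix}.{key}" if prefix else key))
        return flat
    return {prefix: node}


def config_diff(ok: dict, not_ok: dict) -> Dict[str, Tuple[Any, Any]]:
    """Return {key_path: (value_in_ok, value_in_not_ok)} for every differing leaf."""
    flat_ok, flat_bad = flatten(ok), flatten(not_ok)
    return {
        path: (flat_ok.get(path, _MISSING), flat_bad.get(path, _MISSING))
        for path in sorted(set(flat_ok) | set(flat_bad))
        if flat_ok.get(path, _MISSING) != flat_bad.get(path, _MISSING)
    }


# Excerpts only (assumption: the real files would be read with json.load instead).
ok_excerpt = json.loads("""{
  "training_parameters": {"batch_size": 4, "loss": {"name": "DiceLoss"}},
  "default_model": {"final_activation": "sigmoid", "depth": 4},
  "transformation": {"Resample": {"wspace": 0.0001, "hspace": 0.0001},
                     "NormalizeInstance": {"applied_to": ["im"]}}
}""")
not_ok_excerpt = json.loads("""{
  "training_parameters": {"batch_size": 2, "loss": {"name": "MultiClassDiceLoss"}},
  "default_model": {"final_activation": "softmax", "depth": 3},
  "transformation": {"NormalizeInstance": {"applied_to": ["im"]}}
}""")

if __name__ == "__main__":
    for path, (a, b) in config_diff(ok_excerpt, not_ok_excerpt).items():
        print(f"{path}: {a!r} -> {b!r}")
```

Each reported key path is one candidate to change in isolation (for example, retraining with only `default_model.final_activation` switched) to narrow down which parameter triggers the bad --segment output.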