MIC-DKFZ / nnUNet

Apache License 2.0

Pretraining shows no progress #1659

Closed Chris-N-K closed 9 months ago

Chris-N-K commented 1 year ago

Hi, I am trying to use the pretraining and finetuning workflow. I followed the example and did the pretraining step. During pretraining, the training and validation losses improve for about 30 epochs and then reach stable values. The pseudo Dice behaves similarly: it increases for 30 epochs and then falls to zero for all classes. Everything stays that way for the remaining epochs up to 1000.

I expected to see some training progress, so I get the feeling this should not be the case. Is my guess correct, or is everything fine and I should proceed with the finetuning?

Best Chris.

seziegler commented 1 year ago

Hi @Chris-N-K, this seems odd. Can you please share the progress.png that is created during training, and also the exact commands that you used for preprocessing, plan transfer, training, etc.?

Best, Sebastian

Chris-N-K commented 1 year ago

Thanks for the fast reply. The commands were:

nnUNetv2_plan_and_preprocess -d my_DS_I_want_to_finetune_on
nnUNetv2_extract_fingerprint -d my_DS_I_want_to_pretrain_on
nnUNetv2_move_plans_between_datasets -s my_DS_I_want_to_finetune_on -t my_DS_I_want_to_pretrain_on -sp nnUNetPlans -tp DSxxx_pretraining_plans
nnUNetv2_preprocess -d my_DS_I_want_to_pretrain_on -plans_name DSxxx_pretraining_plans
nnUNetv2_train DSxxx_pretraining_plans 3d_fullres all -p DSxxx_pretraining_plans

I changed the plans name in the transferred plans file after preprocessing, as otherwise the training overwrites the original output of training on my_DS_I_want_to_pretrain_on. The behavior is the same if I don't do this.

I tested it with two datasets for finetuning. The first contained images with a smaller FOV than the pretraining dataset; the second was the same images resampled to the same FOV and spacing.

[Attached images: two progress.png training plots]

seziegler commented 1 year ago

Thanks! The commands seem right. How do the source and target datasets differ? Are they similar modalities? Can you please also share the dataset.json of the source and target datasets? Another thing you can try is to use the nnUNetTrainerDiceCELoss_noSmooth trainer instead of the default trainer. This seemed to help in another issue (#812) with a similar problem but without pretraining.

Chris-N-K commented 1 year ago

The two datasets are of slightly different modality. Both consist of T2-weighted MR images, but from different sequences. The target dataset images are summations over different echo times. Thus, they look quite similar in general but have quite different intensity ranges (as the target dataset images are sums).

{
    "channel_names": {
        "0": "T2w SPACE"
    },
    "labels": {
        "background": 0,
        "S1l": 1,
        "S1r": 2,
        "L5l": 3,
        "L5r": 4
    },
    "numTraining": 177,
    "file_ending": ".nii.gz",
    "name": "MixedDRG",
    "reference": "",
    "release": "",
    "licence": "",
    "description": ""
}
{
    "channel_names": {
        "0": "Synthetic Anatomical Refference"
    },
    "labels": {
        "background": 0,
        "S1l": 1,
        "S1r": 2,
        "L5l": 3,
        "L5r": 4
    },
    "numTraining": 30,
    "file_ending": ".nii.gz",
    "name": "MixedDRG-T2MapAnaRef",
    "reference": "",
    "release": "",
    "licence": "",
    "description": ""
}

seziegler commented 1 year ago

OK, different intensities are a problem here, since they are also transferred and the normalization will not work well on the pretraining data. Since the modalities are quite similar and one is just a sum, is it possible to divide the target dataset images by the number of echo times to recover the original intensity range? Then you could redo the planning and transfer, and hopefully see better results during pretraining.
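The division suggested above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `rescale_summed_echoes` and `intensity_range` are hypothetical, the voxel values are assumed to be already loaded into a flat sequence of numbers (reading and writing the .nii.gz files is left out), and the number of echo times has to be supplied by the user because it is not stated in this thread.

```python
from typing import Iterable, List, Tuple


def rescale_summed_echoes(voxels: Iterable[float], num_echoes: int) -> List[float]:
    """Divide every voxel of a summed-echo image by the number of echoes.

    Assumption: the image really is a plain sum over `num_echoes` echo times,
    so the division brings it back to a single-echo intensity scale.
    """
    if num_echoes <= 0:
        raise ValueError("num_echoes must be a positive integer")
    return [value / num_echoes for value in voxels]


def intensity_range(voxels: Iterable[float]) -> Tuple[float, float]:
    """Return (min, max) of the voxel values, for a quick range comparison."""
    values = list(voxels)
    if not values:
        raise ValueError("cannot compute the range of an empty image")
    return min(values), max(values)


if __name__ == "__main__":
    # Toy values standing in for one summed-echo image (hypothetical data).
    summed = [30.0, 60.0, 90.0]
    rescaled = rescale_summed_echoes(summed, num_echoes=3)
    print("before:", intensity_range(summed))
    print("after: ", intensity_range(rescaled))
```

Comparing the range before and after rescaling, and against the pretraining images, is a quick way to see whether the two datasets have ended up on a similar scale.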

Chris-N-K commented 1 year ago

That's a good tip and sounds absolutely plausible. Will do this tomorrow and message back. Thanks :)

Chris-N-K commented 1 year ago

Sadly, it did not do the trick. I guess the images are still too different. What application was the pretraining and finetuning workflow intended for?

seziegler commented 1 year ago

The workflow was intended for exactly this kind of use case.

Is it still the exact same behavior at epoch 30, or did something change? Have you tried the nnUNetTrainerDiceCELoss_noSmooth trainer? When you redo the planning, I think you need to delete the old plans first, because otherwise nnU-Net will just take the old plans and not compute anything new. Have you done that?
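Before deleting anything, it can help to see which plans files are still present for a dataset. The following is a minimal sketch, not part of nnU-Net: `list_plans_files` is a hypothetical helper, and it assumes that plans are stored as JSON files directly inside the dataset's folder under the preprocessed directory, next to files such as dataset.json. The function only lists candidates and deletes nothing.

```python
from pathlib import Path
from typing import List

# JSON files assumed to sit next to the plans without being plans themselves.
NON_PLAN_FILES = {"dataset.json", "dataset_fingerprint.json", "splits_final.json"}


def list_plans_files(dataset_dir: Path) -> List[Path]:
    """Return the JSON files directly inside `dataset_dir` that look like plans.

    Assumption: every top-level JSON file that is not in NON_PLAN_FILES is a
    plans file. Check the result by hand before removing any of them.
    """
    return sorted(
        path for path in dataset_dir.glob("*.json") if path.name not in NON_PLAN_FILES
    )


if __name__ == "__main__":
    import sys

    # Usage: python list_plans.py /path/to/preprocessed/dataset_folder
    for plans_file in list_plans_files(Path(sys.argv[1])):
        print(plans_file)
```

Whether stale plans played any role in this particular case is not established in the thread; the listing is just a way to check.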

Chris-N-K commented 1 year ago

No, I did not try the noSmooth trainer yet; maybe I will have time for that today. The intensity ranges of the images are still quite different, by up to a factor of 10 in some images. In general, the target dataset has big differences in intensity range between images. Hm... I removed the plans and preprocessed files from the source dataset, but maybe I did not remove the plans file from the target dataset. Will test this.

Chris-N-K commented 1 year ago

@seziegler, it seems like nnUNetTrainerDiceCELoss_noSmooth does the trick. Any idea why this is the case? I don't directly see why the smoothing should compromise the training process.

seziegler commented 1 year ago

Good to hear that it works! As mentioned in #812, there might be a problem with the implementation of the smoothing term. Even though it has changed in nnU-Net v2, there might still be something wrong. I'll bring it to Fabian's attention. Best, Sebastian
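For readers unfamiliar with the smoothing term, here is a small sketch of a commonly used soft Dice formulation in which a constant `smooth` is added to both numerator and denominator. It is a generic illustration and an assumption on my part, not a copy of the nnU-Net loss, and it does not diagnose the problem discussed in #812. The function name `soft_dice` and the `eps` guard are hypothetical choices for this example.

```python
def soft_dice(tp: float, fp: float, fn: float, smooth: float = 0.0, eps: float = 1e-8) -> float:
    """Dice score computed from summed true positives, false positives and false negatives.

    `smooth` is added to numerator and denominator; `eps` only guards against
    division by zero when everything is zero and `smooth` is 0.
    """
    numerator = 2.0 * tp + smooth
    denominator = max(2.0 * tp + fp + fn + smooth, eps)
    return numerator / denominator


if __name__ == "__main__":
    # Class present with 5 foreground voxels, prediction is all background.
    print("missed class, no smoothing:  ", soft_dice(tp=0, fp=0, fn=5))
    print("missed class, smooth = 1:    ", soft_dice(tp=0, fp=0, fn=5, smooth=1.0))
    # Class absent from the patch and nothing predicted.
    print("absent class, no smoothing:  ", soft_dice(tp=0, fp=0, fn=0))
    print("absent class, smooth = 1:    ", soft_dice(tp=0, fp=0, fn=0, smooth=1.0))
```

In this formulation, a completely missed class with five foreground voxels scores 0 without smoothing and 1/6 with `smooth=1`, and a class that is absent and not predicted scores 1 with smoothing. This shows how much the term can matter for small structures; whether that is what stalled the training here is left open in the thread.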