HealthML / self-supervised-3d-tasks

Failed to reproduce CPC pancreas finetuning : encountered NaN-Loss in function #28

Open eleyine opened 2 years ago

eleyine commented 2 years ago

Hi,

Thank you for publishing your work. I'm trying to reproduce it in order to apply it to another dataset. However, I'm having trouble reproducing the downstream segmentation task on the 2D pancreas dataset using the CPC algorithm.

I formatted the data as 128 x 128 2D arrays, successfully trained the model on the CPC self-supervised task (reaching around 90% accuracy), and saved the model checkpoint.

When running finetune.py, I keep getting a NaN loss exception ("encountered NaN-Loss in function"). Adding clipnorm=1 and clipvalue=1 to the config file didn't help.
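Since gradient clipping made no difference, one thing worth ruling out is the input data itself. The following is a minimal NumPy sketch of such a check, not code from the repository: `summarize_array` is a hypothetical helper, and the toy `image` and `mask` arrays stand in for a real slice and its label.

```python
import numpy as np

def summarize_array(name, arr):
    """Report shape, value range, and count of NaN/Inf entries for one array."""
    arr = np.asarray(arr, dtype=np.float64)
    finite = np.isfinite(arr)
    finite_values = arr[finite]
    return {
        "name": name,
        "shape": arr.shape,
        "non_finite": int(arr.size - finite.sum()),
        # min/max are computed on finite entries only; None if there are none
        "min": float(finite_values.min()) if finite_values.size else None,
        "max": float(finite_values.max()) if finite_values.size else None,
    }

# Toy stand-ins for one image slice and its label mask (hypothetical data).
image = np.random.default_rng(0).random((128, 128))
image[3, 7] = np.nan  # simulate a single corrupted pixel
mask = np.zeros((128, 128), dtype=np.uint8)

for report in (summarize_array("image", image), summarize_array("mask", mask)):
    print(report)
```

Run over the real training slices, a non-zero `non_finite` count, an all-zero mask, or an unexpected value range (for example 0 to 255 where 0 to 1 is expected) would each be a lead worth following.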

For reference, I used the following config:

{
  "algorithm": "cpc",
  "data_dir_train": "/data/images_slices_128_labeled_single_finetune/train/img",
  "data_dir_test": "/data/images_slices_128_labeled_single_finetune/test/img",
  "model_checkpoint": "/netstore/workspace/cpc_pancreas2d_1/weights-150.hdf5",
  "dataset_name": "pancreas2d",

  "data_is_3D": false,
  "val_split": 0.05,

  "code_size": 1024,
  "patches_per_side": 5,
  "data_dim": 128,

  "loss": "weighted_dice_loss",
  "scores": ["dice", "jaccard", "dice_pancreas_0", "dice_pancreas_1", "dice_pancreas_2"],
  "metrics": ["accuracy", "weighted_dice_coefficient", "weighted_dice_coefficient_per_class_pancreas"],

  "top_architecture": "big_fully",
  "prediction_architecture": "unet_2d_upconv",
  "number_channels": 1,
  "batch_size": 64,

  "exp_splits": [10, 50, 100],
  "lr": 1e-3,
  "epochs_initialized": 400,
  "epochs_frozen": 0,
  "epochs_random": 0,
  "epochs_warmup": 5,
  "repetitions": 3
}
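Because the config selects `weighted_dice_loss`, another possible source of NaN is a 0/0 in the dice ratio when a class is absent from both the target and the prediction. The NumPy sketch below only illustrates that effect; it is not the repository's loss, and `dice_per_class` and its `smooth` parameter are hypothetical names.

```python
import numpy as np

def dice_per_class(y_true, y_pred, smooth=0.0):
    """Per-class dice coefficient for one-hot arrays of shape (N, H, W, C)."""
    axes = (0, 1, 2)  # sum over batch and spatial dims, keep the class axis
    intersection = np.sum(y_true * y_pred, axis=axes)
    denominator = np.sum(y_true, axis=axes) + np.sum(y_pred, axis=axes)
    # Silence the 0/0 warning so the NaN can be inspected directly.
    with np.errstate(invalid="ignore", divide="ignore"):
        return (2.0 * intersection + smooth) / (denominator + smooth)

# Toy batch with 3 classes; class 2 never appears in target or prediction.
y_true = np.zeros((2, 4, 4, 3), dtype=np.float32)
y_true[..., 0] = 1.0
y_true[0, :2, :2, 0] = 0.0
y_true[0, :2, :2, 1] = 1.0
y_pred = y_true.copy()  # a perfect prediction

unsmoothed = dice_per_class(y_true, y_pred)             # class 2 is 0/0
smoothed = dice_per_class(y_true, y_pred, smooth=1.0)   # class 2 stays finite
print(unsmoothed, smoothed)
```

If the loss in use has no smoothing term, a batch without any pixels of one class could produce exactly this kind of NaN regardless of gradient clipping, which would be consistent with clipnorm and clipvalue having no effect.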

Any help is appreciated. Thanks in advance!