Deep-MI / FastSurfer

PyTorch implementation of FastSurferCNN
Apache License 2.0

Issues with training using own segmentation data as well as FreeSurfer outputs #203

Status: Closed (by Dorrien89, 1 year ago)

Dorrien89 commented 2 years ago

Question/Support Request

Hello! I'm currently trying to train FastSurfer using images from the Hammers atlas database (30 MR images with corresponding segmentation files). I tried training using the Hammers atlas segmentation files as ground truth, but the results were poor. I then followed your training instructions and used the outputs from FreeSurfer as ground truth; the results, although better, were still lacking.

Any idea what I’m doing wrong?

Kind regards, Carl

Screenshots

Running eval.py on a test image gives the following results:

[Screenshot 1] [Screenshot 2]

The first image is from the model trained on Hammers atlas segmentations and the second from the model trained on FreeSurfer outputs. The purple is either the right or left inferior lateral ventricle.

Environment

Execution

Example of the generate_hdf5.py command:

```bash
python3 FastSurferCNN/generate_hdf5.py \
    --hdf5_name /cephyr/NOBACKUP/groups/braintumorquant/carl/exp/training_data_fast/trn_set_cispa_axial.hdf5 \
    --csv_file /cephyr/NOBACKUP/groups/braintumorquant/carl/exp/csv_trn_fast.csv \
    --plane axial \
    --image_name mri/orig.mgz \
    --gt_name mri/aparc.DKTatlas+aseg.mgz \
    --gt_nocc mri/aseg.auto_noCCseg.mgz
```

Example of the train.py command:

```bash
python3 train.py \
    --hdf5_name_train /cephyr/NOBACKUP/groups/braintumorquant/carl/exp/training_data_fast/trn_set_cispa_sagittal.hdf5 \
    --hdf5_name_val /cephyr/NOBACKUP/groups/braintumorquant/carl/exp/training_data_fast/val_set_cispa_sagittal.hdf5 \
    --plane sagittal \
    --log_dir ../checkpoints/Sagittal_Competitive_APARC_ASEG/ \
    --epochs 30 \
    --num_channels 7 \
    --num_classes 51 \
    --batch_size 16 \
    --validation_batch_size 14
```

Log files (from the run using FreeSurfer outputs): generate_hdf5_log.txt, eval_log.txt, training_log.txt

LeHenschel commented 2 years ago

I would need a bit more information, but I think some things are not correct, at least in the prediction call. The hdf5 generation and the network training seem fine (the loss is decreasing and the mIoUs look reasonable for most classes), so I would expect a better result than what you are getting here.

  1. Is the left hemisphere really set to 0 in the prediction after retraining, or is it just not covered in the look-up table? Overall, what classes are you predicting, and which do you want to predict?
  2. What do the plotted results for the validation set look like during training (I think these should be written out every two epochs into args.logdir/logs)?
  3. Can you show a picture or point me to the URL to download an example intensity image (is the Hammers atlas database open access?)?
  4. Does the Hammers atlas contain the same segmentations as the DKTatlas? If not, you would need to adjust the generate_hdf5.py part, because a few functions are hardcoded for the DKTatlas classes (at least in the v1.x versions of FastSurfer); see the label-remapping sketch after this list. You can also see that in the log file you provided (listed classes = DKTatlas).
  5. Can you share the command used to generate both images?
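Editorial aside on point 4: adapting a different atlas means remapping its raw label IDs onto the contiguous class indices the network trains on before HDF5 generation. A minimal sketch of the idea; the mapping, label IDs, and function names here are hypothetical, not FastSurfer's actual lookup code:

```python
import numpy as np

# Hypothetical mapping from raw atlas label IDs (as stored in the
# ground-truth volume) to contiguous training class indices; the
# actual IDs depend on the atlas being used.
LABEL_TO_CLASS = {0: 0, 17: 1, 53: 2, 1002: 3}

def remap_labels(gt_volume: np.ndarray) -> np.ndarray:
    """Replace raw atlas label IDs with contiguous 0..N-1 class indices;
    any label not in the mapping falls back to background (0)."""
    out = np.zeros_like(gt_volume)
    for label_id, class_idx in LABEL_TO_CLASS.items():
        out[gt_volume == label_id] = class_idx
    return out
```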
soundray commented 2 years ago

Thanks for the prompt reply! I'm working with Carl on this issue, and I'll answer Point 3: The Hammers Atlas Database is available at brain-development.org (Atlases - Adult Brain Atlases), free for academic use. The publicly available T1-weighted images are defaced, but we were using the full images.

Regarding point 4: the native segmentation labels are profoundly different from the DKTatlas. To disentangle this issue from other issues we have, Carl tried to train FastSurfer on Hammers MR images with FreeSurfer labels (ignoring the native labels). The second image in his post is sample output from that model.

Dorrien89 commented 2 years ago

1 & 4: I'm not looking to predict any specific classes; I just wanted to see if I could train FastSurfer using the Hammers atlas database. The two atlases do not share the same segmentations, which explains the results.

  2. I am not 100% sure which files you are referring to, but if it's the validation predictions and the Dice CM in checkpoints/Axial_Competitive_APARC_ASEG/logs/, I have attached those from epoch 30 (from the training with the FreeSurfer outputs).
  3. The Hammers atlases can be found at https://brain-development.org/brain-atlases/; the specific atlas we used is not listed there, however.
  5. The command I used was the same for both images, only changing the paths for the checkpoints and the output directory:

```bash
python FastSurferCNN/eval.py \
    --i_dir /cephyr/NOBACKUP/groups/braintumorquant/carl/exp/content2/orig/orig/ \
    --o_dir /cephyr/NOBACKUP/groups/braintumorquant/carl/exp/content2/orig/FastSurfer-stable_trained_fast/ \
    --network_sagittal_path checkpoints/Sagittal_Competitive_APARC_ASEG/ckpts/Epoch_30_training_state.pkl \
    --network_coronal_path checkpoints/Coronal_Competitive_APARC_ASEG/ckpts/Epoch_30_training_state.pkl \
    --network_axial_path checkpoints/Axial_Competitive_APARC_ASEG/ckpts/Epoch_30_training_state.pkl
```

Attachments: Epoch_30_Validation_Dice_CM.pdf, Epoch_30_Validations_Predictions.pdf
LeHenschel commented 2 years ago

Ok, great. Thanks a lot for the info. I will see if I can download an example image to find out what is going on. The newly trained networks should at least give better results than what you see from eval.py. If you look at the PDF output, all classes are present and both hemispheres are correctly predicted (minus the background, which for some reason gets assigned to a wrong class). It looks to me like there might be some switch in the views (view aggregation failing). Can you test what happens if you only predict with one view?

Dorrien89 commented 2 years ago

I'm afraid I'm unsure exactly what you mean. How do I predict with one view?

I re-ran eval.py using the trained checkpoints for one slice direction at a time, with the others as default, and all three times the images looked normal.

LeHenschel commented 2 years ago

I meant creating a prediction with only the axial checkpoint, only the coronal, or only the sagittal one (i.e., no view aggregation). You should be able to do that by just commenting out the other views in the fastsurfercnn function of eval.py.
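For illustration, view aggregation conceptually combines the per-view class probability maps, so single-view prediction amounts to keeping only one term. A hedged sketch of the idea; the function and variable names are hypothetical, not the actual eval.py code:

```python
import torch

def aggregate_views(prob_axial, prob_coronal, prob_sagittal):
    """Sum the per-class softmax probability maps (C, H, W) produced by
    the three 2D networks and take the argmax over the class dimension."""
    combined = prob_axial + prob_coronal + prob_sagittal
    return torch.argmax(combined, dim=0)

# Single-view prediction, e.g. axial only, drops the other two terms:
# pred = torch.argmax(prob_axial, dim=0)
```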

> I re-ran eval.py using the trained checkpoints for one slice direction at a time, with the others as default, and all three times the images looked normal.

So it works fine if you use your training checkpoint for one view and the original FastSurfer ones for the other two? That is a bit odd... I would have thought that there was some kind of error in the view aggregation, or a mix-up in the hdf5 files leading to a wrong orientation of the input image (i.e., different from the training step). But in that case, you should not get a correct image with the default old checkpoints plus one of your new ones.

Dorrien89 commented 2 years ago

Ah, thank you. I followed your instructions and this is the result for each plane:

Axial: [screenshot]

Coronal: [screenshot]

Sagittal: [screenshot]

LeHenschel commented 2 years ago

Is this with your retrained checkpoints?

Dorrien89 commented 2 years ago

Yes, the ones trained with my FreeSurfer outputs.

dkuegler commented 2 years ago

Hi @Dorrien89,

can you please confirm the size of your dataset? It seems to me the dataset is 30 subjects, and it is unclear whether you are splitting it. To get a valid method, we recommend a three-way split into training, validation and test sets. https://towardsdatascience.com/how-to-split-data-into-three-sets-train-validation-and-test-and-why-e50d22d3e54c
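For reference, a subject-level three-way split could look like this (a minimal sketch using scikit-learn; the subject IDs and split sizes are placeholders, not a recommendation specific to FastSurfer):

```python
from sklearn.model_selection import train_test_split

subjects = [f"subject_{i:02d}" for i in range(30)]  # placeholder IDs

# Hold out a test set first, then split the remainder into train/val.
train_val, test = train_test_split(subjects, test_size=6, random_state=42)
train, val = train_test_split(train_val, test_size=6, random_state=42)
print(len(train), len(val), len(test))  # 18 6 6
```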

Also, are those screenshots of a seen or unseen image?

Cheers, David

LeHenschel commented 2 years ago

Good point. It would be good to at least test with a subject from the validation set, as these seemed to generate good results during training (based on the mIoU values and the plots you provided in Epoch_30_Validations_Predictions.pdf), or with a training case as well.

Dorrien89 commented 2 years ago

Hello @dkuegler!

Yes, my dataset consists of 30 subjects; I used 16 for training and 14 for validation in this specific example. I also tried varying the training and validation set sizes, but I got similar results. The image I've been using for evaluation is not part of the Hammers atlas database but one I downloaded from the FreeSurfer tutorial wiki.

I'm afraid I don't know what you mean by seen or unseen images.

@LeHenschel I ran eval.py on one of the validation images. The segmentation looks better, but I still get labelling of the background: [screenshot]

Thank you for all of your replies, I really appreciate it!

LeHenschel commented 2 years ago

Ok, so the reason the other FreeSurfer tutorial image fails is probably that it is too different from the training and validation set. The Hammers atlas images seem to be very specific.

Overall, I would suggest a) training with augmentations or increasing your training corpus if you want to generalize better, and b) training the network a bit longer to see if the background issue disappears. It might also be worth testing a different optimizer/learning-rate scheduler combination: in the newest FastSurfer version we use AdamW + CosineAnnealingWarmRestarts, which overall gave better performance than Adam + StepLR.
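For orientation, wiring up that combination in plain PyTorch looks roughly like this (a minimal sketch; the model and all hyperparameter values are placeholders, not FastSurfer's actual settings):

```python
import torch

model = torch.nn.Linear(10, 2)  # stand-in for the actual network

# AdamW decouples weight decay from the gradient-based update.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-4)

# Cosine annealing with warm restarts: the learning rate follows a cosine
# decay and restarts every T_0 epochs (period scaled by T_mult thereafter).
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(
    optimizer, T_0=10, T_mult=2, eta_min=1e-6
)

for epoch in range(30):
    # ... forward pass, loss.backward(), optimizer.step(), optimizer.zero_grad() ...
    scheduler.step()  # advance the cosine schedule once per epoch
```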

LeHenschel commented 1 year ago

So I just ran some of the subjects from the Hammers atlas, and it works fine with FastSurfer (CNN and VINN). The 16 subjects you are using for training are probably not enough for the network to learn to generalize well. Both points mentioned above might help in improving the results (see example subjects a01 and a02 below).

a01: [screenshot]

a02: [screenshot]
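As an editorial aside on the augmentation suggestion above, intensity-level augmentations can be applied on the fly in the training dataset; a minimal sketch in plain PyTorch (the parameters are illustrative only, not what FastSurfer uses):

```python
import torch

def augment_intensity(img: torch.Tensor) -> torch.Tensor:
    """Illustrative intensity-only augmentations for an input slice.
    Geometric augmentations (flips, rotations) would also have to be
    applied to the label map, with left/right labels swapped on flips."""
    scale = 1.0 + 0.1 * (torch.rand(1).item() - 0.5)  # random global intensity scaling
    img = img * scale
    img = img + 0.01 * torch.randn_like(img)          # mild Gaussian noise
    return img
```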

Dorrien89 commented 1 year ago

Thank you for the suggestions. I'll look into getting additional images for training and validation to see if that improves my results.

I have a question regarding the optimizer and scheduler: are AdamW and CosineAnnealingWarmRestarts available in the official FastSurfer release? I'm using the latest version, and I looked through the train.py code but couldn't find any mention of them.

LeHenschel commented 1 year ago

They are available in the VINN branch (see https://github.com/Deep-MI/FastSurfer/blob/feature/vinn/FastSurferCNN/models/optimizer.py#L30 for AdamW and https://github.com/Deep-MI/FastSurfer/blob/feature/vinn/FastSurferCNN/utils/lr_scheduler.py#L28 for CosineAnnealingWarmRestarts). You can either use that branch or copy the code fragments into train.py.

LeHenschel commented 1 year ago

I am closing this issue for now. Feel free to reopen it if the suggested workarounds do not give a satisfactory result or if other questions come up.