Open jcohenadad opened 8 months ago
I agree with these points. It would simplify the next steps. I think binarizing the GT at 0.5 makes sense: it will represent the average of the 6 contrasts without encoding the registration errors.
When we trained with nnUNet, we actually did binarize the GT, and nnUNet gave sharper boundaries and adapted better to the shape of the cord than the MONAI model (e.g. for spinal cord compressions). Maybe this will help here too!
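For anyone skimming the thread, here is a minimal sketch of what "binarizing the averaged GT at 0.5" amounts to. It is a toy 1-D example rather than the project's actual preprocessing: the six rows stand in for per-contrast masks already registered to a common space, and the function name `binarize_soft_gt` and the choice of `>=` (rather than `>`) at the threshold are assumptions made for illustration.

```python
import numpy as np


def binarize_soft_gt(soft_gt, threshold=0.5):
    """Return a hard (0/1) mask from a soft mask.

    Hypothetical helper: voxels whose averaged value is at or above
    `threshold` are kept as spinal cord.
    """
    return (np.asarray(soft_gt) >= threshold).astype(np.uint8)


# Toy data: 6 per-contrast binary masks along a line crossing the cord.
# The disagreements at the edges mimic small registration errors.
masks = np.array([
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
])

# Soft GT: voxel-wise average across the 6 contrasts.
soft_gt = masks.mean(axis=0)

# Hard GT: majority-style vote obtained by thresholding the average.
hard_gt = binarize_soft_gt(soft_gt)
```

In this toy case the single contrast that leaks into the last voxel (average 1/6) is dropped, while voxels supported by most contrasts are kept, which is the behaviour being argued for above.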
Thank you for discussing these important points and summarizing them here. For points 2-5, I have nothing else to add as I agree with all of them, and they quite succinctly describe the issues we've been facing so far! In point 1:
very bottom of the cord which is only covered by the T1w and T2w scans, creating a non-homogeneous ground truth, which could hamper the performance of the model;
This might have already been the case. The current version of the dataset is dominated by images/contrasts with smaller FoV (i.e. T2*, MTon, MToff, DWI) and the model has only seen blurry/oversegmented GT (due to possible mis-registration errors) during training. As a result, its tendency is to output predictions with (relatively) more voxels outside SC for these contrasts compared to T1w and T2w.
As for the solutions, I have one question:
!=
averaged soft GT binarized at 0.5 (which makes sense). Now, if we want to extend contrast-agnostic to, say, MP2RAGE, should we mix the soft binarized GT from spine-generic and the hard GT of MP2RAGE during (re)training?
Adapting existing ground truths on other datasets (to enrich the contrast agnostic model) would be much easier
This is definitely true. This also means much quicker analyses on the lifelong learning aspect of the model.
Just to have it documented, here's the comparison between: (i) the soft output of the model trained directly on soft masks (pred_soft), and (ii) the soft output of the model trained on binarized soft averaged GT (pred_soft_bin). Note that both are SoftSeg models; what changes is which GT masks the data augmentation is applied to: for (i) the transformations are applied directly to the soft averaged masks, and for (ii) they are applied to the binarized soft averaged masks.
This gif here was discussed in one of our meetings, showing that the model trained with binarized soft average GT is better at estimating the partial volume at the boundary for the T2star image. Note the size of the ring of soft values decreasing for the pred_soft_bin image.
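To make the difference between (i) and (ii) a bit more concrete, below is a rough 1-D sketch of how the training target changes depending on whether the augmentation is applied to the soft averaged mask or to its binarized version. It only illustrates the targets, not the model predictions shown in the gif, and everything in it is an assumption for illustration: the profile values, the half-voxel shift standing in for a spatial augmentation, and the helper names `shift_linear` and `count_partial`.

```python
import numpy as np


def shift_linear(profile, shift):
    """Shift a 1-D profile by `shift` voxels using linear interpolation.

    Hypothetical stand-in for a spatial augmentation that resamples the mask.
    """
    x = np.arange(profile.size, dtype=float)
    return np.interp(x - shift, x, profile)


def count_partial(values, eps=1e-6):
    """Count voxels that are strictly between 0 and 1 (the 'soft ring')."""
    values = np.asarray(values)
    return int(np.sum((values > eps) & (values < 1.0 - eps)))


# Toy soft averaged GT across the cord: wide ramps at the edges mimic
# blurring caused by averaging slightly mis-registered contrasts.
soft_gt = np.array([0, 0.17, 0.5, 0.83, 1, 1, 1, 0.83, 0.5, 0.17, 0])

# (i) augmentation applied directly to the soft averaged mask
target_soft = shift_linear(soft_gt, 0.5)

# (ii) soft averaged mask binarized at 0.5 first, then augmented;
# interpolation reintroduces soft values, but only at the boundary
hard_gt = (soft_gt >= 0.5).astype(float)
target_soft_bin = shift_linear(hard_gt, 0.5)

n_partial_soft = count_partial(target_soft)
n_partial_soft_bin = count_partial(target_soft_bin)
```

In this toy setup the target in (ii) is still soft after augmentation, but its soft values are confined to the voxels right at the boundary, whereas the target in (i) keeps the wider ramp from the averaging step.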
Context
I reflected on the excellent discussion we had yesterday, and I came to the conclusion that keeping a soft mask after averaging across all contrasts is quite problematic for several reasons:
Solution
For all these reasons, I am wondering if binarizing (with 0.5 threshold) the ground truth after averaging would solve many of our problems:
Related to: