Once the appropriate kernel is found, apply it to the GT, and call the new GT with a suffix, e.g.: sub-XXX_T1w_label-SC_seg-soft.nii.gz, or sub-XXX_T1w_label-SC_probseg.nii.gz (although I find the latter less intuitive; maybe we should revisit our convention @valosekj @sandrinebedard).
Agree. sub-XXX_T1w_label-SC_seg-soft.nii.gz is more intuitive than sub-XXX_T1w_label-SC_probseg.nii.gz.
Other possible variants: sub-XXX_T1w_label-SC_segsoft.nii.gz (i.e., without the last hyphen) and sub-XXX_T1w_label-SC_softseg.nii.gz.
I like sub-XXX_T1w_label-SC_softseg.nii.gz. I think it makes it more BIDS compliant by not having the hyphen between soft and seg.
> I like sub-XXX_T1w_label-SC_softseg.nii.gz.
Me too!
> I think it makes it more BIDS compliant by not having the hyphen between soft and seg.
Exactly!
Let's wait for @sandrinebedard opinion. I will then update our convention.
Also tagging @mguaypaq, who is well versed in BIDS.
I agree with sub-XXX_T1w_label-SC_softseg.nii.gz; it is also coherent with https://github.com/spine-generic/data-multi-subject/blob/master/derivatives/labels_softseg/sub-amu01/anat/sub-amu01_T1w_softseg.json (which is the average softseg created from spine-generic).
Some thoughts:
- _probseg.nii.gz is what's currently officially in BIDS (reference), but I don't particularly mind _softseg.nii.gz as a suffix. The standard already explicitly allows non-compliant derivatives.
- Either way, the biggest problem for BIDS compliance is actually the _T1w_ part of the filename, because it's not a key-value pair. Normally a filename is only allowed one suffix part, at the end, which is _softseg or _probseg in this case. The BIDS-recommended way to keep the original file suffix as part of a derivative file name is with a desc-<suffix> entity (see the next-to-last point here and some of the filename schemas at this link; I have an open pull request to add desc-<suffix> to the other derivative filename schemas).
- But the major downside of following this convention is that it's much harder to deal with the conversion between _T1w and _desc-T1w in a simple bash script.
I guess we concluded to change _probseg.nii.gz to _softseg.nii.gz. I did this in https://github.com/neuropoly/intranet.neuro.polymtl.ca/commit/03ca9af8753dd11795627f242350b380f4e5b890.
Mathieu's point about desc-<suffix> is relevant and should be kept in mind.
This is a slice of sub-MRS001 where, due to the presence of an MS lesion, the contrast-agnostic (C.A.) model shows substantial under-segmentation; even if we apply a dilation to these soft masks, we do not obtain a correct soft segmentation of the SC (images 2-4).
So, taking the binary GT and applying a dilation with a fixed kernel (see Notebook), we can obtain a soft mask that preserves the CSA measure (images 5-6).
Performing this procedure on all slices, we observe a preserved CSA between GT bin and GT soft.
Here is the QC for sub-MRS001.
I will continue to investigate the entire database to see if there is a significant difference between the CSA from GT bin and GT soft (for this subject the maximum difference was 0.009 mm²).
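For reference, a minimal sketch of how such a per-slice CSA comparison could be computed (assuming nibabel, and that the third axis is the slice axis; the file names are hypothetical):

```python
import nibabel as nib
import numpy as np

# Hypothetical file names; assumes the third axis is the inferior-superior axis.
gt_bin = nib.load("sub-MRS001_T1w_label-SC_seg.nii.gz")
gt_soft = nib.load("sub-MRS001_T1w_label-SC_softseg.nii.gz")

px, py = gt_bin.header.get_zooms()[:2]  # in-plane resolution (mm)
area = px * py                          # in-plane area of one voxel (mm^2)

# CSA per axial slice ~ (sum of mask values) * (pixel area), for both masks
csa_bin = gt_bin.get_fdata().sum(axis=(0, 1)) * area
csa_soft = gt_soft.get_fdata().sum(axis=(0, 1)) * area

print("max |CSA_bin - CSA_soft| =", np.abs(csa_bin - csa_soft).max(), "mm^2")
```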
This is excellent @Nilser3! I think it's the way to go.
As previously reported, here is the QC for 12 subjects from marseille-3T-mp2rage with improved SC and soft masks.
We have two params: the sigma of the Gaussian kernel (which creates the softness of the GT), and the dilation/erosion to apply (to match the CSA of the contrast-agnostic model). I suggest we fix the first one (i.e., sigma), and then find the appropriate dilation/erosion to match the CSA.
Question: how do we find the appropriate sigma?
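To make the two-parameter idea concrete, here is a rough sketch (assuming scipy; the candidate range and the CSA criterion, a simple voxel sum, are illustrative assumptions, not the actual implementation):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, binary_dilation, binary_erosion

def soften(gt_bin, sigma, n_iter):
    """Dilate (n_iter > 0) or erode (n_iter < 0) the binary GT, then
    smooth with a fixed-sigma Gaussian kernel."""
    mask = gt_bin.astype(bool)
    if n_iter > 0:
        mask = binary_dilation(mask, iterations=n_iter)
    elif n_iter < 0:
        mask = binary_erosion(mask, iterations=-n_iter)
    return gaussian_filter(mask.astype(float), sigma=sigma)

def pick_morph(gt_bin, csa_target, sigma, candidates=(-2, -1, 0, 1, 2)):
    """Pick the dilation/erosion count whose soft mask best matches the
    target CSA (approximated here by a simple voxel sum)."""
    return min(candidates,
               key=lambda n: abs(soften(gt_bin, sigma, n).sum() - csa_target))
```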
Adding my two cents here -- recently, I was trying to play with dilation/erosion and a Gaussian kernel by applying them to lesion masks in the context of MS lesion augmentation. I used this approach and saw that softness preservation was okay.
What I learned is that it was easier to fix binary dilation to the default values given in scipy and tweak the sigma value to match the softness we want.
> What I learned is that it was easier to fix binary dilation to the default values given in scipy and tweak the sigma value to match the softness we want.
Thank you for chipping in. One limitation of that approach (i.e., binary morphomath followed by smoothing) is when the GT itself is already smooth. We don't currently have to deal with that, but we might in the future (e.g., if we need to re-calibrate our GT).
> What I learned is that it was easier to fix binary dilation to the default values given in scipy and tweak the sigma value to match the softness we want.
What makes it easier?
> One limitation of that approach (i.e., binary morphomath followed by smoothing) is when the GT itself is already smooth.
Ah right! One important note on my approach is that the mask was not soft, so what you described is indeed a limitation.
> What makes it easier?
Oh, it's just one less hyperparameter to think about (i.e., the structuring element for the dilation).
> Oh, it's just one less hyperparameter to think about (i.e., the structuring element for the dilation).
OK, so you first calibrate, and then you smooth? So there is a risk that after smoothing, the CSA is changed.
> OK, so you first calibrate, and then you smooth? So there is a risk that after smoothing, the CSA is changed.
In the context of the problem discussed in this issue -- yes. But my experiments were not concerned with CSA at all. They were for simply smoothing the lesion (i.e., preserving PVE) after a lesion has been copied from a patient to a healthy subject. And I also had to use binarized GT for (nnUNet) training -- the dilation and smoothing were only to preserve PVE, not to create soft masks. With my initial comment, I just wanted to refer to some code to provide a starting point/direction; sorry if it is going off topic already 😅
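For context, the recipe described above (a fixed scipy-default dilation, with sigma as the only tuned parameter) could look roughly like this sketch:

```python
from scipy.ndimage import binary_dilation, gaussian_filter

def soften_lesion(mask_bin, sigma):
    # Fixed dilation with scipy's default structuring element (the
    # connectivity-1 cross), so sigma is the only hyperparameter left
    # to tune for the desired softness.
    dilated = binary_dilation(mask_bin)
    return gaussian_filter(dilated.astype(float), sigma=sigma)
```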
GOAL: find the appropriate smoothing kernel
Thank you @jcohenadad.
OK, following this summary, I have applied the convolve2d function from scipy.signal to the hard-seg masks (on 2D axial slices), using a kernel of the form:
```python
kernel = np.array([[factor_c, factor_a, factor_c],
                   [factor_a, factor_b, factor_a],
                   [factor_c, factor_a, factor_c]])
```
Here are the minimal MSE scores (between soft-seg and hard-seg-smooth):
The minimum MSE was 1.620882622212772e-05, with MI: 1.7332167484902985 (MSE and MI calculated on the 3D masks), for factor_a = 25, factor_b = 0 and factor_c = 39.
(Note that the result of the convolution is hard-seg-smooth-all, but our final result is the multiplication with hard-seg.)
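Putting the description above together, a sketch of the slice-wise pipeline might look like this (the normalization of the kernel by its sum is my assumption, so that the soft values stay in [0, 1]; incidentally, 4·25 + 4·39 = 256, which is consistent with such a normalization):

```python
import numpy as np
from scipy.signal import convolve2d

factor_a, factor_b, factor_c = 25, 0, 39  # best factors reported above
kernel = np.array([[factor_c, factor_a, factor_c],
                   [factor_a, factor_b, factor_a],
                   [factor_c, factor_a, factor_c]], dtype=float)
kernel /= kernel.sum()  # assumed normalization (sum is 256 here)

def hard_to_soft(hard_seg):
    """Convolve each axial slice, then multiply by the hard seg (as noted above)."""
    smooth_all = np.stack([convolve2d(hard_seg[:, :, z], kernel, mode="same")
                           for z in range(hard_seg.shape[2])], axis=2)
    return smooth_all * hard_seg  # keep softness only inside the original mask
```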
nice!
```python
kernel_3D = np.array([[[factor_c, factor_a, factor_c],
                       [factor_a, factor_c, factor_a],
                       [factor_c, factor_a, factor_c]],
                      [[factor_a, factor_c, factor_a],
                       [factor_c, factor_b, factor_c],
                       [factor_a, factor_c, factor_a]],
                      [[factor_c, factor_a, factor_c],
                       [factor_a, factor_c, factor_a],
                       [factor_c, factor_a, factor_c]]])
```
I explored different factors, and with this curve I obtained the factors that yield the minimum MSE with 3D kernels.
Indeed, I have a better MSE with factor_a = 21, factor_b = 0 and factor_c = 0: MSE = 1.5467695382768477e-05 (lower MSE than 2D).
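The factor exploration could be reproduced with a brute-force grid search along these lines (a sketch; the value range is illustrative, and the final multiplication with the hard seg follows the note above):

```python
import itertools
import numpy as np
from scipy.ndimage import convolve

def make_kernel_3d(a, b, c):
    k = np.array([[[c, a, c], [a, c, a], [c, a, c]],
                  [[a, c, a], [c, b, c], [a, c, a]],
                  [[c, a, c], [a, c, a], [c, a, c]]], dtype=float)
    return k / k.sum()

def grid_search(hard_seg, soft_seg, values=range(41)):
    """Return (best_mse, (a, b, c)) minimizing the MSE between the smoothed
    hard seg (multiplied by the hard seg, as above) and the soft seg."""
    best = None
    for a, b, c in itertools.product(values, repeat=3):
        if a + b + c == 0:
            continue  # an all-zero kernel cannot be normalized
        smooth = convolve(hard_seg.astype(float), make_kernel_3d(a, b, c)) * hard_seg
        mse = np.mean((smooth - soft_seg) ** 2)
        if best is None or mse < best[0]:
            best = (mse, (a, b, c))
    return best
```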
I have resampled the native image resolution from 1 mm isotropic to 0.5 mm isotropic, so the kernels I used previously didn't work well (MSE: 0.007665092145705405), and I started looking for 5x5x5 kernels (MSE: 0.002629207418971108).
I will continue exploring other factors/methods to answer all the previous questions.
Very nice investigations @Nilser3! A few comments:
Just a thought on a wild idea (need to flesh out the details later):
Instead of us trying to find an appropriate smoothing kernel to go from hard --> soft masks, what if we train a DL model to do that for us? Pros: (1) learning kernels is what DL models are very good at, so we'd rather outsource this to a model; (2) we have good-quality, manually corrected binary labels (so data size is not a problem).
Model inputs: binary labels. Model outputs: soft labels (yes, we don't even need the "images"!). Constraint: the SC CSA from the corrected (input) binary label should be preserved by the (output) soft label. If we design a loss function that takes care of this, then the model will automatically learn that the CSA between hard and soft seg must be preserved. There was an earlier attempt to optimize for CSA during training; this can be a good starting point for this idea.
I might be missing some obvious things, any suggestions are welcome!
EDIT: this also assumes that we're not using contrast-agnostic model predictions anywhere (i.e., the dataset/contrasts on which we want to improve the contrast-agnostic model already have QC'd binary labels).
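One way the CSA constraint could enter the loss function, as a hedged sketch in PyTorch (the MSE reconstruction term and the lambda_csa weighting are my assumptions, not a tested recipe):

```python
import torch

def csa_preserving_loss(soft_pred, hard_gt, lambda_csa=0.1):
    """Reconstruction term plus a penalty on per-slice CSA drift.

    soft_pred, hard_gt: tensors of shape (B, 1, X, Y, Z); the CSA of each
    axial slice is proportional to the sum over the (X, Y) plane.
    """
    recon = torch.mean((soft_pred - hard_gt) ** 2)       # push output toward the label
    csa_pred = soft_pred.sum(dim=(2, 3))                 # (B, 1, Z): soft CSA per slice
    csa_gt = hard_gt.sum(dim=(2, 3))                     # (B, 1, Z): binary CSA per slice
    csa_term = torch.mean(torch.abs(csa_pred - csa_gt))  # penalize CSA change
    return recon + lambda_csa * csa_term
```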
> Instead of us trying to find an appropriate smoothing kernel to go from hard --> soft masks, what if we train a DL model to do that for us? Pros: (1) learning kernels is what DL models are very good at, so we'd rather outsource this to a model; (2) we have good-quality, manually corrected binary labels (so data size is not a problem).
This is an interesting approach. My only concern is that we can only assess a CNN's performance based on a certain data distribution and test set. What if we attach too much 'trust' to the produced CNN, and one day we blindly apply it to binary segmentations and it produces wrong soft ones (e.g., because the input resolution is drastically different)? A smoothing kernel is less 'opaque' in terms of interpretability. That being said, I'm open to this idea, but I need to be convinced it works as expected under many different conditions.
Thank you for your comments @jcohenadad, @naga-karthik
Continuing my explorations using kernels, I propose:
- resampling the hard-seg masks to 0.1 mm only in the axial plane (using sct_resample)
- applying hard_2_soft
I have this script for this purpose; here are some results in different modalities (resolutions):
[Four example figures, one per modality/resolution, with the hard-seg shown as a green outline:]
- MSE: 0.00285
- MSE: 0.000772
- MSE: 0.00506
- MSE: 0.00017
Note: I also tried generating soft-seg from hard-seg, but my results were no better than these classical approaches. @Nilser3, hold on for now (see https://github.com/sct-pipeline/contrast-agnostic-softseg-spinalcord/issues/99).
Closing this issue as we have identified other ways of enriching the contrast-agnostic model. Summary of key points:
- The original masks were created with sct_deepseg_sc 2D, which shows biased behaviours (in terms of the CSA) across contrasts.
- There are a few projects where binary ground truths of good quality already exist. They have been reviewed by a human and are reliable to use for training. However, given that the original mask was created using sct_deepseg_sc, there is over/under-segmentation. Moreover, the mask is binary, and we'd rather enrich the contrast-agnostic model using soft masks, in order to avoid reducing the softness of the model prediction (@naga-karthik observed this in previous experiments).
- One possible strategy is to apply the kernel to the GT and call the new GT with a suffix: sub-XXX_T1w_label-SC_seg-soft.nii.gz, or sub-XXX_T1w_label-SC_probseg.nii.gz (although I find the latter less intuitive; maybe we should revisit our convention @valosekj @sandrinebedard).