nipreps / mriqc

Automated Quality Control and visual reports for Quality Assessment of structural (T1w, T2w) and functional MRI of the brain
http://mriqc.readthedocs.io
Apache License 2.0

poor brain extraction #933

Closed: psadil closed this issue 2 years ago

psadil commented 3 years ago

Possible duplicate of #275, may be related to #554.

I'm working with a large dataset comprising mostly middle-aged and older adults. So far there are around 75 participants, and for most of them the brain masks seem off. Most participants show relatively minor issues, but there are a few where

the ventricles seem to be missed (attached screenshot: brainmask-10040)

the eyes are included (attached screenshot: brainmask-10061)

or the mask misses dramatically (attached screenshot: brainmask)

I recognize that we're dealing with data that the 3dSkullStrip documentation flags as specifically troublesome:

    Massive chunks missing:
  • If brain has very large ventricles and lots of CSF between gyri, the ventricles will keep attracting the surface inwards. This often happens with older brains.

and also the data are relatively noisy, likely due to movement. But would it be feasible to configure a more robust skull stripping workflow within MRIQC? I'd be motivated to help test solutions or write code.

oesteban commented 3 years ago

Yes, indeed this is some sort of duplicate of #275 and #554.

On the flip side, 3dSkullStrip is really fast: most images are accurately segmented in almost no time, and the current runtime of MRIQC on a T1w scan with 12 CPUs is below 6 min. Almost any more robust alternative for brain extraction would at the very least double this time.

One way around the problem is to integrate a deep-learning (DL) technique. I would prefer starting with nobrainer, because I know @satra is working on models towards QC, so we could get both the brain mask and the brain tissue segmentation at almost no time cost. The residuals of the parcellation could also be added, in one way or another, to the IQMs list.

Would you like to take on that @psadil?

oesteban commented 3 years ago

Actually, Satra already suggested kwyk: https://github.com/nipreps/mriqc/issues/904#issuecomment-793055611

satra commented 3 years ago

we could try to optimize the cpu-only version of kwyk (which is slow at the moment). the gpu version takes 30s. note that kwyk is more than just skull stripping (provides a measure of quality, and brain region segmentation). so it does a fair bit in 30s :)
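
For reference, a minimal sketch of running the kwyk container standalone. The CPU tag and the CLI options match the invocation posted later in this thread; the `latest-gpu` tag, the `--gpus` flag, and the file names are assumptions to check against the neuronets/kwyk documentation.

```
# Sketch: running kwyk outside of MRIQC (not an official MRIQC step).
# CPU container (tag and options taken from the log posted below):
docker run --rm -v $PWD:/data neuronets/kwyk:latest-cpu \
    -m bvwn_multi_prior /data/sub-01_T1w.nii.gz /data/output

# GPU container (assumed tag "latest-gpu"; requires the NVIDIA container toolkit):
docker run --rm --gpus all -v $PWD:/data neuronets/kwyk:latest-gpu \
    -m bvwn_multi_prior /data/sub-01_T1w.nii.gz /data/output
```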

psadil commented 3 years ago

Hi @oesteban and @satra. These sound like great options. I'd definitely like to take this on. I can try out both nobrainer and kwyk on our data and report back.

oesteban commented 3 years ago

we could try to optimize the cpu-only version of kwyk (which is slow at the moment)

@satra what is slow here?

I'd definitely like to take this on

@psadil that would be awesome - let's start with kwyk

satra commented 3 years ago

what is slow here?

inference on a CPU takes about 20 mins if i remember correctly. it's been a long time since i actually ran it on a CPU :)

@psadil- if you want to try it in the cloud, this notebook uses the kwyk model: https://colab.research.google.com/github/neuronets/nobrainer/blob/master/guide/inference_with_kwyk_model.ipynb

oesteban commented 3 years ago

inference on a CPU takes about 20 mins

I think that would be acceptable - does it parallelize in any way across several CPUs?

satra commented 3 years ago

does it parallelize in any way across several CPUs

i believe tensorflow does this automatically, but again i have not tested in a while.
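
A minimal sketch of what one might try when checking this: TensorFlow usually grabs all available cores, but thread counts can often be capped through environment variables. Whether the TensorFlow build inside the kwyk container honors these variables is an assumption worth verifying; the command itself is the one posted below in this thread.

```
# Hypothetical sketch: cap CPU threading for the kwyk container.
# Singularity passes the host environment through by default, so exported
# variables reach the TensorFlow process inside the container.
export OMP_NUM_THREADS=4   # respected by OpenMP/MKL-enabled TensorFlow builds (assumption)

singularity run --home $PWD docker://neuronets/kwyk:latest-cpu \
    -m bvwn_multi_prior -n 4 --save-variance --save-entropy \
    sub-10008_ses-V1_T1w.nii.gz output
```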

psadil commented 3 years ago

kwyk performs much better, but the segmentation is still off for a few participants. This is the mean_orig label map it produced for the third (last) participant shown above.

screenshot

kwyk also seems to exclude parts of the brain, specifically a chunk around the ventricles.

I ran kwyk with -n 4. The following is the variance map. I guess this is showing that across the 4 samples, variance in the excluded region around the lateral ventricles is high, meaning that some of the samples included that region? Is it common to increase the number of samples further?

variance

Call and console log:

```
$ singularity run --home $PWD docker://neuronets/kwyk:latest-cpu -m bvwn_multi_prior -n 4 --save-variance --save-entropy sub-10008_ses-V1_T1w.nii.gz output
INFO:    Using cached SIF image
Bayesian dropout functions have been loaded.
Your version: 0.5.0+0.g638edd8.dirty
Latest version: 0.5.0
2021-10-01 20:11:49.719619: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
++ Conforming volume to 1mm^3 voxels and size 256x256x256.
/opt/kwyk/freesurfer/bin/mri_convert: line 2: /opt/kwyk/freesurfer/sources.sh: No such file or directory
mri_convert.bin --conform sub-10008_ses-V1_T1w.nii.gz /tmp/tmpw6xj57rj.nii.gz
$Id: mri_convert.c,v 1.226 2016/02/26 16:15:24 mreuter Exp $
reading from sub-10008_ses-V1_T1w.nii.gz...
TR=6.95, TE=0.00, TI=0.00, flip angle=0.00
i_ras = (1, 0, 0)
j_ras = (0, 1, 0)
k_ras = (0, 0, 1)
changing data type from float to uchar (noscale = 0)...
MRIchangeType: Building histogram
Reslicing using trilinear interpolation
writing to /tmp/tmpw6xj57rj.nii.gz...
++ Running forward pass of model.
Normalizer being used 3.8537837e-07 1.0000001
64/64 [==============================] - 6804s 106s/step
++ Saving results.
/opt/kwyk/freesurfer/bin/mri_convert: line 2: /opt/kwyk/freesurfer/sources.sh: No such file or directory
mri_convert.bin -rl sub-10008_ses-V1_T1w.nii.gz -rt nearest -ns 1 output_means.nii.gz output_means_orig.nii.gz
$Id: mri_convert.c,v 1.226 2016/02/26 16:15:24 mreuter Exp $
reading from output_means.nii.gz...
niiRead(): NIFTI_UNITS_UNKNOWN, assuming mm
TR=0.00, TE=0.00, TI=0.00, flip angle=0.00
i_ras = (-1, 0, 0)
j_ras = (0, 0, -1)
k_ras = (0, 1, 0)
reading template info from volume sub-10008_ses-V1_T1w.nii.gz...
Reslicing using nearest
writing to output_means_orig.nii.gz...
/opt/kwyk/freesurfer/bin/mri_convert: line 2: /opt/kwyk/freesurfer/sources.sh: No such file or directory
mri_convert.bin -rl sub-10008_ses-V1_T1w.nii.gz output_variance.nii.gz output_variance_orig.nii.gz
$Id: mri_convert.c,v 1.226 2016/02/26 16:15:24 mreuter Exp $
reading from output_variance.nii.gz...
niiRead(): NIFTI_UNITS_UNKNOWN, assuming mm
TR=0.00, TE=0.00, TI=0.00, flip angle=0.00
i_ras = (-1, 0, 0)
j_ras = (0, 0, -1)
k_ras = (0, 1, 0)
reading template info from volume sub-10008_ses-V1_T1w.nii.gz...
Reslicing using trilinear interpolation
writing to output_variance_orig.nii.gz...
/opt/kwyk/freesurfer/bin/mri_convert: line 2: /opt/kwyk/freesurfer/sources.sh: No such file or directory
mri_convert.bin -rl sub-10008_ses-V1_T1w.nii.gz output_entropy.nii.gz output_entropy_orig.nii.gz
$Id: mri_convert.c,v 1.226 2016/02/26 16:15:24 mreuter Exp $
reading from output_entropy.nii.gz...
niiRead(): NIFTI_UNITS_UNKNOWN, assuming mm
TR=0.00, TE=0.00, TI=0.00, flip angle=0.00
i_ras = (-1, 0, 0)
j_ras = (0, 0, -1)
k_ras = (0, 1, 0)
reading template info from volume sub-10008_ses-V1_T1w.nii.gz...
Reslicing using trilinear interpolation
writing to output_entropy_orig.nii.gz...
```
satra commented 3 years ago

@psadil - thanks for the feedback. indeed you have hit on one of the training issues of kwyk. the public data we used did not have too many brains with large ventricles. this is on our todo list in terms of retraining kwyk. hopefully that will address that scenario.

if these brains happen to be part of a public dataset, could you please let us know?

also, just a note: unless you want the variance map, you can get the uncertainty map with just a single run (-n 1), which also provides a measure of scan quality and would cut the runtime to a quarter.
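
A minimal sketch of that single-sample call, adapted from the command posted above (same container, model, and input file; only the flags already used in this thread are assumed to exist):

```
# Single forward pass: one mean segmentation plus the entropy (uncertainty) map,
# adapted from the earlier -n 4 invocation in this thread.
singularity run --home $PWD docker://neuronets/kwyk:latest-cpu \
    -m bvwn_multi_prior -n 1 --save-entropy \
    sub-10008_ses-V1_T1w.nii.gz output
```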

psadil commented 3 years ago

Sorry for the slow follow-up.

@satra Unfortunately, these brains are not part of a public dataset. You mention that retraining kwyk is on your todo list; did you have a dataset in mind? Thanks for the tip about -n 1.

@oesteban fwiw, fmriprep is being run on these same participants, so I'll be able to compare the extraction with that setup. As you say, it'd be slower, but if the masks look at least okay we're considering using them as a way to get QC metrics. But I guess you're saying that implementation is too slow for mriqc?

oesteban commented 2 years ago

I'm closing #275 and #554 in favor of this one.

The rationale is as follows: MRIQC is an assessment tool and is not intended for preprocessing. From this angle, it would even be beneficial for the tool to generate some bad masks at random, so long as the errors do not bias automated classifiers trained on IQMs into making the wrong calls.

That's assuming there's such a tool that works (MRIQC's classifier clearly doesn't).

So, if the masks are to be fixed, it must be with an easy solution that is very reliable (i.e., one that does not underperform on other datasets once it has been shown to work on the failing cases).

kwyk is exactly that: it promises very high reliability (and generalization potential, e.g., to monkeys) and even offers a way of generating new features, or of replacing the classifier with them.

satra commented 2 years ago

thanks to @ahoopes an even better tool has just come online: https://surfer.nmr.mgh.harvard.edu/fswiki/SynthStrip

if you already have freesurfer inside the container, then this will be included with freesurfer. if not, you can probably install it. depending on whether the needs are brain masking, parcellation, uncertainty estimation or something else, there are a bunch of cool tools available :)

romainVala commented 2 years ago

Hi @satra, do you know where SynthStrip comes from (any publication)?

ROBEX is also a good candidate: https://sites.google.com/site/jeiglesias/ROBEX (also from the FreeSurfer group, though it may be out of date compared with the newer deep-learning tools).

satra commented 2 years ago

@romainVala - it's an ismrm abstract for this year, but i believe a full paper is in the process of submission.

psadil commented 2 years ago

Oh, SynthStrip looks great. Running it on a few of these participants that had poor extraction results in MRIQC produces much better brain masks (and indeed it only took a minute or so). But it sounds like that's to be expected, given the range of images on which SynthStrip has been evaluated.

romainVala commented 2 years ago

any abstract / article on SynthStrip ? (sorry I can not find it ...)

oesteban commented 2 years ago

@ahoopes, I'd be eager to give it a try - but the docker image is built (only) with Python 3.6 bindings. We are currently deploying Python 3.8 (Python 3.6 reached its EOL on Dec 23, 2021).

Is there any way of installing SynthStrip without compiling FreeSurfer completely?

ahoopes commented 2 years ago

Glad to hear SynthStrip is working well so far! Though, just a heads up - we're still in the middle of officially "releasing" it, so the wiki page might go through some changes in the next few days (for example, I'm about to change the command name to mri_synthstrip).

I will make a preprint of the paper available in the next week and update the thread when that's the case.

@oesteban aside from downloading FS or using the standalone Docker/Singularity wrappers on the wiki, there is probably not an easy way to use it with your own Python at the moment. However, the freesurfer Python package that uses C bindings will soon be retired, as I'm going to upload a more accessible package to PyPI. That is still a week or two away, but I can keep you updated on the status.
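
For anyone landing on this thread, a minimal sketch of the two routes mentioned above; the exact flags, file names, and the container image name are assumptions to verify against the SynthStrip wiki page, not a confirmed interface.

```
# Route 1: SynthStrip bundled with a FreeSurfer installation
# (command name per the comment above; flags -i/-o/-m are assumptions, check --help).
mri_synthstrip -i sub-01_T1w.nii.gz -o sub-01_desc-brain_T1w.nii.gz -m sub-01_brainmask.nii.gz

# Route 2: the standalone container wrapper from the wiki
# (image name "freesurfer/synthstrip" is an assumption here).
docker run --rm -v $PWD:/data freesurfer/synthstrip \
    -i /data/sub-01_T1w.nii.gz -o /data/sub-01_desc-brain_T1w.nii.gz -m /data/sub-01_brainmask.nii.gz
```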

ahoopes commented 2 years ago

SynthStrip preprint is available on arxiv at https://arxiv.org/abs/2203.09974

romainVala commented 2 years ago

Great job! I am convinced that synthetic approaches are really the way to go for learning a segmentation task. (Not related, but I have recently also seen your work on white-matter surface learning: great job too!)