DeepRegNet / Benchmark

Repositories for benchmarks with DeepReg and other methods.
Apache License 2.0

Balakrishnan2019TMI - combining weak supervision #8

Open YipengHu opened 3 years ago

YipengHu commented 3 years ago

Benchmark the selected experiments described in: Balakrishnan, G., Zhao, A., Sabuncu, M.R., Guttag, J. and Dalca, A.V., 2019. Voxelmorph: a learning framework for deformable medical image registration. IEEE transactions on medical imaging, 38(8), pp.1788-1800.

This is related to #3.

Summary:

Tasks: Unsupervised algorithms with segmentation-based weak supervision

Transformation: predicting the spatial transformation as a dense displacement field (DDF)

Network and loss: encoder-decoder with 2^4 resampling; unsupervised loss: MSE and LNCC; regulariser: L2-norm of the displacement gradient; label loss: Dice over a fixed number of labels; difference: leaky_relu activation
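A rough sketch of how those terms could be combined into one training loss (a hedged illustration with assumed tensor names, shapes, and weights; not DeepReg's or VoxelMorph's actual API):

```python
import tensorflow as tf

def displacement_gradient_l2(ddf):
    """L2-norm of the spatial gradient of the dense displacement field.

    ddf: tensor of shape (batch, x, y, z, 3).
    """
    dx = ddf[:, 1:, :, :, :] - ddf[:, :-1, :, :, :]
    dy = ddf[:, :, 1:, :, :] - ddf[:, :, :-1, :, :]
    dz = ddf[:, :, :, 1:, :] - ddf[:, :, :, :-1, :]
    return (tf.reduce_mean(dx ** 2)
            + tf.reduce_mean(dy ** 2)
            + tf.reduce_mean(dz ** 2))

def soft_dice(y_true, y_pred, eps=1e-6):
    """Soft Dice over binary segmentation maps of shape (batch, x, y, z)."""
    axes = [1, 2, 3]
    intersection = tf.reduce_sum(y_true * y_pred, axis=axes)
    union = tf.reduce_sum(y_true, axis=axes) + tf.reduce_sum(y_pred, axis=axes)
    return tf.reduce_mean((2.0 * intersection + eps) / (union + eps))

def weakly_supervised_loss(fixed, warped_moving, fixed_seg, warped_seg, ddf,
                           w_reg=1.0, w_label=1.0):
    """Image similarity (MSE here; LNCC could be substituted) + regulariser + label Dice."""
    image_loss = tf.reduce_mean((fixed - warped_moving) ** 2)
    reg_loss = displacement_gradient_l2(ddf)
    label_loss = 1.0 - soft_dice(fixed_seg, warped_seg)
    return image_loss + w_reg * reg_loss + w_label * label_loss
```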

Data and experiments:

  1. atlas-based registration, i.e. register each image to an atlas computed independently
  2. random inter-subject pairs
  3. with manual segmentation

Metrics: Dice on warped segmentation maps; Jacobian determinant of the predicted deformation
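A small sketch of the two evaluation metrics (assumed array shapes; not a reference implementation):

```python
import numpy as np

def dice(seg_a, seg_b, label):
    """Dice overlap for one anatomical label between two label maps."""
    a, b = seg_a == label, seg_b == label
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom > 0 else np.nan

def jacobian_determinant(ddf, spacing=(1.0, 1.0, 1.0)):
    """Determinant of the Jacobian of the deformation x + u(x).

    ddf: array of shape (x, y, z, 3); non-positive values indicate folding.
    """
    grads = np.stack(np.gradient(ddf, *spacing, axis=(0, 1, 2)), axis=-1)  # [..., i, j] = d(u_i)/d(x_j)
    jac = grads + np.eye(3)  # Jacobian of the full transform, shape (x, y, z, 3, 3)
    return np.linalg.det(jac)

# Example usage: fraction of folded voxels in a (hypothetical) predicted DDF.
# folded = (jacobian_determinant(ddf) <= 0).mean()
```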

fepegar commented 3 years ago

Relevant bits for preprocessing:

We also assume that f and m are affinely aligned as a preprocessing step, so that the only source of misalignment between the volumes is nonlinear.

We use a large-scale, multi-site, multistudy dataset of 3731 T1–weighted brain MRI scans from eight publicly available datasets: OASIS [69], ABIDE [70], ADHD200 [71], MCIC [72], PPMI [73], HABS [74], Harvard GSP [75], and the FreeSurfer Buckner40 [1]

That's a lot! I only have half, around 1800 images.

All scans were resampled to a 256×256×256 grid with 1mm isotropic voxels. We carry out standard preprocessing steps, including affine spatial normalization and brain extraction for each scan using FreeSurfer [1], and crop the resulting images to 160 × 192 × 224. All MRIs were anatomically segmented with FreeSurfer, and we applied quality control using visual inspection to catch gross errors in segmentation results and affine alignment. We include all anatomical structures that are at least 100 voxels in volume for all test subjects, resulting in 30 structures. We use the resulting segmentation maps in evaluating our registration as described below.

We should discuss this tomorrow. I have started installing FreeSurfer on the DGX.

mathpluscode commented 3 years ago

That's a lot! I only have half, around 1800 images.

Don't worry about the dataset size for the moment :) We can add other datasets later.

fepegar commented 3 years ago

FreeSurfer is installed in the cluster: /share/apps/cmic/freesurfer-7.1.0.

If we need to perform a lot of parcellations, that's probably the way to go. The ones I have are GIF parcellations, which were also computed on the cluster. GIF is fine, but less standard, making our experiments less replicable.

The FreeSurfer parcellation pipeline, recon-all, includes a talairach step, so images are affinely registered to MNI space (despite the name, Talairach space is slightly different). This would be quite close to (or the same as) the preprocessing performed in the paper. The parcellation can be used for evaluation and to perform skull-stripping, as in the paper.

Summarising: we should probably use the cluster to run the FreeSurfer pipeline on all images to 1) be reproducible and 2) follow the paper. That will give us:

  1. Registration to MNI space
  2. Skull-stripping
  3. Brain parcellation, used to evaluate our registrations

fepegar commented 3 years ago

Some more information from voxelmorph:


We encourage users to download and process their own data. See a list of medical imaging datasets here. Note that you likely do not need to perform all of the preprocessing steps, and indeed VoxelMorph has been used in other work with other data.


I'm not sure what those coordinates mean.

acasamitjana commented 3 years ago

I'm not sure what those coordinates mean.

I was looking into that as well. I think the coordinates may refer to the cropping points in the 256x256x256 volume from FreeSurfer (i.e. volume[48:-48, 31:-33, 3:-29]).
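If that guess is right, the arithmetic works out; a quick check with plain numpy and a stand-in array:

```python
import numpy as np

# Stand-in for a conformed 256**3 FreeSurfer volume; the indices are the ones guessed above.
volume = np.zeros((256, 256, 256), dtype=np.float32)
cropped = volume[48:-48, 31:-33, 3:-29]
print(cropped.shape)  # (160, 192, 224) = (256-48-48, 256-31-33, 256-3-29)
```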

fepegar commented 3 years ago

That would make sense. Also, I just checked an output of recon-all with TorchIO and it's already 256**3, which I didn't know:

$ tiohd t1_seg_freesurfer.mgz                                                                                   
ScalarImage(shape: (1, 256, 256, 256); spacing: (1.00, 1.00, 1.00); orientation: LIA+; memory: 64.0 MiB; dtype: torch.FloatTensor)
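The same check from Python, in case it is useful for scripting (torchio's ScalarImage; the path is the file above):

```python
import torchio as tio

image = tio.ScalarImage('t1_seg_freesurfer.mgz')
print(image.shape)    # (1, 256, 256, 256)
print(image.spacing)  # (1.0, 1.0, 1.0)
```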

acasamitjana commented 3 years ago

FreeSurfer is installed in the cluster: /share/apps/cmic/freesurfer-7.1.0.

Thanks! I agree that FreeSurfer is a better (more standardized) option than standalone algorithms for the steps you already mentioned.

1. Registration to MNI space

2. Skull-stripping

3. Brain parcellation, used to evaluate our registrations

Looking at the VoxelMorph papers, they only report metrics on the 'aseg' labels (step 11 from recon-all). So, in principle, we may want to run only autorecon1 and autorecon2 if that helps with speed (I know it can sometimes be quite slow).
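Something like this per subject, if we do go down that route (a hedged sketch; subject IDs and paths are placeholders, and SUBJECTS_DIR is assumed to be set):

```python
import subprocess

def run_partial_recon(subject_id, t1_path):
    """Run only the first two recon-all stages: -autorecon1 (includes talairach
    registration and skull stripping) and -autorecon2 (includes the aseg labels)."""
    subprocess.run(
        ["recon-all", "-subjid", subject_id, "-i", t1_path,
         "-autorecon1", "-autorecon2"],
        check=True,
    )
```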

mathpluscode commented 3 years ago

Regarding the implementation, according to the paper:

we should expect a Dice score of around 0.766 for the MSE loss and 0.774 for the LNCC loss, so we can start with the LNCC loss

Launching experiments should be very simple with the CLI tool, but we first need to enable changing the encoder/decoder channels, as for now we assume the channels are doubled at each layer.
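To make the point concrete, the difference between the two conventions would be something like this (an illustration only, not DeepReg's config schema):

```python
def doubled_channels(initial, num_levels):
    """Current assumption: channels double at each encoder level."""
    return [initial * 2 ** i for i in range(num_levels)]

print(doubled_channels(16, 4))        # [16, 32, 64, 128]
explicit_channels = [16, 32, 32, 32]  # what we would need to be able to specify instead
```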

mathpluscode commented 3 years ago

As we are currently encountering some loss explosion, we should first run VoxelMorph (VM) on our data.