sct-pipeline / csa-atrophy

Evaluate the sensitivity of atrophy detection with SCT
https://csa-atrophy.readthedocs.io/
MIT License

Implement torchIO for transformations #107

Open PaulBautin opened 3 years ago

PaulBautin commented 3 years ago

Until now, we have used the affine_transform script to apply random rigid transformations that mimic subject repositioning for each scaling. This method works well, but it is not very realistic: it may oversimplify image acquisition and therefore produce non-realistic intra-subject variability.
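For reference, a random rigid transformation of the kind described above could be sketched as below. This is not the code of the affine_transform script: the function name `random_rigid_matrix` and the angle and shift ranges are illustrative assumptions.

```python
import numpy as np


def random_rigid_matrix(max_angle_deg=5.0, max_shift_mm=2.0, rng=None):
    """Return a 4x4 homogeneous rigid transform (rotation + translation, no scaling)."""
    rng = np.random.default_rng() if rng is None else rng
    # One random rotation angle per axis, converted to radians
    ax, ay, az = np.deg2rad(rng.uniform(-max_angle_deg, max_angle_deg, size=3))
    rot_x = np.array([[1, 0, 0],
                      [0, np.cos(ax), -np.sin(ax)],
                      [0, np.sin(ax), np.cos(ax)]])
    rot_y = np.array([[np.cos(ay), 0, np.sin(ay)],
                      [0, 1, 0],
                      [-np.sin(ay), 0, np.cos(ay)]])
    rot_z = np.array([[np.cos(az), -np.sin(az), 0],
                      [np.sin(az), np.cos(az), 0],
                      [0, 0, 1]])
    matrix = np.eye(4)
    matrix[:3, :3] = rot_z @ rot_y @ rot_x
    # Random translation along each axis (hypothetical range, in mm)
    matrix[:3, 3] = rng.uniform(-max_shift_mm, max_shift_mm, size=3)
    return matrix


rigid = random_rigid_matrix(rng=np.random.default_rng(0))
```

Because the rotation block is orthonormal with determinant 1, such a transform only repositions the subject and leaves distances (and therefore the cross-sectional area) unchanged, which is why it may underestimate real scan-rescan variability.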

An improvement would be to augment the dataset with random noise, random elastic transformations, and several acquisition artifacts (all available in TorchIO).

@jcohenadad, what do you think? An alternative would be to use the data augmentation tools that were implemented in ivadomed to train models (https://github.com/ivadomed/ivadomed/blob/master/ivadomed/transforms.py).

A remaining issue is how to make the transformations biologically realistic (e.g., the choice of max_displacement for random elastic deformations, spinal cord lesions...).

jcohenadad commented 3 years ago

I would suggest digging a little into the types of artifacts available in TorchIO, and then deciding.

Adding synthetic MS lesions is a good idea, but it is unclear what the best way to do it is. Maybe with GANs, as @naga-karthik mentioned.

PaulBautin commented 3 years ago

Cool idea! If I understand correctly (to simulate lesions in the SC):

Pre-processing:

  1. We need two databases of images (one with MS and the other without)
  2. Normalize images from both datasets by registering images to a template (PAM50)?
  3. Segment lesions on both databases

GAN (Generative Adversarial Network):

  1. Input the baseline segmentation (segmentation without MS lesions) into G (generator NN)
  2. The output of G is a simulation of the segmentation at a later time point (starts with random noise)
  3. Input the output of G into D (discriminator NN)
  4. The output of D is a value indicating the probability that the segmentation was generated by G rather than taken from the baseline with MS

Note: both networks are trained simultaneously, similar to a two-player minimax game. The game ends when most outputs of G are classified by D as real (from baseline with MS).
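For reference, the minimax game in the original GAN formulation is usually written as the value function below. In this standard notation, $D(x)$ is the probability that $x$ is real (the opposite convention from step 4 above), $p_{\text{data}}$ is the distribution of real samples, and $z$ is the random noise fed to G.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
```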

Articles using GAN to mimic atrophy:

@jcohenadad, I could present a plan to implement something similar in a meeting?

jcohenadad commented 3 years ago

> @jcohenadad, I could present a plan to implement something similar in a meeting?

Sure! At our next ivadomed meeting, for example.

naga-karthik commented 3 years ago

@PaulBautin I have a few suggestions on how to synthesize lesions with GANs. There are many types of GANs (here's a list, note that it is not up-to-date). I am sharing that list because the GAN you described is the most basic one, so there may be a more advanced or better way of synthesizing images.

If I understand the original GAN correctly, I don't think G sees any input other than random noise (I'm not sure what you mean by point 4). The baseline segmentation only enters the picture with D, because it is compared with G's output within the adversarial loss function (which you correctly mentioned). About the minimax game: AFAIK it is only in theory that the game "ends" (see Nash equilibrium). In practice, the loss function typically does not converge because, if you think about it, G is minimizing the loss function while D is trying to maximize it, so the stationary point is a saddle point and not a local/global minimum. As a result, GAN training is unstable and quite subjective, and "when to stop training" really depends on the quality of the generated images. If you are interested in this, there's this famous paper which lays out the standard "guidelines" for training GANs to get good results.
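To illustrate the saddle-point remark, here is a toy sketch (not a GAN): simultaneous gradient descent on `g` and ascent on `d` for the bilinear objective V(g, d) = g * d, whose only stationary point is the saddle at (0, 0). The objective, learning rate, and starting point are assumptions chosen for simplicity.

```python
import math


def simultaneous_gda(g, d, lr=0.1, steps=100):
    """Simultaneous gradient descent (on g) and ascent (on d) for V(g, d) = g * d.

    Returns the distance to the saddle point (0, 0) after each step.
    """
    norms = [math.hypot(g, d)]
    for _ in range(steps):
        grad_g, grad_d = d, g  # dV/dg = d and dV/dd = g
        # g moves to decrease V, d moves to increase V, both updated at once
        g, d = g - lr * grad_g, d + lr * grad_d
        norms.append(math.hypot(g, d))
    return norms


norms = simultaneous_gda(g=1.0, d=1.0)
```

Each update multiplies the distance to the saddle point by sqrt(1 + lr^2), so the iterates spiral outward instead of settling, which is one simple picture of why adversarial training may fail to converge.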

Please let me know if something isn't clear. I'd be happy to discuss. Also, I realize that I was being too picky with my response, please do not mind (I did not intend that). I was just hoping for us to be on the same page because GANs can get a bit tricky to understand.

PaulBautin commented 3 years ago

Hello @naga-karthik and @jcohenadad, I have been testing different artifacts within TorchIO to implement transformations that simulate more "realistic" intra-subject (scan-rescan) variability. I now have to choose the artifacts and transformations to keep.

Two protocols that may allow for more "realistic" outcomes are:

  1. Compute an intra-subject variability for each artifact -- Pro: this would provide insight into which simulated artifact affects the segmentation the most. Con: a real image probably combines several artifacts/transformations at once.
  2. Compute and implement an artifact occurrence rate and a transformation max displacement. For example: could the difference between warping fields (from scan-rescan experiments), computed during registration to the template, provide insight into the max displacement to use with the RandomElasticDeformation transform? (https://torchio.readthedocs.io/transforms/augmentation.html)
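As a rough sketch of the second protocol, the disagreement between two warping fields could be summarized as below. The function name `displacement_stats`, the array layout (X, Y, Z, 3) in mm, and the 95th percentile are all assumptions, and the fields here are random placeholders rather than real registration outputs.

```python
import numpy as np


def displacement_stats(warp_scan, warp_rescan, percentile=95.0):
    """Summarize the voxel-wise disagreement between two displacement fields.

    Both inputs are arrays of shape (X, Y, Z, 3) holding displacements in mm,
    assumed to be defined on the same template grid.
    """
    diff = np.asarray(warp_scan) - np.asarray(warp_rescan)
    magnitude = np.linalg.norm(diff, axis=-1)  # Euclidean norm per voxel, in mm
    return {
        "mean": float(magnitude.mean()),
        "percentile": float(np.percentile(magnitude, percentile)),
        "max": float(magnitude.max()),
    }


# Placeholder fields (random numbers, NOT real data), only to exercise the function
rng = np.random.default_rng(0)
warp_a = rng.normal(0.0, 1.0, size=(8, 8, 8, 3))
warp_b = warp_a + rng.normal(0.0, 0.5, size=(8, 8, 8, 3))
stats = displacement_stats(warp_a, warp_b)
```

A high percentile of this magnitude might serve as a starting point for max_displacement. The mapping is not one-to-one, though, since in TorchIO that parameter bounds the displacement of coarse-grid control points rather than of individual voxels.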

Have statistics on artifact occurrence rates been published? Do you have suggestions for obtaining more realistic data augmentation?