StanfordMIMI / DDM2

[ICLR2023] Official repository of DDM2: Self-Supervised Diffusion MRI Denoising with Generative Diffusion Models

sampling process #25

Open dsm6666 opened 7 months ago

dsm6666 commented 7 months ago

Hi, I looked closely at the posted code and noticed that the sampling process differs from the original DDPM, and that the training labels are different as well: in DDPM the label is the noise, whereas here it is the original noisy image. Could you tell me whether the iterative formula for this sampling process is described in any article?

tiangexiang commented 7 months ago

Hi, thank you for your interest in our work! Implementation-wise, this code repository is built upon SR3; please see their paper or code for reference. Thanks!
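
For future readers with the same question, below is a minimal sketch of the two training targets being contrasted here, assuming a standard DDPM-style forward process. `model`, `alphas_cumprod`, and both function names are hypothetical placeholders, not code from this repository; for the exact sampling iteration, the SR3 paper/code referenced above is the authoritative source.

```python
import torch
import torch.nn.functional as F

def ddpm_loss(model, x0, t, alphas_cumprod):
    """Standard DDPM objective: the network is trained to predict the added noise."""
    noise = torch.randn_like(x0)
    a_bar = alphas_cumprod[t].view(-1, 1, 1, 1)          # alpha_bar_t per sample
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise
    return F.mse_loss(model(x_t, t), noise)               # target = noise

def image_target_loss(model, y_noisy, t, alphas_cumprod):
    """Sketch of the objective discussed above: the same forward diffusion is applied,
    but the supervision target is the available (noisy) image rather than the noise."""
    noise = torch.randn_like(y_noisy)
    a_bar = alphas_cumprod[t].view(-1, 1, 1, 1)
    x_t = a_bar.sqrt() * y_noisy + (1.0 - a_bar).sqrt() * noise
    return F.mse_loss(model(x_t, t), y_noisy)              # target = noisy image
```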

dsm6666 commented 6 months ago

Thanks for your answer, it was very helpful! I have a few more questions. The dataset you used is a 4D volume [H x W x D x T], where T indicates the number of different observations of the same 3D volume. I'm confused about what Figure 5 in the paper shows for n=1. What does n denote? Is it not the same thing as the T in [H x W x D x T]?

tiangexiang commented 6 months ago

The n shown in Figure 5 is the number of slices (at the same index but from different volumes) used as inputs to the network in one forward pass. So n must be less than or equal to T; in our implementation we used n=2.
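
In case it helps, here is a small sketch of how such an n-slice input could be gathered from a [H x W x D x T] volume; the helper name and the random choice of observations are illustrative assumptions, not taken from the repository.

```python
import numpy as np

def sample_slice_observations(volume_4d, d, n=2, rng=None):
    """Return an [n, H, W] stack: the slice at depth index `d`
    taken from n different observations along the T axis."""
    rng = rng or np.random.default_rng()
    H, W, D, T = volume_4d.shape
    assert n <= T, "n must be less than or equal to T"
    t_indices = rng.choice(T, size=n, replace=False)     # n distinct observations
    return np.stack([volume_4d[:, :, d, t] for t in t_indices], axis=0)

# Example: a toy 4D volume with T = 6 observations, n = 2 as in the paper.
vol = np.random.rand(128, 128, 60, 6).astype(np.float32)
net_input = sample_slice_observations(vol, d=30, n=2)    # shape (2, 128, 128)
```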

dsm6666 commented 6 months ago

Thanks for the quick reply, it was very helpful, but I have a couple more questions.

  1. Is it safe to assume that the better the denoising result in the first stage (i.e., the better the result for larger n), the better the final result in the third stage?
  2. In the second stage, does each slice correspond to only one intermediate state of the diffusion process, since the first stage produces only one denoising result and corresponding noise estimate per slice?
  3. If the method is used for CT denoising, where the data is only three-dimensional [H x W x D] rather than four-dimensional [H x W x D x T], and an unsupervised method such as DIP (Deep Image Prior) is used in the first stage, will the result be worse? Each slice would then correspond to only one intermediate state of the diffusion process, which seems like insufficient data, whereas according to the article each slice has T observations and therefore corresponds to T intermediate states.

tiangexiang commented 6 months ago

  1. Yes. A better denoising result (a better noise model) definitely helps in Stage 2, so the states can be matched more precisely. However, I would say Stage 3 is the most critical component in terms of final denoising quality. In this implementation we only used the most basic DDPM, which can be replaced with more up-to-date diffusion models to improve denoising quality.
  2. Yes. We use the pixel-wise noise at all pixels of each single slice to build slice-wise noise models (each slice has a different std in its Gaussian noise model); a rough sketch of this idea follows after this list.
  3. That is a good question. The biggest challenge of training on CT data is that the J-invariance optimization (Eq. 2) may not be suitable anymore, since there are not two representations of the same data (either pixel- or slice-wise). If you can figure out an alternative optimization target for Stage 3, I think it will have great potential and be able to denoise CT data.
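
For what it's worth, here is a rough, hypothetical sketch of the slice-wise noise model / state-matching idea from points 1 and 2 above; it is not the repository's actual Stage 2 code. It assumes slice intensities are normalized so that the estimated residual std is directly comparable to the schedule's marginal noise level sqrt(1 - alpha_bar_t), and all names are illustrative.

```python
import numpy as np

def make_schedule(T=1000, beta_start=1e-4, beta_end=2e-2):
    """Linear beta schedule; returns alpha_bar_t for t = 1..T."""
    betas = np.linspace(beta_start, beta_end, T)
    return np.cumprod(1.0 - betas)

def match_state(noisy_slice, denoised_slice, alphas_cumprod):
    """Pick the diffusion timestep whose noise level best explains this slice."""
    residual = noisy_slice - denoised_slice               # Stage 1 residual
    sigma_hat = residual.std()                            # per-slice noise std
    noise_levels = np.sqrt(1.0 - alphas_cumprod)          # sqrt(1 - alpha_bar_t)
    return int(np.argmin(np.abs(noise_levels - sigma_hat)))

alphas_cumprod = make_schedule()
# noisy, denoised = ...  (one slice and its Stage 1 estimate, normalized to [0, 1])
# t_matched = match_state(noisy, denoised, alphas_cumprod)
```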