xmed-lab / FSDiffReg

MICCAI 2023: FSDiffReg: Feature-wise and Score-wise Diffusion-guided Unsupervised Deformable Image Registration for Cardiac Images

Question about the loss of Ldiffuse #5

Open zsw1456419654 opened 10 months ago

zsw1456419654 commented 10 months ago

Thank you for your excellent work on deformable image registration! While reading your paper, I was confused about the meaning of Ldiffuse. As I understand it, Ldiffuse trains the network to predict the noise (e) added to the image. If so, the diffusion score S should be close to the noise (e) rather than a map that indicates the hard-to-register areas. I would appreciate it if you could answer my question!

Eason-Qin commented 3 weeks ago

Thank you for your interest in our work.

The output of a denoising diffusion model, known as the 'score function', is actually the gradient of the log probability density with respect to the data. This definition is introduced in the paper 'Score-Based Generative Modeling through Stochastic Differential Equations' by Yang Song et al., ICLR 2021. In our case, I believe that the output of our diffusion decoder is the gradient of the log probability density with respect to the joint input {f, m, x}, where x is the perturbed image. During training, the objective is the re-weighted variant of the ELBO. I refer you to Eq. 3 and Eq. 4 and their explanations (page 3) of this paper for details. Based on this evidence, the output of our diffusion decoder is probably not simply fitting the linearly scheduled noise.
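
For readers following this thread, here is a minimal sketch of how a score and an added noise relate under the standard DDPM forward process, x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * e. It is not taken from the FSDiffReg code: the function name `conditional_score`, the value of `alpha_bar`, and the array shapes are all hypothetical, and it does not model the conditioning on f and m. It only checks numerically that the gradient of log q(x_t | x_0) with respect to x_t equals -e / sqrt(1 - alpha_bar).

```python
import numpy as np


def conditional_score(x_t, x_0, alpha_bar):
    """Gradient of log q(x_t | x_0) with respect to x_t, where
    q(x_t | x_0) = N(sqrt(alpha_bar) * x_0, (1 - alpha_bar) * I)."""
    return -(x_t - np.sqrt(alpha_bar) * x_0) / (1.0 - alpha_bar)


rng = np.random.default_rng(0)
x_0 = rng.normal(size=(4, 4))     # stand-in for a clean image patch (hypothetical)
eps = rng.normal(size=x_0.shape)  # Gaussian noise added by the forward process
alpha_bar = 0.7                   # hypothetical cumulative noise-schedule value

# Forward (noising) step of a standard DDPM.
x_t = np.sqrt(alpha_bar) * x_0 + np.sqrt(1.0 - alpha_bar) * eps

score = conditional_score(x_t, x_0, alpha_bar)
score_from_eps = -eps / np.sqrt(1.0 - alpha_bar)

# Reports whether the two expressions agree up to floating-point error.
print(np.allclose(score, score_from_eps))
```

In this unconditional toy setting the score and the noise differ only by a per-timestep scale factor. Whether the same reading applies to the conditional output S in FSDiffReg is not something this sketch can confirm.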

Reference: [Score-Based Generative Modeling through Stochastic Differential Equations | OpenReview](https://openreview.net/forum?id=PxTIG12RRHS)