Question about the loss of Ldiffuse

Thank you for your interest in our work.

The output of a denoising diffusion model, known as the 'score function', is the gradient of the log probability density with respect to the data. This definition is introduced in 'Score-Based Generative Modeling through Stochastic Differential Equations' by Yang Song et al. (ICLR 2021). In our case, I believe the output of our diffusion decoder is the gradient of the log probability density with respect to the joint input {f, m, x}, where x is the perturbed image. During training, the objective is the re-weighted variant of the ELBO; please see Eq. 3 and Eq. 4 and their explanations (page 3) in this paper for details. Given this, the output of our diffusion decoder is probably not simply fitting the linearly scheduled noise.
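To make the definition of the score function concrete, here is a minimal toy sketch in NumPy. It is not taken from this repository: the function names, the isotropic Gaussian perturbation kernel, and the values of `sigma` and the vector size are all assumptions chosen for illustration. It checks numerically that the score of a Gaussian kernel is the gradient of its log density with respect to the perturbed sample.

```python
import numpy as np


def log_gaussian_density(x, mean, sigma):
    """Log density of an isotropic Gaussian N(mean, sigma^2 I) evaluated at x."""
    d = x.size
    quad = -0.5 * np.sum((x - mean) ** 2) / sigma**2
    norm = -0.5 * d * np.log(2.0 * np.pi * sigma**2)
    return quad + norm


def analytic_score(x, mean, sigma):
    """Closed-form gradient of the log density with respect to x."""
    return -(x - mean) / sigma**2


def numerical_score(x, mean, sigma, h=1e-5):
    """Central finite-difference estimate of the same gradient."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        upper = log_gaussian_density(x + step, mean, sigma)
        lower = log_gaussian_density(x - step, mean, sigma)
        grad[i] = (upper - lower) / (2.0 * h)
    return grad


# Hypothetical toy data: a clean sample x0 perturbed by Gaussian noise.
rng = np.random.default_rng(0)
x0 = rng.normal(size=4)
sigma = 0.5
eps = rng.normal(size=4)
x_t = x0 + sigma * eps

score = analytic_score(x_t, x0, sigma)

# The score is the gradient of the log density with respect to the data.
assert np.allclose(score, numerical_score(x_t, x0, sigma), atol=1e-5)
# For this Gaussian kernel it equals the negative noise rescaled by 1 / sigma.
assert np.allclose(score, -eps / sigma)
```

In this toy setting the score works out to the injected noise with a sign flip and a 1/sigma rescaling. Whether the same identification carries over to our diffusion decoder, which takes the joint input {f, m, x}, is not something this sketch establishes.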

Reference: [Score-Based Generative Modeling through Stochastic Differential Equations | OpenReview](https://openreview.net/forum?id=PxTIG12RRHS)
