Open man-sean opened 1 year ago
For (2), I think the authors apply the normalization factor before taking the gradient. If you look at `ConditioningMethod.grad_and_value` (here), they take the gradient of the norm, not the norm squared.
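A quick numerical sketch of why differentiating the norm already contains the normalization. This is a hedged illustration with a toy linear operator `A` and NumPy, not the repo's actual code: the paper's step size $\zeta_i = \zeta / \|y - \mathcal{A}(\hat{x}_0)\|$ is absorbed when you differentiate $\|y - \mathcal{A}(\hat{x}_0)\|$ instead of its square.

```python
import numpy as np

# Toy linear measurement operator A, data y, and point x (assumed placeholders,
# not the repo's operator or variables).
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))
y = rng.normal(size=5)
x = rng.normal(size=3)

r = A @ x - y
norm = np.linalg.norm(r)

grad_norm = A.T @ r / norm     # gradient of ||y - A x|| w.r.t. x
grad_norm_sq = 2 * A.T @ r     # gradient of ||y - A x||^2 w.r.t. x

# Differentiating the norm (not the norm squared) divides by ||r||, i.e. it
# bakes in the paper's normalization zeta_i = zeta / ||y - A(x0_hat)||:
assert np.allclose(grad_norm, grad_norm_sq / (2 * norm))

# Sanity check of the analytic gradient against central finite differences:
h = 1e-6
fd = np.array([
    (np.linalg.norm(A @ (x + h * e) - y) - np.linalg.norm(A @ (x - h * e) - y)) / (2 * h)
    for e in np.eye(3)
])
assert np.allclose(fd, grad_norm, atol=1e-5)
```

So scaling the norm's gradient by a constant `scale` is equivalent to scaling the squared-norm's gradient by `scale / (2 ||r||)`, which matches the paper's normalized step size up to a constant factor.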
I believe there's another difference between Alg. 1 of the paper and the code. In `EpsilonXMeanProcessor.predict_xstart` (here), the coefficient applied to the score-model output is different from the coefficient in line 4 of Alg. 1. In the paper, the coefficient is $(1-\bar{\alpha}_i)/\sqrt{\bar{\alpha}_i}$, but in the code, it is $-\sqrt{1/\bar{\alpha}_i-1}$.
@berthyf96, for your second point regarding `EpsilonXMeanProcessor.predict_xstart`, I also did not understand the difference until I realized that the score function $\widehat{s}(x_t)$ associated with a noise predictor $\epsilon_\theta(x_t)$ is: $$\widehat{s}(x_t) = \nabla_{x_t} \log p_\theta(x_t) = - \frac{1}{\sqrt{1-\bar{\alpha}_t}} \epsilon_\theta(x_t)$$ See Equation (11) here. Injecting this result into the expression of $\widehat{x}_0$ of Alg. 1 gives the implemented result.
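The substitution can be checked numerically. In this small sketch, `alpha_bar` and `eps` are arbitrary placeholder values (not taken from the repo); it verifies that the paper's coefficient applied to the score equals the code's coefficient applied to the noise prediction:

```python
import numpy as np

rng = np.random.default_rng(1)
alpha_bar = 0.37                      # placeholder cumulative product alpha_bar_t in (0, 1)
eps = rng.normal(size=4)              # placeholder noise-predictor output epsilon_theta(x_t)

# Score associated with a noise predictor (the relation quoted above):
score = -eps / np.sqrt(1.0 - alpha_bar)

# Paper (Alg. 1, line 4): coefficient (1 - alpha_bar) / sqrt(alpha_bar) times the score.
paper_term = (1.0 - alpha_bar) / np.sqrt(alpha_bar) * score

# Code (predict_xstart): coefficient -sqrt(1/alpha_bar - 1) times epsilon.
code_term = -np.sqrt(1.0 / alpha_bar - 1.0) * eps

# The two expressions agree term by term:
assert np.allclose(paper_term, code_term)
```

Algebraically: $(1-\bar{\alpha})/\sqrt{\bar{\alpha}} \cdot \left(-\epsilon/\sqrt{1-\bar{\alpha}}\right) = -\sqrt{(1-\bar{\alpha})/\bar{\alpha}}\,\epsilon = -\sqrt{1/\bar{\alpha}-1}\,\epsilon$.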
@claroche-r thanks so much for clarifying that!
thank you!
Hi,
There are a few differences between the paper and this repository, and it would be wonderful if you could clarify the reasons behind them:

1. `sigma_y=0.05` in the paper, and indeed in the config files `config['noise']['sigma']=0.05`. But while the images are stretched from [0,1] to [-1,1], the sigma is unchanged – meaning that in practice the noise added has std `sigma/2`, i.e. `y_n` is cleaner than the settings reported in the paper. This can easily be checked by computing `torch.std(y-y_n)` after the creation of `y` and `y_n` in `sample_condition.py`.
2. `config['conditioning']['params']['scale']` is used in `PosteriorSampling.conditioning()` to scale the gradient, but the gradient is never normalized in the first place (in `PosteriorSampling.grad_and_value()`, for example). Adding the gradient normalization seems to break the method.
3. `configs/super_resolution_config.yaml` uses 0.3.

Thank you for your time and effort!
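Regarding the first point above, the halving of the effective noise level can be reproduced with a small standalone NumPy sketch (an assumed toy check, not the repo's `sample_condition.py`): adding noise of std `sigma` in the [-1,1] range corresponds to std `sigma/2` once measured back in the [0,1] range.

```python
import numpy as np

rng = np.random.default_rng(2)
sigma = 0.05

x01 = rng.uniform(size=100_000)                    # toy "image" values in [0, 1]
x11 = 2 * x01 - 1                                  # stretched to [-1, 1]
y_n = x11 + sigma * rng.normal(size=x11.shape)     # noise added with unchanged sigma

# Map the noisy measurement back to the [0, 1] range:
back = (y_n + 1) / 2

# The effective noise std relative to the [0, 1] image is sigma / 2:
eff_sigma = np.std(back - x01)
assert abs(eff_sigma - sigma / 2) < 0.002
```

In other words, if the paper's `sigma_y` is meant relative to images in [0,1], the noise injected in the [-1,1] pipeline would need std `2 * sigma` to match.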