sony / ctm


Potential typo in the paper #1

Closed pkhungurn closed 8 months ago

pkhungurn commented 11 months ago

This comment is based on the arXiv PDF (https://arxiv.org/pdf/2310.02279.pdf) that I downloaded on 2023/10/30.

The 3rd page says "Here, $E_{p_{t0}(\mathbf{x}|\mathbf{x}_t)}[\mathbf{x}|\mathbf{x}_t]$ is the denoiser function, an alternative expression for the score function $\nabla \log p_t(\mathbf{x}_t)$." The authors cite Bradley Efron's paper on Tweedie's formula as justification.

[image: ctm-typo]

I think this is wrong. If $E_{p_{t0}(\mathbf{x}|\mathbf{x}_t)}[\mathbf{x}|\mathbf{x}_t] = \nabla \log p_t(\mathbf{x}_t)$, as the sentence above suggests, we would have

$$\frac{\mathrm{d}\mathbf{x}_t}{\mathrm{d}t} = \frac{\mathbf{x}_t - E_{p_{t0}(\mathbf{x}|\mathbf{x}_t)}[\mathbf{x}|\mathbf{x}_t] }{t} = \frac{\mathbf{x}_t - \nabla \log p_t(\mathbf{x}_t) }{t} \neq - t \nabla \log p_t(\mathbf{x}_t).$$

Instead, it should have been

$$E_{p_{t0}(\mathbf{x}|\mathbf{x}_t)}[\mathbf{x}|\mathbf{x}_t] = \mathbf{x}_t + t^2 \nabla \log p_t(\mathbf{x}_t),$$

which makes the equation above hold, since $(\mathbf{x}_t - \mathbf{x}_t - t^2 \nabla \log p_t(\mathbf{x}_t))/t = -t \nabla \log p_t(\mathbf{x}_t)$.

Indeed, this statement also agrees with Tweedie's formula, which states that

If $\mathbf{x}$ and $\mathbf{y}$ are random variables such that $\mathbf{y} = \mathbf{x} + \boldsymbol{\xi}$, where $\boldsymbol{\xi} \sim \mathcal{N}(\mathbf{0},\sigma^2 I)$ is independent of $\mathbf{x}$ and $p$ denotes the density of $\mathbf{y}$, then $$E[\mathbf{x}|\mathbf{y}] = \mathbf{y} + \sigma^2 \nabla \log p(\mathbf{y}).$$

Because $\mathrm{d} \mathbf{x}_t = \sqrt{2t}\, \mathrm{d} \mathbf{w}$, the accumulated noise variance is $\int_0^t 2s\, \mathrm{d}s = t^2$, so $p(\mathbf{x}_t | \mathbf{x}_0) = \mathcal{N}(\mathbf{x}_0, t^2 I)$. In other words, $\mathbf{x}_t = \mathbf{x}_0 + \boldsymbol{\xi}$ where $\boldsymbol{\xi} \sim \mathcal{N}(\mathbf{0}, t^2 I)$. Tweedie's formula thus gives

$$E_{p_{t0}(\mathbf{x}|\mathbf{x}_t)}[\mathbf{x}|\mathbf{x}_t] = E[\mathbf{x}_0|\mathbf{x}_t] = \mathbf{x}_t + t^2 \nabla \log p_t(\mathbf{x}_t).$$
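
As a sanity check, the relation between the posterior mean and the score can be tested on a toy case where everything is available in closed form. The sketch below is not from the paper or the repository: it assumes a hypothetical 1-D Gaussian prior $\mathbf{x}_0 \sim \mathcal{N}(\mu, s^2)$ (the constants `MU` and `S` and all function names are made up for illustration) and additive Gaussian noise of variance `noise_var`. It compares the exact posterior mean with the value predicted by Tweedie's formula, then checks the ODE identity under the further assumption that the noise standard deviation at time $t$ equals $t$, i.e. `noise_var = t**2`.

```python
import math

# Hypothetical toy prior (an assumption for illustration): x_0 ~ N(MU, S**2).
MU, S = 1.5, 0.7


def score(y, noise_var):
    """Score of the noisy marginal p(y) = N(MU, S**2 + noise_var)."""
    return -(y - MU) / (S**2 + noise_var)


def posterior_mean(y, noise_var):
    """Exact E[x_0 | y] for a Gaussian prior with Gaussian noise."""
    return MU + S**2 / (S**2 + noise_var) * (y - MU)


def tweedie_denoiser(y, noise_var):
    """Posterior mean as predicted by Tweedie's formula: y + noise_var * score."""
    return y + noise_var * score(y, noise_var)


ys = [-3.0, -1.0, 0.0, 1.0, 3.0]
t = 2.0
noise_var = t**2  # assumption: noise standard deviation at time t equals t

# Tweedie's formula reproduces the exact posterior mean.
matches_posterior = all(
    math.isclose(tweedie_denoiser(y, noise_var), posterior_mean(y, noise_var))
    for y in ys
)

# With noise_var = t**2, (x_t - E[x_0 | x_t]) / t equals -t * score.
matches_ode = all(
    math.isclose((y - posterior_mean(y, noise_var)) / t, -t * score(y, noise_var))
    for y in ys
)

print(matches_posterior, matches_ode)
```

Both checks pass in this toy setting. Replacing `noise_var = t**2` with `noise_var = t` keeps the first check true but breaks the second, which is one way to see which power of $t$ the ODE requires.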
ChiehHsinJesseLai commented 10 months ago

Hi @pkhungurn,

Thank you for your careful reading of our paper. Please note that we did not say the denoiser "is" the score function; we said it is an alternative expression for it.

To avoid confusion, we will make the statement clearer by saying:

[image]

JiePKU commented 4 months ago

Hi @ChiehHsinJesseLai, it seems that this equation [image] does not hold when we substitute it into the second equation. [image] Is there anything wrong, or is there something I am missing?