How are the T rounds of denoising realized in code?

RipeMangoBox commented 5 months ago

Thank you for your valuable contribution! I have a question regarding the implementation. From my understanding, DDPM involves denoising x_t to obtain x_0 over t steps. However, I couldn't locate the corresponding loop code within the project. Is it possible that this part has been simplified to perform one-step denoising, resembling the approach used in a Variational Autoencoder (VAE)?

def forward(self, x_0):
    """
    Algorithm 1.
    """
    t = torch.randint(self.T, size=(x_0.shape[0], ), device=x_0.device)
    noise = torch.randn_like(x_0)  # As T -> ∞, z_t -> N(0, I), so take a standard Gaussian as z_t and obtain x_t in one step
    x_t = (
        extract(self.sqrt_alphas_bar, t, x_0.shape) * x_0 +
        extract(self.sqrt_one_minus_alphas_bar, t, x_0.shape) * noise)
    loss = F.mse_loss(self.model(x_t, t), noise, reduction='none')
    return loss

In the forward function of the GaussianDiffusionTrainer class in Diffusion.py, a batch of random integers t is generated, e.g. [5, 2, 0, 1, 5, 8, 15, ...], so each sample gets its own timestep. These values are passed to the UNet together with x_t for denoising. However, I couldn't find a loop over the t steps in either Diffusion.py or Train.py.

Can anyone help me?

zoubohao commented 4 months ago

The code you quoted is the training procedure. The inference procedure is in the "forward" function of the "GaussianDiffusionSampler" class in Diffusion.py.
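
For readers looking for the loop itself, here is a minimal sketch of a DDPM-style sampling loop (Algorithm 2 in the DDPM paper). It is not the repository's GaussianDiffusionSampler code: the names make_schedule, sample, and eps_model are hypothetical, the linear beta schedule values are assumed, and the variance is simply set to beta_t, which is one common choice among several.

```python
import torch


def make_schedule(beta_1=1e-4, beta_T=0.02, T=1000):
    # Assumed linear beta schedule; the values are illustrative defaults.
    betas = torch.linspace(beta_1, beta_T, T)
    alphas = 1.0 - betas
    alphas_bar = torch.cumprod(alphas, dim=0)
    return betas, alphas, alphas_bar


@torch.no_grad()
def sample(eps_model, shape, T=1000):
    """Run the reverse process: start from pure noise and denoise T times.

    eps_model is a hypothetical callable (x_t, t) -> predicted noise,
    standing in for the trained UNet.
    """
    betas, alphas, alphas_bar = make_schedule(T=T)
    x_t = torch.randn(shape)  # x_T ~ N(0, I)
    # This is the T-step loop that does not appear in the training code.
    for time_step in reversed(range(T)):
        t = torch.full((shape[0],), time_step, dtype=torch.long)
        eps = eps_model(x_t, t)
        # Mean of p(x_{t-1} | x_t) computed from the predicted noise.
        coef = betas[time_step] / torch.sqrt(1.0 - alphas_bar[time_step])
        mean = (x_t - coef * eps) / torch.sqrt(alphas[time_step])
        # No noise is added on the final step.
        if time_step > 0:
            noise = torch.randn_like(x_t)
        else:
            noise = torch.zeros_like(x_t)
        x_t = mean + torch.sqrt(betas[time_step]) * noise
    return x_t
```

Calling sample(model, (batch, channels, height, width)) invokes the model once per timestep, from T-1 down to 0, which is where the T rounds of denoising happen.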

chenchen278 commented 4 months ago

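Training does not iterate over every timestep t. Instead, it samples timesteps at random and trains the denoising only for the sampled t -> t-1 steps. Only at sampling time does the model denoise by iterating through the timesteps in order.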
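
Training can get away with a randomly sampled t because the forward (noising) process has a closed form: x_t = sqrt(alphas_bar[t]) * x_0 + sqrt(1 - alphas_bar[t]) * noise, which is what the quoted forward function computes in one step. The sketch below checks numerically that applying the single-step noising t times yields the same coefficients as the closed form. The schedule values and the helper names iterate_forward_coeffs and closed_form_coeffs are assumptions for illustration, not code from the repository.

```python
import torch

T = 1000
# Assumed linear beta schedule, in float64 for a tight numerical comparison.
betas = torch.linspace(1e-4, 0.02, T, dtype=torch.float64)
alphas = 1.0 - betas
alphas_bar = torch.cumprod(alphas, dim=0)


def iterate_forward_coeffs(t):
    """Apply x_s = sqrt(alpha_s) * x_{s-1} + sqrt(beta_s) * eps for s = 0..t.

    Tracks the scale on x_0 and the variance of the accumulated noise.
    """
    scale, var = 1.0, 0.0
    for s in range(t + 1):
        a = alphas[s].item()
        scale = scale * a ** 0.5    # x_0 is scaled by sqrt(alpha_s) each step
        var = a * var + (1.0 - a)   # old noise shrinks, new noise beta_s is added
    return scale, var


def closed_form_coeffs(t):
    """Coefficients of q(x_t | x_0) read directly from alphas_bar."""
    return alphas_bar[t].sqrt().item(), (1.0 - alphas_bar[t]).item()


for t in (0, 10, T - 1):
    it_scale, it_var = iterate_forward_coeffs(t)
    cf_scale, cf_var = closed_form_coeffs(t)
    assert abs(it_scale - cf_scale) < 1e-9
    assert abs(it_var - cf_var) < 1e-9
```

Because the two agree, x_t for any t can be drawn directly from x_0, so each training step only needs one random t per sample rather than a loop over all T steps.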

zoubohao / DenoisingDiffusionProbabilityModel-ddpm-

How are the T rounds of denoising realized in code? #39