Royalvice / DocDiff

ACM Multimedia 2023: DocDiff: Document Enhancement via Residual Diffusion Models. Also contains 1597 red seals in Chinese scenes, along with their corresponding binary masks.
https://www.aibupt.com/
MIT License
196 stars, 21 forks

Question about gradient backpropagation #16

Closed. yrqs closed this issue 8 months ago

yrqs commented 8 months ago

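Hello, I have a question about one part of the code. In the `forward` method of the `DocDiff` class there is the line `x__ = self.denoiser(torch.cat((noisy_image, x_.clone().detach()), dim=1), t)`. As I read the code, `noisy_image` still carries information from `x_`, so the diffusion loss on the final prediction is still backpropagated to the first U-Net. The `x_` that is concatenated after it, however, is detached. Is this a deliberate design choice? Thanks!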
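The two gradient paths described in this question can be checked with a toy sketch. It is not the DocDiff code: the single-convolution `coarse_net` and `denoiser`, the fixed mixing weights `0.7` and `0.3`, the noise-prediction MSE loss, the omission of the timestep `t`, and the helper name `coarse_grad_norm` are all assumptions made for illustration. Only the `torch.cat((noisy_image, x_.clone().detach()), dim=1)` pattern comes from the line quoted above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins: one conv layer each instead of the real U-Nets.
coarse_net = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # role of the first U-Net
denoiser = nn.Conv2d(6, 3, kernel_size=3, padding=1)    # role of the denoiser (no timestep input here)


def coarse_grad_norm(detach_condition: bool) -> float:
    """Gradient norm that a denoising-only loss leaves on coarse_net's weights."""
    coarse_net.zero_grad()
    denoiser.zero_grad()

    # Same random inputs on every call so the two settings are comparable.
    g = torch.Generator().manual_seed(0)
    degraded = torch.rand(2, 3, 8, 8, generator=g)
    clean = torch.rand(2, 3, 8, 8, generator=g)
    noise = torch.randn(2, 3, 8, 8, generator=g)

    x_ = coarse_net(degraded)
    # Assumption: the diffused quantity is the residual, so it depends on x_.
    residual = clean - x_
    noisy_image = 0.7 * residual + 0.3 * noise  # assumed fixed schedule weights

    # The pattern from the issue: the conditioning copy of x_ may be detached.
    cond = x_.clone().detach() if detach_condition else x_
    x__ = denoiser(torch.cat((noisy_image, cond), dim=1))

    loss = (x__ - noise).pow(2).mean()  # assumed noise-prediction loss
    loss.backward()
    return coarse_net.weight.grad.norm().item()


with_detach = coarse_grad_norm(detach_condition=True)
without_detach = coarse_grad_norm(detach_condition=False)
print(f"coarse_net grad norm, condition detached: {with_detach:.6f}")
print(f"coarse_net grad norm, condition attached: {without_detach:.6f}")
```

In this sketch the gradient on `coarse_net` stays nonzero when the conditioning copy is detached, because `noisy_image` still depends on `x_`. Detaching only removes the second path, the one through the concatenated condition.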

Royalvice commented 8 months ago

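Hello, and thank you for your interest in DocDiff. The detach was chosen based on experimental results: with it, the final results improve slightly. Judging from the experiments, if `x_` takes part in backpropagation with gradients for the first U-Net's parameters, the first U-Net's output shows a color shift (over-guidance). The paper also mentions this in one sentence: "Exactly, the gradient from the loss only flows through $x_{res}$ from $f_\theta$ to $C_\theta$."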
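One way to read this is through the chain rule. The following is a sketch under assumptions that are not taken from the thread: $\hat{x} = C_\theta(y)$ denotes the first U-Net's output for the degraded input $y$, $x_{res} = x_{gt} - \hat{x}$ is the residual, $x_t$ is its noised version, $c$ is the copy of $\hat{x}$ concatenated as the condition, and $\theta_C$ is notation introduced here for the parameters of $C_\theta$. For the diffusion loss term $\mathcal{L}$ alone:

$$
\frac{\partial \mathcal{L}}{\partial \theta_C}
= \frac{\partial \mathcal{L}}{\partial f_\theta}
\left(
\frac{\partial f_\theta}{\partial x_t}\,
\frac{\partial x_t}{\partial x_{res}}\,
\frac{\partial x_{res}}{\partial \hat{x}}
+ \frac{\partial f_\theta}{\partial c}\,
\frac{\partial c}{\partial \hat{x}}
\right)
\frac{\partial \hat{x}}{\partial \theta_C}
$$

With `x_.clone().detach()`, $\partial c / \partial \hat{x} = 0$, so the second term vanishes and only the path through $x_{res}$ remains, which matches the sentence quoted from the paper.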

yrqs commented 8 months ago

Oh, I see. Thanks for your reply!