Royalvice / DocDiff

ACM Multimedia 2023: DocDiff: Document Enhancement via Residual Diffusion Models. Also contains 1597 red seals in Chinese scenes, along with their corresponding binary masks.
https://www.aibupt.com/
MIT License

About config.PRE_ORI #20

Closed: CarlBhy closed this issue 7 months ago

CarlBhy commented 7 months ago

Hi Yang, thank you for your kind words about DocDiff! As mentioned in the readme under "Notes!", we have been working on applying DocDiff to natural scenes with pattern diversity. We made modifications to the config.yml file by setting PRE_ORI: 'False' and TIMESTEPS: 1000. However, we encountered some problems.

In the trainer.py file of the DocDiff code, specifically lines 189 to 194, we have the following code snippet:

```python
if self.pre_ori == 'True':
    if self.high_low_freq == 'True':
        residual_high = self.high_filter(gt.to(self.device) - init_predict)
        ddpm_loss = 2*self.loss(self.high_filter(noise_pred), residual_high) + self.loss(noise_pred, gt.to(self.device) - init_predict)
    else:
        ddpm_loss = self.loss(noise_pred, gt.to(self.device) - init_predict)
else:
    ddpm_loss = self.loss(noise_pred, noise_ref.to(self.device))
```

When self.pre_ori is set to 'False', the ddpm_loss causes noise_pred to learn noise_ref. However, noise_ref represents the added noise. During the training stage, the visualization of 'noise_pred.cpu() + init_predict.cpu()' will therefore result in a noisy init_prediction!

It seems that the issue lies in the visualization step, where the noisy init_prediction is being displayed.
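The effect described above can be reproduced with a small sketch. It is not DocDiff code. It uses NumPy arrays in place of tensors, `training_target` and `preview_error` are hypothetical helper names, the coarse prediction is synthetic, and the denoiser is assumed to be ideal (its output equals its training target).

```python
import numpy as np


def training_target(pre_ori, gt, init_predict, noise_ref):
    """Return the regression target of the denoiser, mirroring the quoted branch."""
    if pre_ori == 'True':
        return gt - init_predict  # the clean residual
    return noise_ref              # the Gaussian noise that was added


def preview_error(pre_ori, seed=0):
    """Mean absolute error of the training preview `noise_pred + init_predict`."""
    rng = np.random.default_rng(seed)
    gt = rng.uniform(0.0, 1.0, size=(3, 8, 8))
    # Hypothetical coarse prediction: ground truth plus a small error.
    init_predict = gt + 0.05 * rng.standard_normal(gt.shape)
    noise_ref = rng.standard_normal(gt.shape)

    # Assumption: an ideal denoiser reproduces its training target exactly.
    noise_pred = training_target(pre_ori, gt, init_predict, noise_ref)
    preview = noise_pred + init_predict
    return float(np.abs(preview - gt).mean())


if __name__ == "__main__":
    for flag in ('True', 'False'):
        print(f"PRE_ORI={flag}: preview error = {preview_error(flag):.4f}")
```

Under these assumptions the preview matches the ground truth only when the target is the residual. When the target is the noise, the preview is the coarse prediction plus unit-variance Gaussian noise, which is consistent with the noisy image reported here.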

CarlBhy commented 7 months ago

image

CarlBhy commented 7 months ago

I'm looking forward to your reply as well. Thank you again for your outstanding work.

Royalvice commented 7 months ago

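Yes, you are right, there is a problem here. I will fix the code. You are also welcome to open a pull request with your reproduction results on natural scenes, to make DocDiff more comprehensive.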

Royalvice commented 7 months ago

Specifically, the residual has to be computed first via the reverse formula, and only then added.
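A minimal sketch of that fix, assuming the standard DDPM forward process x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε with a linear beta schedule. The schedule DocDiff actually uses may differ, and `make_alpha_bar`, `q_sample`, and `predict_x0_from_noise` are hypothetical names, written with NumPy rather than the repository's tensors.

```python
import numpy as np


def make_alpha_bar(timesteps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, timesteps)
    return np.cumprod(1.0 - betas)


def q_sample(x0, t, noise, alpha_bar):
    """Forward process: diffuse the clean residual x0 to step t."""
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise


def predict_x0_from_noise(x_t, t, noise_pred, alpha_bar):
    """Invert the forward process: recover the residual from the predicted noise."""
    return (x_t - np.sqrt(1.0 - alpha_bar[t]) * noise_pred) / np.sqrt(alpha_bar[t])


rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
gt = rng.uniform(0.0, 1.0, size=(3, 8, 8))
init_predict = gt + 0.05 * rng.standard_normal(gt.shape)  # synthetic coarse prediction
noise_ref = rng.standard_normal(gt.shape)

t = 500
x_t = q_sample(gt - init_predict, t, noise_ref, alpha_bar)

noise_pred = noise_ref  # assumption: an ideal denoiser predicts the added noise exactly
residual_hat = predict_x0_from_noise(x_t, t, noise_pred, alpha_bar)
preview = init_predict + residual_hat  # replaces `noise_pred + init_predict`

if __name__ == "__main__":
    print("max abs error of preview:", float(np.abs(preview - gt).max()))
```

With a real network the predicted noise is imperfect, and its error is scaled by sqrt(1 − ᾱ_t) / sqrt(ᾱ_t) in the recovered residual, so previews taken at large t can still look rough.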

CarlBhy commented 7 months ago

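OK, thank you for your reply! One more question: at test time, why do many images show this kind of large color discrepancy? Have you run into a similar problem?

image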

image