zejinwang / Blind2Unblind

This is an official implementation of the CVPR2022 paper "Blind2Unblind: Self-Supervised Image Denoising with Visible Blind Spots".
https://arxiv.org/abs/2203.06967

Inconsistency for Testing #15

Closed gauenk closed 1 year ago

gauenk commented 1 year ago

In the paper, Figure 1 shows that the inference phase requires only the denoiser. But in the code, testing also requires the global masker ($\Omega$). Using only the raw denoiser output yields poor denoising quality.

Can you help me understand -- is the figure in the paper just incorrect?

zejinwang commented 1 year ago

The global mask is included in the code so that users can intuitively compare the three elements involved during the training process. Once training is complete, only a single denoiser is needed at the inference stage to produce denoised results. As for your concern that the denoiser-only output is poor, this is unlikely to happen: it may be that your training has not finished, or that an inappropriate learning rate was chosen, resulting in unstable gradients.
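
In code terms, inference then reduces to something like the following minimal sketch (the function and variable names here are illustrative, not the actual repo code):

```python
import torch

def denoiser_only_inference(network: torch.nn.Module,
                            noisy: torch.Tensor) -> torch.Tensor:
    """Denoiser-only inference: after training, the global masker is
    dropped and the trained network maps a noisy image (N, C, H, W)
    directly to a denoised one in a single forward pass."""
    network.eval()
    with torch.no_grad():
        return network(noisy)  # one forward pass, no masking involved
```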

gauenk commented 1 year ago

Just to clarify: I am saying that line 515 of `test_b2u.py` includes the mask step, and this gives excellent output quality :smile: Is this line what you mean when you say "only a single denoiser is needed at the inference stage to produce denoised results"?

When I take just the raw model output as the denoised image, the quality is poor. For instance, if line 515 is `noisy_output = network(net_input)`, then the restoration quality is poor.

This is totally fine. It just seems inconsistent with the right-hand side of Figure 1: the $\Omega$ masker appears in the code but not in Figure 1.

I am only checking to make sure I am not missing something.

zejinwang commented 1 year ago

Thank you for your detailed description. In fact, the inference stage in Figure 1 corresponds to this line of code: https://github.com/zejinwang/Blind2Unblind/blob/c76964f2e5f041c6b331195d0af4f3d3f9b51359/test_b2u.py#L516, not line 515 where `noisy_output = network(net_input)`. Line 515 is included to demonstrate the poor performance of the masking auxiliary task; it serves only as a gradient medium, which is what enables the high-performance implicit visible task in line 516.
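
For concreteness, here is a simplified sketch of the global-mask-and-remap path, as opposed to the plain `network(net_input)` call. The helper name, the `cell` parameter, and the exact masking layout are assumptions; the repo's own masker implements the real version:

```python
import torch

def global_mask_inference(network: torch.nn.Module,
                          noisy: torch.Tensor,
                          cell: int = 4) -> torch.Tensor:
    """For each of the cell*cell offsets in a cell x cell grid, blank
    those pixels (the blind spots), denoise the masked image, and keep
    only the predictions at the blanked positions, assembling them into
    one output image."""
    _, _, h, w = noisy.shape
    out = torch.zeros_like(noisy)
    with torch.no_grad():
        for i in range(cell):
            for j in range(cell):
                mask = torch.zeros(1, 1, h, w, device=noisy.device)
                mask[..., i::cell, j::cell] = 1.0     # blind-spot positions
                pred = network(noisy * (1.0 - mask))  # denoise with blind spots blanked
                out = out + pred * mask               # remap blind-spot predictions
    return out
```

Under this reading, the question in the thread is simply which of these two paths lines 515 and 516 each correspond to.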

gauenk commented 1 year ago

Thank you for your speedy replies; this is most helpful. However, your comment seems to be the opposite of what I see in my results.

For me, line 515 is excellent while line 516 is not as good. Would you mind confirming whether this finding matches your own results or differs from them?

Again, the result from line 515 is very good, and this is an impressive training method overall. I am here only to clarify.

Xessankit13 commented 1 year ago

Which datasets can we use?

gauenk commented 1 year ago

I use the DAVIS dataset and an internal dataset from my team. The result seems to be the same for both. I will be able to share more details in a few weeks. I just wanted to confirm whether our findings are the same or not.

zejinwang commented 1 year ago

Perhaps I already know what you want to share. Actually, whether or not you map the blind spots back to construct a single noisy image is essentially the same: all that plays a role is the global masking strategy and the re-visible loss. Looking forward to seeing the amazing results.
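
Roughly, the re-visible idea can be sketched as follows. This is a paraphrase rather than the repo code; the exact weighting, the schedule for `lam`, and the stop-gradient placement should be checked against the paper:

```python
import torch

def revisible_loss(blind_out: torch.Tensor,
                   visible_out: torch.Tensor,
                   noisy: torch.Tensor,
                   lam: float) -> torch.Tensor:
    """blind_out is the mask-mapped (blind-spot) prediction and carries
    the gradients; visible_out is the raw denoiser output, detached so
    the network cannot collapse to an identity mapping; lam trades off
    the two branches against the noisy target."""
    fused = (blind_out + lam * visible_out.detach()) / (1.0 + lam)
    return torch.mean((fused - noisy) ** 2)
```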