Closed: gauenk closed this issue 1 year ago
The global mask is included in the code so that users can intuitively observe the differences among the three components during training. Once training is complete, only a single denoiser is needed at the inference stage to produce denoised results. As for your concern that the denoiser output alone is poor: this is unlikely. It may be that your training has not finished, or that an inappropriate learning rate was chosen, leading to unstable gradients.
Just to clarify: I am saying that line 515 of `test_b2u.py` includes the mask step, and it gives excellent output quality :smile: Is this line what you mean when you say "only a single denoiser is needed at the inference stage to produce denoised results"?
When I take only the model's direct output as the denoised image, the quality is poor. For instance, if line 515 is `noisy_output = network(net_input)`, then the restoration quality is poor.
This is totally fine. It just seems inconsistent with the right-hand side of Figure 1: the global masker ("Omega") appears in the code, but not in Figure 1.
I am only checking to make sure I am not missing something.
Thank you for your detailed description. In fact, the inference stage in Figure 1 corresponds to this line of code: https://github.com/zejinwang/Blind2Unblind/blob/c76964f2e5f041c6b331195d0af4f3d3f9b51359/test_b2u.py#L516, not line 515 (`noisy_output = network(net_input)`). Line 515 is included only to demonstrate the poor performance of the masking auxiliary task; it serves merely as a gradient medium that enables the high-performance implicit visible task in line 516.
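For anyone landing on this thread, here is a minimal sketch of the two inference paths being compared. The 2x2 cell masker and the `box_blur` stand-in for the trained `network` are my own illustrative assumptions, not the repository's actual implementation; the real global masker and recombination live in `test_b2u.py`.

```python
import numpy as np

def global_mask_inference(denoiser, noisy, cell=2):
    """Masked inference sketch: for each position in a cell x cell grid,
    blind out that subset of pixels, denoise the masked image, and keep
    the denoiser's predictions only at the blinded locations. The union
    of all subsets covers every pixel, yielding one full image."""
    h, w = noisy.shape
    out = np.zeros_like(noisy)
    for i in range(cell):
        for j in range(cell):
            mask = np.zeros((h, w), dtype=bool)
            mask[i::cell, j::cell] = True
            masked = noisy.copy()
            # stand-in blinding: replace blinded pixels with the image mean
            masked[mask] = noisy.mean()
            pred = denoiser(masked)
            out[mask] = pred[mask]
    return out

def box_blur(img):
    """Toy 'denoiser' (a 3x3 box blur), standing in for network()."""
    padded = np.pad(img, 1, mode="edge")
    return sum(padded[di:di + img.shape[0], dj:dj + img.shape[1]]
               for di in range(3) for dj in range(3)) / 9.0

rng = np.random.default_rng(0)
clean = np.ones((8, 8))
noisy = clean + 0.1 * rng.standard_normal((8, 8))

direct = box_blur(noisy)                         # analogue of line 515
masked = global_mask_inference(box_blur, noisy)  # analogue of line 516
```

The point of contention in this thread is simply which of these two outputs the right-hand side of Figure 1 depicts.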
Thank you for your speedy replies; this is most helpful. However, your comment seems to be the opposite of what I see in my results.
For me, line 515 is excellent while line 516 is not as excellent. Would you mind confirming whether your results show the same?
Again, the result in line 515 is very very good and this is overall an impressive training method. I am here only to clarify.
What datasets are you using?
I use the DAVIS dataset and a dataset internal to my team; the result seems to be the same for both. I will be able to share more details in a few weeks. I just wanted to check how our findings compare.
Perhaps I already know what you want to share. Actually, whether or not the blind spots are mapped back to construct a single noisy image is essentially the same. What matters is the global masking strategy and the re-visible loss. Looking forward to seeing the amazing results.
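To make the re-visible idea concrete, here is one illustrative way it can be written down: blend the masked-branch prediction with the fully visible prediction and compare the blend to the noisy target, so the visible branch receives gradients through the blind objective. The blend weight `eta` and this exact weighting are assumptions on my part, and whether either branch is gradient-detached in the actual implementation is not shown here; consult the paper and repo for the precise loss.

```python
import numpy as np

def revisible_loss(masked_pred, direct_pred, noisy, eta=1.0):
    """Illustrative re-visible objective (weighting is an assumption):
    blend the blind (masked) prediction with the fully visible
    prediction, then measure the MSE between the blend and the noisy
    target. In training, gradients flow to the visible branch through
    the blend, the 'gradient medium' role described in this thread."""
    blend = (masked_pred + eta * direct_pred) / (1.0 + eta)
    return float(np.mean((blend - noisy) ** 2))

rng = np.random.default_rng(1)
y = rng.standard_normal((8, 8))             # noisy target
h = y + 0.05 * rng.standard_normal((8, 8))  # masked-branch prediction
f = y + 0.05 * rng.standard_normal((8, 8))  # direct (visible) prediction
loss = revisible_loss(h, f, y)
```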
In the paper, Figure 1 shows the inference phase only requires the denoiser. But in the code, testing requires the global masker ($\Omega$). Using only the denoised output yields poor denoising quality.
Can you help me understand -- is the figure in the paper just incorrect?