TaoHuang2018 / Neighbor2Neighbor

Neighbor2Neighbor: Self-Supervised Denoising from Single Noisy Images
BSD 3-Clause "New" or "Revised" License

Some experimental results inaccurate due to a problem in training code #16

Open TomHeaven opened 1 year ago

TomHeaven commented 1 year ago

The paper proposes a novel method for self-supervised image denoising. However, I find some experimental results inaccurate due to a problem in the training code.

The problem is that clean images should not be used during the training epochs. In your current implementation, noise is randomly generated and added to the clean images in every epoch. In this way, the network sees different noisy images generated from the same clean images across epochs; in effect, the training scheme falls back to "noise2noise" across epochs.

The correct implementation is to generate the noisy images once, before training, and reuse the same fixed noisy images in every epoch. A minimal sketch of the difference is shown below, using hypothetical `PerEpochNoiseDataset` / `FixedNoiseDataset` classes rather than the actual dataset code in this repository:
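```python
import numpy as np
from torch.utils.data import Dataset

class PerEpochNoiseDataset(Dataset):
    """What the current training code effectively does: a fresh noise
    realization is drawn every time an image is loaded, so across epochs
    the network sees many different noisy copies of the same clean image."""
    def __init__(self, clean_images, sigma=25.0):
        self.clean = clean_images
        self.sigma = sigma

    def __len__(self):
        return len(self.clean)

    def __getitem__(self, idx):
        img = self.clean[idx]
        return img + np.random.randn(*img.shape).astype(np.float32) * self.sigma

class FixedNoiseDataset(Dataset):
    """The corrected scheme: noise is sampled exactly once, so each clean
    image has a single fixed noisy realization for the whole training run."""
    def __init__(self, clean_images, sigma=25.0):
        self.noisy = [
            img + np.random.randn(*img.shape).astype(np.float32) * sigma
            for img in clean_images
        ]

    def __len__(self):
        return len(self.noisy)

    def __getitem__(self, idx):
        return self.noisy[idx]
```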

I repeated the experiment with the corrected implementation and obtained the following results on Gaussian ($\sigma=25$) noisy images:

| Dataset | PSNR | SSIM |
|---------|-------|--------|
| Kodak   | 31.81 | 0.8668 |
| BSD300  | 30.59 | 0.8610 |
| Set14   | 30.56 | 0.8408 |

(Using model parameters from training epoch 91)

These PSNR values are around 0.4-0.5 dB lower than the numbers reported in Table 1 of the paper.

alwaysuu commented 1 year ago

Hi, I think Noise2Noise can work when paired noisy realizations of the same latent image are available in every training iteration, but the "noise2noise across epochs" you mentioned should not work. The reason for generating random noisy images in each epoch is to keep the number of noise samples sufficiently large, which ensures good performance.

How many noisy images did you use to train the model? I think performance similar to Neighbor2Neighbor's reported numbers can be achieved if you use a larger clean database (for example, 1M images) to generate the noisy images.

TomHeaven commented 1 year ago

Hi, I understand your idea. Expanding the training set may or may not improve performance (due to the domain gap between the training set and the test set). The key point of my concern is that we should never be able to observe a test image multiple times, even with different additive noise. For example, one can simply average the multiple noisy observations to estimate the clean image. In other words, multiple noisy observations of the same clean image leak clean-image information to the model/algorithm, so the network also benefits from that leak through its input (see the small sketch below). @alwaysuu
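To illustrate the leak numerically, here is a small sketch on synthetic data (not an experiment from the paper): averaging $k$ independent noisy observations of the same clean image reduces the residual noise standard deviation to $\sigma/\sqrt{k}$, so the PSNR of the average against the clean image keeps rising with $k$.

```python
import numpy as np

rng = np.random.default_rng(0)
clean = rng.uniform(0, 255, size=(256, 256))  # stand-in for a clean image
sigma = 25.0

def psnr(x, ref, peak=255.0):
    mse = np.mean((x - ref) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

for k in (1, 4, 16, 64):
    # k independent noisy observations of the same clean image
    noisy = clean + rng.normal(0.0, sigma, size=(k,) + clean.shape)
    estimate = noisy.mean(axis=0)  # simple average: noise std drops to sigma / sqrt(k)
    print(f"k={k:2d}  PSNR={psnr(estimate, clean):.2f} dB")
```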