nagejacob / SpatiallyAdaptiveSSID

Spatially Adaptive Self-Supervised Learning for Real-World Image Denoising (CVPR 2023)
GNU General Public License v3.0

Will noise be considered part of the image? #12

Closed TwilightArchon closed 3 months ago

TwilightArchon commented 3 months ago

Hello! Thank you for your awesome work! I'm using a self-supervised network for image denoising. Because paired images are very difficult to collect in real-world settings, I want to fit the network to my own noise, so I plan to train on my own image set. However, if all the images come from one noise distribution, will the network treat the noise as part of the image? Also, what would be a good training set size? And do we need to train for 400k iterations? Thank you!

nagejacob commented 3 months ago

I believe the network won't overfit to the noisy images even if they come from the same noise distribution, as we have tried our method on noisy images other than SIDD and DND. I think 100 images larger than $1920\times 1080$ are sufficient to train the network from scratch. It took less than 3 days to train for a total of 1200k iterations (400k for each stage) with patch size $256\times 256$ on a single RTX3090. If you want to shorten the training time, just reduce the patch size to $128\times 128$ at the cost of a minor performance drop.

TwilightArchon commented 3 months ago

Thank you so much for your answer! Right now I'm trying to train the network. During training, the image is cropped to $256\times 256$, which goes through the downsampling and upsampling correctly. But the decoder might output different sizes after each layer. For example, the third encoding stage has a skip connection to the third decoding stage, and one of them has width 175 while the other has width 174, which causes issues with concatenation and upsampling. Should I just interpolate them to the same shape so they can be concatenated, or is there a better or easier way, since there might be a lot of places where this interpolation would have to be added?

nagejacob commented 3 months ago

How could $256\times 256$ be downsampled to 175/174? There should not be such problems.
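The arithmetic behind this exchange can be sketched as follows. This is only an illustration, assuming each encoder stage halves the width with floor division and each decoder stage doubles it; the function name `skip_mismatches`, the default of three levels, and the width 700 are hypothetical and not taken from the repository.

```python
def skip_mismatches(width, levels=3):
    """List (level, encoder_width, upsampled_width) where a skip would not line up.

    Assumption: every downsample is floor(width / 2) and every upsample is
    an exact 2x, as in a plain U-Net. The real network may differ.
    """
    mismatches = []
    current = width
    for level in range(1, levels + 1):
        down = current // 2      # width after this level's downsample
        up = down * 2            # width the decoder produces when coming back up
        if up != current:        # happens exactly when `current` is odd
            mismatches.append((level, current, up))
        current = down
    return mismatches


print(skip_mismatches(256))  # [] : 256 halves cleanly three times
print(skip_mismatches(700))  # [(3, 175, 174)] : 700 -> 350 -> 175 -> 87 -> 174
```

Under these assumptions a $256\times 256$ training crop never produces a mismatch, while a non-square or otherwise odd-sized input can, which is consistent with the 175/174 pair reported above.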

TwilightArchon commented 3 months ago

You are right, $256\times 256$ will not have such problems, but what about an image that is not square? Should I crop it into squares, run each square independently, and stitch the results back together? Thank you!

nagejacob commented 3 months ago

For inference, you could reflect-pad the input image to a square and crop the denoised result.
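A minimal sketch of that suggestion, using NumPy rather than the repository's own code: the function `denoise_with_square_padding`, the callable `denoise_fn`, and the optional `multiple` argument (rounding the square side up to a multiple of the network's downsampling factor) are all assumptions added for illustration, not part of the author's answer.

```python
import numpy as np


def denoise_with_square_padding(image, denoise_fn, multiple=1):
    """Reflect-pad an H x W (x C) array to a square, denoise, crop back.

    `denoise_fn` is a hypothetical callable mapping an array to an array of
    the same shape. `multiple` is an assumed extra: the square side is
    rounded up to a multiple of it (e.g. 8 for three 2x downsamples).
    """
    h, w = image.shape[:2]
    side = -(-max(h, w) // multiple) * multiple   # ceil to a multiple
    # Pad only at the bottom and right so the original sits at the top-left.
    pad_spec = [(0, side - h), (0, side - w)] + [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image, pad_spec, mode="reflect")
    denoised = denoise_fn(padded)
    return denoised[:h, :w]                       # drop the padded border


# Usage with an identity "denoiser" standing in for the real network.
noisy = np.random.rand(5, 7, 3).astype(np.float32)
restored = denoise_with_square_padding(noisy, lambda x: x, multiple=4)
print(restored.shape)
```

Padding on one side only keeps the crop a simple slice. If the real model runs in PyTorch, `torch.nn.functional.pad` with `mode="reflect"` plays the same role, but it requires each pad width to be smaller than the corresponding input dimension.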

TwilightArchon commented 3 months ago

Thanks! I'll try that.