Open Luke2642 opened 8 months ago
Hi Luke, of course, I will take a look at it. As far as the others go, I have a custom UNet/ConvNeXt architecture I haven't pushed due to lack of time.
On Thu, 21 Mar 2024 at 12:18 PM, Luke Perkin <@.***> wrote:
A fantastic repository, thank you.
I'm just getting started with it, but I thought I'd reach out and ask whether you'd accept a future pull request that trains for human perceptual quality rather than MAE / PSNR?
I'm thinking a simple way to achieve a partial solution is to retrain on images in a colour space like OKLAB, where perceptual difference is baked in and the perceived colour-difference formula is trivial instead of monstrous!
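Something like this sketch is what I mean (plain numpy, matrices taken from Björn Ottosson's published OKLab reference, not anything in this repo):

```python
# Minimal sketch: convert sRGB to OKLab, where Euclidean distance is a
# reasonable approximation of perceived colour difference.
import numpy as np

def srgb_to_oklab(rgb):
    """rgb: float array (..., 3) in [0, 1], sRGB-encoded."""
    # Undo the sRGB transfer curve -> linear light.
    linear = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    # Linear sRGB -> LMS-like cone response (Ottosson's M1).
    m1 = np.array([[0.4122214708, 0.5363325363, 0.0514459929],
                   [0.2119034982, 0.6806995451, 0.1073969566],
                   [0.0883024619, 0.2817188376, 0.6299787005]])
    lms = linear @ m1.T
    # Cube root compresses the response, mimicking perception.
    lms_ = np.cbrt(lms)
    # LMS' -> OKLab (Ottosson's M2).
    m2 = np.array([[0.2104542553,  0.7936177850, -0.0040720468],
                   [1.9779984951, -2.4285922050,  0.4505937099],
                   [0.0259040371,  0.7827717662, -0.8086757660]])
    return lms_ @ m2.T

def delta_e_ok(img_a, img_b):
    """Perceived colour difference: plain Euclidean distance in OKLab."""
    return np.linalg.norm(srgb_to_oklab(img_a) - srgb_to_oklab(img_b), axis=-1)
```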
I was also thinking an 'edge loss' could be good: extra channels for the horizontal and vertical image gradients during training with the L1 loss, discarded afterwards.
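For concreteness, here's a rough sketch of the kind of gradient term I have in mind (forward differences in numpy; the weight is a made-up hyperparameter, not anything from this repo):

```python
# Sketch: L1 pixel loss plus an L1 penalty on horizontal and vertical
# finite-difference gradients, rewarding the network for reproducing edges.
import numpy as np

def edge_l1_loss(pred, target, edge_weight=0.5):
    """pred, target: float arrays (H, W, C). Returns a scalar loss."""
    pixel_l1 = np.abs(pred - target).mean()
    # Horizontal and vertical gradients via forward differences.
    dx_p, dx_t = np.diff(pred, axis=1), np.diff(target, axis=1)
    dy_p, dy_t = np.diff(pred, axis=0), np.diff(target, axis=0)
    grad_l1 = np.abs(dx_p - dx_t).mean() + np.abs(dy_p - dy_t).mean()
    return pixel_l1 + edge_weight * grad_l1
```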
A Rolls-Royce solution might be an adversarial loss, perhaps using a secondary network like Netflix's VMAF or something?
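As far as I know VMAF itself isn't differentiable, so a common stand-in is a feature-space loss against a frozen pretrained network. A sketch of that idea below, using torchvision's VGG16 purely for illustration (the layer cut-off is my guess, and this isn't tied to this repo's code):

```python
# Sketch: perceptual loss as L1 distance between frozen VGG16 features,
# which tends to track perceived quality better than pixel-wise error.
import torch
import torch.nn.functional as F
import torchvision

class PerceptualLoss(torch.nn.Module):
    def __init__(self, cut=16):
        super().__init__()
        weights = torchvision.models.VGG16_Weights.DEFAULT
        # Frozen feature extractor: keep the first `cut` stages of VGG16.
        self.features = torchvision.models.vgg16(weights=weights).features[:cut].eval()
        for p in self.features.parameters():
            p.requires_grad_(False)

    def forward(self, pred, target):
        # pred, target: (N, 3, H, W) in [0, 1]. In practice you would also
        # apply the ImageNet mean/std normalisation the backbone expects.
        return F.l1_loss(self.features(pred), self.features(target))
```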
If there are any resources related to perceptual quality rather than PSNR, please do point me in the right direction :-)
Thanks! I'm glad 1 & 2 seem sensible and interesting to you!
Your four targets are well chosen, focused on being lightweight and fast. I've just realised that 3 & 4 actually amount to a new target: training a blind restoration diffusion model that boosts the perceptual quality of noisy natural images, not just reduces statistical noise.
Searching for those keywords, "blind restoration diffusion model", I just found:
DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior https://github.com/XPixelGroup/DiffBIR
But it is very heavy: even the lightest variant is ~90 MB, and it uses a transformer, so it is heavier on compute. It will be exciting to see how close we can get with under 5 MB of parameters in a bias-free CNN. Diffusion models are incredibly impressive though; I was blown away by the recent work from Zahra Kadkhodaie, Florentin Guth, Eero Simoncelli, and Stéphane Mallat, which follows on from BF-CNN:
https://arxiv.org/abs/2310.02557
https://www.youtube.com/watch?v=V_t6QppPbwQ