kornia / kornia

Geometric Computer Vision Library for Spatial AI
https://kornia.readthedocs.io
Apache License 2.0

[Feature request] Bayer preserving augmentation techniques #1445

Open copaah opened 2 years ago

copaah commented 2 years ago

🚀 Feature

Learning directly from raw Bayer images is an exciting idea that would make the entire learning pipeline end-to-end. However, doing this presents its own challenges.

One of them is preserving the Bayer pattern when applying common image augmentation techniques such as image flipping.

This feature request is to add to Kornia a set of augmentation techniques that preserve Bayer patterns, so that deep learning models trained directly on raw Bayer images can continue to benefit from augmentation.
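A minimal sketch, assuming an RGGB layout, of why a plain vertical flip breaks the CFA phase and how dropping a border row restores it:

import torch

# Toy 4x4 RGGB mosaic; values encode the channel at each site (R=0, G=1, B=2).
tile = torch.tensor([[0, 1],
                     [1, 2]])
bayer = tile.repeat(2, 2)

flipped = torch.flip(bayer, dims=[0])  # plain vertical flip
print(flipped[:2, :2])                 # tensor([[1, 2], [0, 1]]) -> GBRG, phase broken

# One possible remedy: drop one row at each border after the flip so the
# top-left sample is an R site again (at the cost of two rows of content).
fixed = flipped[1:-1, :]
print(fixed[:2, :2])                   # tensor([[0, 1], [1, 2]]) -> RGGB restored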

Motivation

Research into learning from raw Bayer images is an exciting new field, and adding augmentation support to Kornia would be a good way to aid this research.

Pitch

In the kornia.augmentation module add two new classes:

RandomVerticalFlipBayerPreserving(return_transform=None, same_on_batch=False, p=0.5, p_batch=1.0, keepdim=False)

RandomHorizontalFlipBayerPreserving(return_transform=None, same_on_batch=False, p=0.5, p_batch=1.0, keepdim=False)

Alternatives

N/A

Additional context

See this GitHub repo that already implements it:

edgarriba commented 2 years ago

We recently added color conversions in that direction: https://kornia.readthedocs.io/en/latest/color.html#bayer-raw Please verify whether those could work with the proposed ideas. /cc @oskarflordal @shijianjian
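As a quick illustration, a round trip through those conversions might look like this (a sketch assuming the kornia.color raw API and its CFA enum):

import torch
import kornia

raw = torch.rand(1, 1, 4, 4)  # single-channel mosaic, (B, 1, H, W), H and W even

# Demosaic to RGB and mosaic back, picking one CFA layout explicitly.
rgb = kornia.color.raw_to_rgb(raw, cfa=kornia.color.CFA.BG)
raw_again = kornia.color.rgb_to_raw(rgb, cfa=kornia.color.CFA.BG)

print(rgb.shape)        # torch.Size([1, 3, 4, 4])
print(raw_again.shape)  # torch.Size([1, 1, 4, 4])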

oskarflordal commented 2 years ago

What is suggested is doing augmentations directly on the Bayer image, i.e. without conversions. Given the limitations of such augmentations (cropping needed for flips/translation, rotation not possible without breaking the CFA) and the nature of what an augmentation is: what would be the strong case for not doing e.g. (Bayer -> color) -> augment -> (color -> Bayer), ideally with a higher-quality debayer and a bit of noise at the end? My gut feeling is that this would give you better variations, and since these are augmentations it doesn't affect the main use case of going directly from sensor to network. If you are looking to mimic e.g. fixed-pattern noise or similar, that is going to be broken by the augmentation anyway. With such a pipeline in place it is also easier to reuse annotated RGB data.

I don't know enough about the augmentation pipeline itself to understand how invasive it would be. I guess you would want to allow only certain augmentations: noise (ideally color-aware), flipping/translation, random crops (on 2x2 borders), image-degrading augmentations like stuck-at-1/0 pixels, and other things that can be done per pixel/area. @edgarriba
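A sketch of that route, combining the existing raw conversions with standard kornia augmentations (the CFA choice and the noise level below are placeholder assumptions):

import torch
import kornia
from kornia import augmentation as K

cfa = kornia.color.CFA.BG  # assumed sensor layout

aug = K.AugmentationSequential(
    K.RandomHorizontalFlip(p=0.5),
    K.RandomVerticalFlip(p=0.5),
)

def augment_via_rgb(raw: torch.Tensor) -> torch.Tensor:
    # (Bayer -> color) -> augment -> (color -> Bayer), plus a bit of noise.
    rgb = kornia.color.raw_to_rgb(raw, cfa=cfa)
    rgb = aug(rgb)
    out = kornia.color.rgb_to_raw(rgb, cfa=cfa)
    return out + 0.001 * torch.randn_like(out)

raw = torch.rand(2, 1, 8, 8)       # (B, 1, H, W)
print(augment_via_rgb(raw).shape)  # torch.Size([2, 1, 8, 8])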

copaah commented 2 years ago

@oskarflordal

This is probably a stupid question, but how would you go from color back to Bayer without keeping track of the CFA throughout the augmentation pipeline?

oskarflordal commented 2 years ago

(Note that I am not really part of the core Kornia team; I am just interested in understanding the use case.) I'm not sure you need to keep track of it. You can go from an RGB representation to raw with any CFA; if you train your algorithm for a specific sensor, I suppose you convert using that fixed CFA no matter what data source you have. Obviously, starting from a raw image you would lose a lot of the sensor-specific defects on the way through the augmentation pipeline. Thinking more about it, I guess there are two main cases where you need raw augmentation.
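For example, the same annotated RGB image can be remosaiced under whichever CFA the target sensor uses (again a sketch assuming the kornia.color raw conversions):

import torch
import kornia

rgb = torch.rand(1, 3, 4, 4)  # e.g. an annotated RGB training image

# Pick the layout of the target sensor; nothing has to be tracked upstream.
raw_bg = kornia.color.rgb_to_raw(rgb, cfa=kornia.color.CFA.BG)
raw_gb = kornia.color.rgb_to_raw(rgb, cfa=kornia.color.CFA.GB)

print(raw_bg.shape)                 # torch.Size([1, 1, 4, 4])
print(torch.equal(raw_bg, raw_gb))  # almost surely False: different sampling of the same scene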

shijianjian commented 2 years ago

Just looked at the paper; my question is: would the Bayer-preserving augmentations still be helpful if combined with common non-Bayer-preserving methods (e.g. rotation, scaling)?

From the implementation perspective, I think it is better to keep it outside of the current augmentation pipeline, as:

class RandomVerticalFlipBayerPreserving

class RandomHorizontalFlipBayerPreserving

Then it can work like:

AugmentationSequential(
    RandomVerticalFlipBayerPreserving(),
    RandomHorizontalFlipBayerPreserving(),
    preprocessing=RgbToRaw(),
    postprocessing=RawToRGB(),
)
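For reference, a hypothetical sketch of what such a class could do internally, written as a plain torch.nn.Module (not wired into kornia's augmentation base classes) and assuming an RGGB-like layout with even height: flip vertically, then shift by one row so the CFA phase of the top-left pixel is unchanged.

import torch
from torch import nn
import torch.nn.functional as F

class RandomVerticalFlipBayerPreserving(nn.Module):
    # Hypothetical sketch; not the proposed kornia implementation.
    def __init__(self, p: float = 0.5) -> None:
        super().__init__()
        self.p = p

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, 1, H, W) raw mosaic with even H.
        if torch.rand(()) > self.p:
            return x
        flipped = torch.flip(x, dims=[-2])
        # Drop the top row and zero-pad one row at the bottom so the Bayer
        # phase of the top-left pixel is preserved; cropping or replication
        # padding would be reasonable alternatives at the border.
        return F.pad(flipped[..., 1:, :], (0, 0, 0, 1))

The horizontal variant would do the same along the last dimension.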