NRCan / geo-deep-learning

Deep learning applied to georeferenced datasets
https://geo-deep-learning.readthedocs.io/en/latest/
MIT License

Test-time augmentation: noise, blur, brightness, contrast, geometric scaling, etc. #106

Closed remtav closed 3 years ago

remtav commented 4 years ago

Wouldn't trained models gain robustness if the data augmentations (utils/augmentation.py) included brightness and contrast shifts (e.g. random, 10-15%)? I imagine it could also help with overfitting issues.
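To make the suggestion concrete, here is a minimal sketch of a random brightness and contrast shift. It assumes a float image scaled to [0, 1] with shape (bands, height, width). The function name `random_brightness_contrast` and the `max_shift` parameter are hypothetical and are not part of utils/augmentation.py.

```python
import numpy as np

def random_brightness_contrast(image, max_shift=0.15, rng=None):
    """Randomly shift brightness and contrast of a float image in [0, 1].

    image: array of shape (bands, height, width).
    max_shift: maximum relative change, e.g. 0.15 for +/-15%.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Contrast: scale values around the image mean by a factor
    # drawn from [1 - max_shift, 1 + max_shift].
    contrast = 1.0 + rng.uniform(-max_shift, max_shift)
    # Brightness: add an offset drawn from [-max_shift, max_shift],
    # expressed as a fraction of the full [0, 1] range.
    brightness = rng.uniform(-max_shift, max_shift)
    mean = image.mean()
    out = (image - mean) * contrast + mean + brightness
    # Keep the result inside the assumed valid data range.
    return np.clip(out, 0.0, 1.0)
```

Both operations only remap pixel values, so the spatial content of the tile and its label mask stay aligned.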

mpelchat04 commented 4 years ago

That's true for satellite imagery. But since brightness and contrast shifts are only histogram modifications, there are a few considerations:

  1. EO data other than satellite or aerial imagery (e.g. products derived from DEMs [slope, aspect, etc.]) have a specific data range, and I'm not convinced that messing with these histograms would really help.
  2. We could add this feature as a parameter in the config file. But we would have to be able to turn it on/off for each band. For example, let's say we use images with RGB+slope as input data. We'd like to apply contrast and/or brightness shifts on the RGB bands but probably not on the slope band...
  3. We would probably have to manage those functions ourselves, since data augmentation libraries such as Albumentations only manage 3-band data (for now).

It's worth the discussion, though. I think the first step would be to test the first hypothesis and work from there. I'll add this to our "tests wishlist".
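The per-band on/off idea from point 2 could look roughly like the sketch below, using the RGB+slope example. The helper `augment_selected_bands` and the boolean `band_mask` are assumptions made for illustration; nothing like this is confirmed to exist in the project or in its config file.

```python
import numpy as np

def augment_selected_bands(image, transform, band_mask):
    """Apply `transform` only to the bands flagged True in `band_mask`.

    image: array of shape (bands, height, width).
    band_mask: sequence of bool, one entry per band.
    """
    band_mask = np.asarray(band_mask, dtype=bool)
    if band_mask.shape[0] != image.shape[0]:
        raise ValueError("band_mask needs one entry per band")
    out = image.copy()
    if band_mask.any():
        # Only the selected bands go through the transform;
        # the others are returned untouched.
        out[band_mask] = transform(image[band_mask])
    return out

# RGB + slope: shift brightness on RGB, leave slope as-is.
image = np.random.default_rng(0).random((4, 16, 16))
brighter = lambda x: np.clip(x + 0.1, 0.0, 1.0)
result = augment_selected_bands(image, brighter, [True, True, True, False])
```

Keeping the band selection outside the transform means the same mask can be reused for any radiometric augmentation, while geometric augmentations would still be applied to all bands.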

Math

remtav commented 4 years ago

Augmentation techniques should be reviewed (noise, blur, geometric scaling, etc.). Overall, results could improve slightly if different augmentation strategies were tested. For example, test-time augmentation could be interesting to try out.

This paper deserves a look: https://link.springer.com/article/10.1186/s40537-019-0197-0

Edited June 25th 2020: Mathieu, the point you raised is important to keep in mind. Currently, random radiometric trim augmentation is applied to all bands at once. We'll have to think about how it could be applied to only a subset of those bands. First, I imagine we'd have to tell GDL which bands it will be seeing, by name and in what order (e.g. RGB, not band 012 or 123), then identify which bands need to be augmented. To be discussed.
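One possible way to go from band names to the bands that should be augmented is sketched below. The names `band_order` and `augment_bands`, and the function `bands_to_augment`, are hypothetical; how GDL would actually receive this information is still to be discussed.

```python
def bands_to_augment(band_order, augment_bands):
    """Return the indices of the bands in `band_order` that should be augmented.

    band_order: band names in the order they appear in the input tensor.
    augment_bands: names of the bands that radiometric augmentation applies to.
    """
    unknown = set(augment_bands) - set(band_order)
    if unknown:
        # Fail early if the request names a band the input does not contain.
        raise ValueError(f"Unknown band names: {sorted(unknown)}")
    return [i for i, name in enumerate(band_order) if name in augment_bands]

# Input declared by name and order, e.g. RGB followed by a slope band.
band_order = ["R", "G", "B", "slope"]
indices = bands_to_augment(band_order, {"R", "G", "B"})
```

Declaring bands by name avoids ambiguity between 0-based and 1-based band numbering.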

remtav commented 4 years ago

This implementation could be a good start: https://github.com/SpaceNetChallenge/SpaceNet_SAR_Buildings_Solutions/blob/master/2-MaksimovKA/predict/tta.py
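For reference, the general idea of test-time augmentation can be sketched in a few lines. This is a generic flip-based version written for illustration, not a copy of the linked file. `predict_fn` is a hypothetical stand-in for a trained segmentation model that maps a (bands, height, width) array to a (classes, height, width) array.

```python
import numpy as np

def predict_with_tta(predict_fn, image):
    """Average predictions over the original image and its two flips.

    predict_fn: callable mapping (bands, H, W) -> (classes, H, W).
    image: array of shape (bands, H, W).
    """
    # Each entry pairs a forward transform with its inverse.
    # A flip is its own inverse, so the same slicing undoes it.
    transforms = [
        (lambda x: x, lambda y: y),                          # identity
        (lambda x: x[:, :, ::-1], lambda y: y[:, :, ::-1]),  # horizontal flip
        (lambda x: x[:, ::-1, :], lambda y: y[:, ::-1, :]),  # vertical flip
    ]
    preds = []
    for forward, inverse in transforms:
        pred = predict_fn(np.ascontiguousarray(forward(image)))
        # Map the prediction back to the original orientation before averaging.
        preds.append(inverse(pred))
    return np.mean(preds, axis=0)
```

The inverse step matters for segmentation: averaging is only meaningful once every prediction is back in the same pixel grid as the input.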