Handle undersampling due to lots of images without burned areas - Githubissues

developmentseed / chabud2023

Change detection for Burned area Delineation (ChaBuD) ECML/PKDD 2023 challenge


Handle undersampling due to lots of images without burned areas #12

Open weiji14 opened 1 year ago

weiji14 commented 1 year ago

The extra Sentinel-2 imagery dataset at https://huggingface.co/datasets/chabud-team/chabud-extra does not contain any burned areas, according to https://huggingface.co/datasets/chabud-team/chabud-extra/discussions/1. If we include it in training, the ratio of burned to unburned pixels will become severely imbalanced.
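To make the imbalance concrete, here is a rough sketch of how the burned-pixel fraction drops once tiles with no burned pixels are added. The pixel counts are hypothetical placeholders for illustration, not measurements from either dataset, and `burned_fraction` is just a helper name made up for this example.

```python
def burned_fraction(burned_pixels, total_pixels):
    """Share of pixels labelled as burned."""
    return burned_pixels / total_pixels


# Hypothetical counts, for illustration only (not measured from the datasets).
labelled_burned, labelled_total = 200_000, 1_000_000
extra_total = 4_000_000  # the extra tiles contribute zero burned pixels

before = burned_fraction(labelled_burned, labelled_total)
after = burned_fraction(labelled_burned, labelled_total + extra_total)
print(f"burned fraction before: {before:.1%}, after: {after:.1%}")
```

The burned pixel count stays fixed while the denominator grows, so every extra tile pushes the positive class share further down.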

Some potential ways to handle the extra data to improve model performance:

- [ ] Loss functions that handle foreground/background classes properly
  - [ ] Focal Loss
  - [ ] Dice Loss
- [ ] Self-supervised pre-training
  - [ ] Develop pretext tasks that make use of the extra data, generate useful embeddings on all the given data, and then fine-tune on images with burned areas only
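For reference, a minimal dependency-free sketch of the two loss functions listed above, written over flattened per-pixel probabilities. This is not the repo's implementation. The function names, the defaults (`gamma=2.0`, `alpha=0.25`, `smooth=1.0`) and the flat-list input format are assumptions picked for illustration. A real training loop would use a tensor library instead.

```python
import math


def focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over flattened pixels.

    probs: predicted burned-area probabilities in [0, 1]
    targets: ground-truth labels, 1 = burned, 0 = unburned
    """
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        p_t = p if t == 1 else 1.0 - p  # probability given to the true class
        alpha_t = alpha if t == 1 else 1.0 - alpha  # per-class weight
        # (1 - p_t) ** gamma down-weights pixels that are already easy
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2 * overlap + smooth) / (sum(P) + sum(T) + smooth)."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    # smooth keeps the ratio defined when a tile has no burned pixels at all
    return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)


# A tile with no burned pixels, predicted as mostly unburned (hypothetical values)
probs = [0.1] * 8
targets = [0] * 8
print(focal_loss(probs, targets), dice_loss(probs, targets))
```

Focal loss shrinks the contribution of the many easy unburned pixels, while Dice loss scores overlap rather than per-pixel accuracy. For tiles with no burned pixels, Dice depends entirely on the smoothing term, which is worth keeping in mind if the extra dataset is mixed in. In PyTorch, `torchvision.ops.sigmoid_focal_loss` could likely stand in for the hand-written focal loss.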