geohacker opened this issue 1 year ago
Training Strategy
We are labeling the segmentation masks from scratch, and given the complexity of differentiating between the classes of interest to RM, generating the chips is taking us quite some time.
In the allocated budget of ~125 hours, we can generate approximately 1000 chips of size 256x256 as ground truth for our model. This is not sufficient to build a decent segmentation model for all the eight corridors.
As a workaround, we are trying a weakly supervised training approach:
After we have a model pre-trained on weakly supervised labels, we can fine-tune it on chips generated by our data team, i.e. labels that are more precise & designed for Sentinel imagery.
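The pre-train then fine-tune handoff can be sketched as below. A tiny convolutional net stands in for the real segmentation model, and the checkpoint path and learning rate are illustrative, not taken from our training code:

```python
import torch
import torch.nn as nn

# Toy stand-in for the segmentation model; the fine-tuning mechanics
# (save weights, rebuild, load, train with a lower LR) are the same.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 13, 1))

# 1. Pre-train on the weakly supervised labels, then save the weights
#    ("weak_pretrain.pt" is a hypothetical path).
torch.save(model.state_dict(), "weak_pretrain.pt")

# 2. Later: rebuild the same architecture and load the pre-trained weights.
finetune_model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 13, 1))
finetune_model.load_state_dict(torch.load("weak_pretrain.pt"))

# 3. Fine-tune on the precise chips with a lower learning rate, so the
#    new labels refine rather than overwrite the pre-trained features.
optimizer = torch.optim.Adam(finetune_model.parameters(), lr=1e-4)
```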
Data Distribution
```
0: "other",
1: "Bosque",
2: "Selvas",
3: "Pastos",
4: "Agricultura",
5: "Urbano",
6: "Sin vegetación aparente",
7: "Agua",
8: "Matorral",
9: "Suelo desnudo",
10: "Plantaciones",
11: "Otras coberturas",
12: "Vegetación caducifolia",
```
The numbers in the diagram represent the number of pixels for each LULC class in that particular corridor. As we can see from the figure, there is severe class imbalance across all the corridors, with Bosque, Selvas & Agricultura dominating in most cases.
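One common mitigation for this kind of imbalance is inverse-frequency class weights fed to a weighted loss. A sketch with hypothetical per-corridor pixel counts (the real counts come from the diagram above):

```python
import numpy as np

# Hypothetical per-class pixel counts for one corridor; the index
# matches the LULC class ids above (1: Bosque, 2: Selvas, ...).
pixel_counts = np.array([5e6, 4e7, 3e7, 8e6, 2.5e7, 2e6, 1e6,
                         3e6, 6e6, 1.5e6, 2e6, 5e5, 8e5])

# Inverse-frequency weights, normalised so they average to 1 when
# weighted by pixel frequency. Passing these to a weighted loss
# (e.g. torch.nn.CrossEntropyLoss(weight=...)) keeps the dominant
# Bosque/Selvas/Agricultura classes from swamping the rare ones.
weights = pixel_counts.sum() / (len(pixel_counts) * pixel_counts)
```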
A few things to consider while training:
Initial PEARL Model for Reforestamos
PEARL models for NAIP imagery were built on top of PyTorch & used segmentation architectures like UNet, FCN & DeepLab.
I am building the baseline model using PyTorch & PyTorch Lightning, which takes care of both the science & engineering sides of things. We have to write less boilerplate code, and things like storing model checkpoints, logging loss curves, tracking metrics etc. come for free. We can also scale the model to run on single/multiple CPUs/GPUs/TPUs without any additional effort.
Update as of 30 Jan 2023
We have a segmentation model that is trained on a single corridor with weakly supervised labels coming from the RM team.
Architecture - Unet
Backbone - EfficientNet-B0 pre-trained on ImageNet
Epochs - 10
Dataset - 1700 chips for training & ~400 chips for testing (with LULC labels from RM)
Loss - Dice Loss 0.47
Score - Jaccard Index 0.6
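For reference, the two metrics above can be computed from a pair of integer label masks as below (a NumPy sketch; the training code itself uses PyTorch losses):

```python
import numpy as np

def jaccard_and_dice(pred, target, num_classes=13):
    """Mean per-class Jaccard index (IoU) and Dice score for integer-label
    masks; classes absent from both masks are skipped."""
    jaccard, dice = [], []
    for c in range(num_classes):
        p, t = pred == c, target == c
        inter = np.logical_and(p, t).sum()
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both masks
        jaccard.append(inter / union)
        dice.append(2 * inter / (p.sum() + t.sum()))
    return float(np.mean(jaccard)), float(np.mean(dice))
```

Dice loss is simply 1 − Dice score, and for a single class the two metrics are related by Jaccard = Dice / (2 − Dice).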
Here are some sample results:
Image (Color Corrected), Ground Truth Mask, Predicted Mask, Image overlay with mask
We have a baseline model, a DeepLabv3+ with a timm-efficientnet-b5 backbone and a weighted F1 score of 0.78, currently deployed as Mexico LULC pre-alpha in the PEARL backend. This model also handles the issues mentioned in #47 by using color-based augmentations.
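The idea behind the color-based augmentations can be illustrated with a minimal sketch: a random per-channel gain and shift applied to a float chip in [0, 1]. The deployed model's exact augmentation set is not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(0)

def color_jitter(chip, max_shift=0.1, max_gain=0.2):
    """Minimal color augmentation sketch: apply a random per-channel
    gain and shift to an HxWx3 float chip in [0, 1], then clip back
    into range. Exposing the model to such color variation makes it
    less sensitive to mosaic-to-mosaic color differences."""
    gain = 1 + rng.uniform(-max_gain, max_gain, size=(1, 1, 3))
    shift = rng.uniform(-max_shift, max_shift, size=(1, 1, 3))
    return np.clip(chip * gain + shift, 0.0, 1.0)
```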
Clouds are creating confusion for the model. This is understandable: we filtered for mosaics with no cloud cover & trained our model on those, so it has never seen clouds. This can be fixed by:
Edge effects get introduced because the model only looks at a very small patch of the imagery. Our ground truths are representations of what an area looks like, not an exact pixel match for the classes; the model learns from surrounding pixels & infers the result. When we constrain that to just 256x256 tiles, it sometimes doesn't have enough information, which creates the edge effects; look at the red pixels at the bottom left of the model prediction mask.
Few ways to handle this:
Improve model retraining workflow
Infer on larger tiles (should be easy to implement)
Retrain model to improve accuracy
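One way to infer on larger tiles while also suppressing the edge effects is overlapped sliding-window inference, keeping only each chip's interior. A sketch, with a hypothetical `predict_chip` callable standing in for the model:

```python
import numpy as np

def predict_large(image, predict_chip, chip=256, overlap=32):
    """Sliding-window inference sketch (assumes image >= chip x chip).
    Chips overlap by `overlap` pixels on each side, and only each chip's
    interior is written out, discarding the border pixels where edge
    effects appear. `predict_chip` is a hypothetical model call mapping
    an HxWxC chip to an HxW integer label mask."""
    h, w = image.shape[:2]
    out = np.zeros((h, w), dtype=np.int64)
    step = chip - 2 * overlap
    # Regular grid of chip origins, plus extras to cover the far edges.
    ys = sorted(set(list(range(0, h - chip + 1, step)) + [h - chip]))
    xs = sorted(set(list(range(0, w - chip + 1, step)) + [w - chip]))
    for y0 in ys:
        for x0 in xs:
            mask = predict_chip(image[y0:y0 + chip, x0:x0 + chip])
            # Keep the interior; extend to the image border for edge chips.
            y_lo = 0 if y0 == 0 else overlap
            y_hi = chip if y0 + chip == h else chip - overlap
            x_lo = 0 if x0 == 0 else overlap
            x_hi = chip if x0 + chip == w else chip - overlap
            out[y0 + y_lo:y0 + y_hi, x0 + x_lo:x0 + x_hi] = mask[y_lo:y_hi, x_lo:x_hi]
    return out
```

The interiors of adjacent chips tile the image exactly when the step is `chip - 2 * overlap`, so every output pixel comes from a prediction made with context on all sides (except at the image border, where there is no extra context to use).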
@developmentseed/pearl
@srmsoumya What are your thoughts about closing this ticket? I think we managed to achieve most of what you outlined as improvements. We can revise/reopen based on feedback.
For our Sentinel release, we'll create a starter model based on priority AOIs for Reforestamos.