This repo contains scripts to manage training data, a workflow to create the Azure ML stack, and tooling to train new models that are compatible with the PEARL Platform. It is based on the work of Caleb Robinson of Microsoft.
- Monitor experiments and training runs on Azure ML
- Training Repo
- DeepLabV3+ architecture + focal loss seems to be the most promising approach
- How/Why we create Seed Data
There are two options to create the training dataset.
Option 1: Feed LULC label data in GeoTIFF format.
`naip-label-align.py` and `NAIPTileIndex.py` provide the functions for this step: they align the LULC label GeoTIFFs with the overlapping NAIP tiles and write the resulting training chips out as CSVs. These CSVs can be deployed to AML to direct model training; instructions are given in the following section.

Note: `rtree` requires the `libspatialindex` system library, which is not installed automatically. On macOS it can be installed with `brew install spatialindex`.

For example:
```bash
python naip-label-align.py \
    --label_tif_path sample.tif \
    --out_dir <dir-name>/ \
    --threshold [0.0 to 1.0] \
    --aoi <aoi-name> \
    --group <group-name>
```
Option 2: LULC labels are available as GeoJSON (vector) files, and rasterization is required.
First, NAIP imagery that overlaps the LULC label data needs to be downloaded before the rasterization task. `naip_download_pc.ipynb` provides the script and documentation on how to download NAIP imagery for your AOI from the Microsoft Planetary Computer.
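For orientation, querying the Planetary Computer STAC API for NAIP items over an AOI can be sketched as below; the bounding box and date range are placeholder assumptions, and the notebook remains the authoritative workflow.

```python
# Sketch: list NAIP items over an AOI via the Planetary Computer STAC API.
import planetary_computer
from pystac_client import Client

catalog = Client.open(
    "https://planetarycomputer.microsoft.com/api/stac/v1",
    modifier=planetary_computer.sign_inplace,  # sign asset URLs for download
)
search = catalog.search(
    collections=["naip"],
    bbox=[-86.35, 39.60, -85.90, 39.95],  # placeholder AOI
    datetime="2016-01-01/2019-12-31",     # placeholder date range
)
for item in search.items():
    print(item.id, item.assets["image"].href)  # the GeoTIFF asset to fetch
```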
Second, the LULC label rasterization functions and steps are provided in `label_rasterize.ipynb`. The layers are rasterized in the following order, listed from top (burned last) to bottom (burned first): `other_impervious` is rasterized first and `tree_canopy` is burned on top of all other layers (see the sketch after the list).
1. tree_canopy
2. building
3. water
4. bare_soil
5. roads_railroads
6. grass_shrub
7. other_impervious
See the notebook for details.
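To make the burn order concrete, here is a minimal layered-rasterization sketch with `rasterio`; the GeoJSON file names, class values, and the NAIP tile used to define the output grid are illustrative assumptions, not the notebook's actual code.

```python
# Sketch: burn vector LULC layers into one raster, bottom layer first,
# so that later (higher-priority) classes overwrite earlier ones.
import geopandas as gpd
import numpy as np
import rasterio
from rasterio import features

BURN_ORDER = [  # (illustrative file stem, class value), burned in this order
    ("other_impervious", 7),
    ("grass_shrub", 6),
    ("roads_railroads", 5),
    ("bare_soil", 4),
    ("water", 3),
    ("building", 2),
    ("tree_canopy", 1),  # burned last, ends up on top
]

with rasterio.open("naip_tile.tif") as src:  # NAIP tile defines the grid
    grid_crs, grid_transform = src.crs, src.transform
    height, width = src.height, src.width

out = np.zeros((height, width), dtype=np.uint8)
for stem, value in BURN_ORDER:
    gdf = gpd.read_file(f"{stem}.geojson").to_crs(grid_crs)
    features.rasterize(
        ((geom, value) for geom in gdf.geometry),
        out=out,  # in-place: later layers overwrite earlier ones
        transform=grid_transform,
    )

with rasterio.open(
    "labels.tif", "w", driver="GTiff", height=height, width=width,
    count=1, dtype="uint8", crs=grid_crs, transform=grid_transform,
) as dst:
    dst.write(out, 1)
```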
If you are going to use AML to train LULC models for the first time, please go through these steps.
This code was tested using Python 3.6.5. Create a conda environment using the `.pytorch-env.yaml` file and execute the scripts from the created environment.
You will need to set the following variables in your `.env` file:
```bash
AZ_TENANT_ID=XXX         # az account show --output table
AZ_SUB_ID=XXX            # az account list --output table
AZ_WORKSPACE_NAME=XXX    # user set
AZ_RESOURCE_GROUP=XXX    # user set
AZ_REGION=XXX            # user set
AZ_GPU_CLUSTER_NAME=XXX  # user set
AZ_CPU_CLUSTER_NAME=XXX  # user set
```
Then export all variables to your environment:
```bash
export $(cat .env);
```
After exporting your Azure credentials, run `python train_azure/create_workspace.py`; this script will create the AML workspace.
It will also create GPU compute resources in your workspace on AML.
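For orientation, a workspace-plus-cluster bootstrap with the Azure ML SDK v1 looks roughly like the sketch below; the VM size and node counts are illustrative assumptions, and `create_workspace.py` remains the authoritative implementation.

```python
# Sketch: create an AML workspace and a GPU cluster from the .env settings.
import os

from azureml.core import Workspace
from azureml.core.compute import AmlCompute, ComputeTarget

ws = Workspace.create(
    name=os.environ["AZ_WORKSPACE_NAME"],
    subscription_id=os.environ["AZ_SUB_ID"],
    resource_group=os.environ["AZ_RESOURCE_GROUP"],
    location=os.environ["AZ_REGION"],
    exist_ok=True,
)

gpu_config = AmlCompute.provisioning_configuration(
    vm_size="Standard_NC6",  # illustrative GPU SKU
    min_nodes=0,             # scale to zero when idle
    max_nodes=1,
)
cluster = ComputeTarget.create(ws, os.environ["AZ_GPU_CLUSTER_NAME"], gpu_config)
cluster.wait_for_completion(show_output=True)
```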
We have three PyTorch-based semantic segmentation models ready for LULC model training: FCN, UNet, and DeepLabV3+.
To train a model on AML, you will need to define or pass a few crucial parameters to the script, for instance:
TODO: Will we be providing a sample CSV?
```python
from azureml.core import ScriptRunConfig

config = ScriptRunConfig(
    source_directory="./src",
    script="train.py",
    compute_target=AZ_GPU_CLUSTER_NAME,  # GPU cluster name from your .env
    arguments=[
        "--input_fn", "sample_data/indianapolis_train.csv",
        "--input_fn_val", "sample_data/indianapolis_val.csv",
        "--output_dir", "./outputs",
        "--save_most_recent",
        "--num_epochs", 20,
        "--num_chips", 200,
        "--num_classes", 7,
        "--label_transform", "uvm",
        "--model", "deeplabv3plus",
    ],
)
```
These parameters are to be configured by the user. The `input_fn` and `input_fn_val` paths should be provided by the user and are the outputs of the data generation step (NAIP label align) described above.
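Under the hood, submitting such a config is a standard AML pattern; a minimal sketch follows, assuming a `config.json` is available for `Workspace.from_config()` and using an illustrative experiment name (`run_model.py` is the actual entry point).

```python
# Sketch: submit the ScriptRunConfig above as an AML experiment run.
from azureml.core import Experiment, Workspace

ws = Workspace.from_config()  # or Workspace.get(...) with your .env values
run = Experiment(ws, "lulc-training").submit(config)  # illustrative name
run.wait_for_completion(show_output=True)
```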
```bash
python train_azure/run_model.py
```
To compute the global F1 and per-class F1 scores (written to CSV) from a trained model over the latest dataset, you can use this eval script as an example:
```bash
python train_azure/run_eval.py
```
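For reference, the kind of metrics the eval step reports can be reproduced with scikit-learn as sketched below; the toy arrays are placeholders, and the exact averaging used for the "global" score is defined by `run_eval.py`.

```python
# Sketch: global and per-class F1 on flattened label/prediction rasters.
import numpy as np
from sklearn.metrics import f1_score

y_true = np.array([1, 1, 2, 3, 3, 3])  # toy ground-truth pixels
y_pred = np.array([1, 2, 2, 3, 3, 1])  # toy predicted pixels

global_f1 = f1_score(y_true, y_pred, average="weighted")
per_class_f1 = f1_score(y_true, y_pred, average=None)  # one score per class
print(global_f1, per_class_f1)
```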
After the best-performing model is selected, seed data needs to be created to serve PEARL. Seed data consists of the model embedding layers from the trained model, which are used together with the user's input training data in a PEARL retraining session.
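The exact seed-data format is defined by the PEARL scripts, but the underlying idea of capturing an embedding layer can be sketched with a generic PyTorch forward hook; everything below, including the torchvision model standing in for the trained network, is illustrative.

```python
# Sketch: capture backbone embeddings from a segmentation model via a hook.
import torch
import torchvision

# Toy stand-in for the trained model (recent torchvision API assumed).
model = torchvision.models.segmentation.deeplabv3_resnet50(
    weights=None, num_classes=7
)
model.eval()

features = {}

def grab_embedding(module, inputs, output):
    # The torchvision backbone returns a dict; "out" is the feature map.
    features["embedding"] = output["out"].detach()

model.backbone.register_forward_hook(grab_embedding)

with torch.no_grad():
    model(torch.randn(1, 3, 256, 256))  # 3-band toy input

print(features["embedding"].shape)  # e.g. torch.Size([1, 2048, 32, 32])
```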
`run_seeddata_creation.py` will configure AML and use the main seed-data creation script to create seed data for the best-performing trained model:
```bash
python train_azure/run_seeddata_creation.py
```
The LULC class distribution is a graph showing the proportion of pixels in each LULC class for a trained model on PEARL. See the bar chart below.
`train_azure/run_cls_distrib.py` will guide you through computing the class distribution from the training dataset for the model:
```bash
python train_azure/run_cls_distrib.py
```
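The underlying computation is a pixel count per class; a minimal sketch over label GeoTIFFs is shown below, assuming classes are encoded as integers 0-6 and using an illustrative file list.

```python
# Sketch: per-class pixel proportions across label chips.
import numpy as np
import rasterio

NUM_CLASSES = 7
counts = np.zeros(NUM_CLASSES, dtype=np.int64)

for path in ["chip_0001_labels.tif"]:  # illustrative list of label chips
    with rasterio.open(path) as src:
        labels = src.read(1)
        counts += np.bincount(labels.ravel(), minlength=NUM_CLASSES)[:NUM_CLASSES]

distribution = counts / counts.sum()  # proportion of pixels per class
print(distribution)
```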