Microsoft Flood and Clouds benchmark dataset

We're implementing the Cloud to Street - Microsoft flood dataset as our first benchmark dataset, to be used for linear probing and for evaluating finetuning on a downstream task.

The dataset covers two of the three datacube inputs of our pretext model (Sentinel-1 and Sentinel-2), along with raster water mask labels for both sensors. The images are 512x512xC pixels. Ideally we would have used the images as is, but that wasn't possible because 1) the Sentinel-1 VV and VH images underwent RTC with a different DEM than the one used for the Sentinel-1 product in the Planetary Computer STAC catalog, and 2) the Sentinel-2 images were L1C top-of-atmosphere instead of L2A surface reflectance. Therefore, we created a redux of the original data pipeline used to create the pretext model's training data (see PR #75), which generates datacubes for the benchmark dataset from each chip's geospatial bounds and timestamp (taken from the granule name). The generated datacubes have all three inputs matching the exact specs of the pretext model's training data, at 512x512 pixels.
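As an illustration of taking the timestamp from the granule name, here is a minimal sketch. It is not the pipeline from PR #75: the function name `granule_start_time` is hypothetical, and it assumes the first `YYYYMMDDTHHMMSS` token in the name is the UTC sensing time, which holds for the example granule names in this issue.

```python
import re
from datetime import datetime, timezone

# Sentinel granule names embed UTC times as YYYYMMDDTHHMMSS tokens.
_TIMESTAMP = re.compile(r"\d{8}T\d{6}")


def granule_start_time(granule_name: str) -> datetime:
    """Return the first timestamp in a granule name as a UTC datetime.

    Hypothetical helper: assumes the first timestamp token is the sensing time.
    """
    match = _TIMESTAMP.search(granule_name)
    if match is None:
        raise ValueError(f"no timestamp found in {granule_name!r}")
    parsed = datetime.strptime(match.group(0), "%Y%m%dT%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)


name = "S2A_L2A_20160812T041552_N0204_R090_T46RDP_20160812T042138_07694-03092"
print(granule_start_time(name).isoformat())
```

The resulting datetime, together with the chip's bounding box, is the kind of input a STAC search would need. The search itself is left out because its parameters are not described here.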

The dataset lives on S3 at s3://clay-benchmark/c2smsfloods/, and the processed datacube dataset specifically is within s3://clay-benchmark/c2smsfloods/datacube/chips_512/. See the structure below for how the data is stored. Note: as of 12/8/2023, I hit a rate limit with my Planetary Computer API key (from heavy use today) and was blocked from generating more than 43 datacubes. I'll have to try again tomorrow to see if I can get past this.

s3://clay-benchmark/c2smsfloods/
└───chips (original benchmark images, used for obtaining bounding boxes)
└───chips_512
│   └───flood event ID
│       │   datacube geotiff ending in _rtc.tif
│       │   label geotiff (same basename) ending in _rtc_LabelWater.tif
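Given the naming convention in the structure above, each datacube can be matched to its label by the shared basename. The sketch below works on a plain list of object keys, so it makes no assumptions about how the bucket is listed. The function name and the example keys are hypothetical.

```python
DATACUBE_SUFFIX = "_rtc.tif"
LABEL_SUFFIX = "_rtc_LabelWater.tif"


def pair_datacubes_with_labels(keys):
    """Map each datacube key to the label key that shares its basename.

    Datacubes without a matching label are left out of the result.
    """
    # Index labels by the part of the key that precedes the label suffix.
    labels = {k[: -len(LABEL_SUFFIX)]: k for k in keys if k.endswith(LABEL_SUFFIX)}
    pairs = {}
    for key in keys:
        if key.endswith(DATACUBE_SUFFIX):
            base = key[: -len(DATACUBE_SUFFIX)]
            if base in labels:
                pairs[key] = labels[base]
    return pairs


# Hypothetical keys, for illustration only.
keys = [
    "chips_512/event_a/tile_001_rtc.tif",
    "chips_512/event_a/tile_001_rtc_LabelWater.tif",
    "chips_512/event_a/tile_002_rtc.tif",
]
print(pair_datacubes_with_labels(keys))
```

Returning only the complete pairs makes it easy to spot datacubes that are still missing labels, by comparing the result against the full key list.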

Here are some example benchmark datacubes:

S2A_L2A_20160812T041552_N0204_R090_T46RDP_20160812T042138_07694-03092_datacube
LabelWater_S2A_L2A_20160812T041552_N0204_R090_T46RDP_20160812T042138_07694-03092_example
S2A_L2A_20170628T055631_N0205_R091_T42SYB_20170628T060223_01031-07449_datacube
S1A_IW_GRDH_1SDV_20170628T011519_20170628T011544_017227_01CBD9_0966_01031-07449_example
S2A_L2A_20180919T102021_N0206_R065_T30PXR_20180919T142010_09316-08274_S1B_IW_GRDH_1SDV_20180918T181808_20180918T181833_012773_017943_rtc_datacube
S2A_L2A_20160812T041552_N0204_R090_T46RDQ_20160812T042138_10456-09525_S1A_IW_GRDH_1SDV_20160812T234651_20160812T234716_012574_013B43_rtc_datacube

The first linear probing and finetuning task will therefore be flood segmentation using this dataset. We'll implement a lightweight set of layers to achieve this and evaluate it using standard segmentation metrics (e.g. IoU, Dice, F1).
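For reference, the metrics named above can be computed from pixel counts as in the sketch below. It is a plain-Python illustration, not the planned evaluation code. It assumes binary masks flattened to sequences of 0 (not water) and 1 (water), and it scores two empty masks as 1.0, which is a convention chosen here. For binary masks, Dice and F1 are the same quantity.

```python
def segmentation_metrics(pred, target):
    """Compute IoU, Dice and F1 for two flat binary masks of 0/1 values."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    # Count true positives, false positives and false negatives for the water class.
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    union = tp + fp + fn
    # Assumed convention: if neither mask contains water, treat it as a perfect match.
    iou = tp / union if union else 1.0
    dice = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    return {"iou": iou, "dice": dice, "f1": dice}


pred = [1, 1, 0, 0]
target = [1, 0, 1, 0]
print(segmentation_metrics(pred, target))
```

Real label rasters may also carry nodata values. Those pixels would need to be masked out before calling a function like this one.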
