Clay-foundation / model

The Clay Foundation Model (in development)
https://clay-foundation.github.io/model/
Apache License 2.0

Benchmark option for non-Sentinel high resolution data #187

Closed brunosan closed 1 month ago

brunosan commented 3 months ago

This land cover classification competition seems like a great opportunity to test our model openly. It requires ingesting non-Sentinel, RGB-only data, so we are not there yet.

https://cliffbb.github.io/OEM-Fewshot-Challenge/


> * 97 regions from 44 countries across 6 continents at a spatial resolution of 0.25–0.5m ground sampling distance for global high-resolution land cover mapping
> * The 408 samples are also split into 258 as trainset, 50 as valset, and 100 as testset.
> File Structure and Content (All files are in `.tif` format):
-----------------------------------------------------------
1. **trainset.zip**:
    - Contains `images` and `labels` folders
    - `images` folder: 258 images of size 1024x1024 with a GSD (Ground Sampling Distance) of 0.6-1m.
    - `labels` folder: 258 segmentation masks of the images in the `images` folder.

2. **valset.zip**:
    - Contains `images` and `labels` folders
    - `images` folder: 50 images of size 1024x1024 with a GSD (Ground Sampling Distance) of 0.6-1m.
    - `labels` folder: 20 labels of the ``support set`` images in the `images` folder. The labels for
                       the 30 ``query set`` images in the `images` folder are withheld.
3. **testset.zip**:
    - Contains `images` and `labels` folders
    - `images` folder: 100 images of size 1024x1024 with a GSD (Ground Sampling Distance) of 0.6-1m.
    - `labels` folder: 20 labels of the ``support set`` images in the `images` folder. The labels for
                       the 80 ``query set`` images in the `images` folder are withheld.
4. **train.txt**:
    - Contains a list of file names in the `trainset.zip`.

5. **val.json** and **test.json**:
    - Contain lists of the file names in `valset.zip` and `testset.zip`, respectively. Below is
      the structure of the `val.json` and `test.json` files.
    - fnames = {
                 "support_set": {8: ["filename_1.tif", "filename_2.tif", ...., "filename_5.tif"],
                                 9: ["filename_1.tif", "filename_2.tif", ...., "filename_5.tif"],
                                10: ["filename_1.tif", "filename_2.tif", ...., "filename_5.tif"],
                                11: ["filename_1.tif", "filename_2.tif", ...., "filename_5.tif"]},
                 "query_set": ["filename_1.tif", "filename_2.tif", "filename_3.tif", ...
                               ....,
                               "filename_n.tif"]
               }
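
To make the split arithmetic concrete, here is a minimal sketch of how one might sanity-check a `val.json`-style file. It assumes the novel-class keys arrive as strings (JSON object keys always do) and that each novel class lists 5 support images, as the structure above suggests. The file names and the `summarize_split` helper are placeholders invented for illustration, not part of the challenge toolkit.

```python
import json


def summarize_split(fnames):
    """Return (shots per novel class, total support images, total query images)."""
    # JSON object keys are strings, so convert class ids back to integers.
    shots = {int(cls): len(files) for cls, files in fnames["support_set"].items()}
    n_support = sum(shots.values())
    n_query = len(fnames["query_set"])
    return shots, n_support, n_query


# Synthetic stand-in for val.json; the file names are placeholders, not real data.
fake_val = {
    "support_set": {
        str(cls): [f"img_{cls}_{i}.tif" for i in range(5)] for cls in (8, 9, 10, 11)
    },
    "query_set": [f"query_{i}.tif" for i in range(30)],
}

# Round-trip through JSON to mimic reading the file from disk.
fnames = json.loads(json.dumps(fake_val))
shots, n_support, n_query = summarize_split(fnames)
print(shots, n_support, n_query)
```

With 4 novel classes at 5 shots each, the support set comes to 20 images, which together with the 30 query images matches the 50 images described for `valset.zip`.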

Land Cover Mapping Classes Structure:
-------------------------------------
1. **The `trainset`**:
     classId2className = {
                          # ***Base classes***
                          1: 'tree',
                          2: 'rangeland',
                          3: 'bareland',
                          4: 'agric land type 1',
                          5: 'road type 1',
                          6: 'sea, lake, & pond',
                          7: 'building type 1'
                        }

2. **The `valset` and `testset`**:
     classId2className = {
                          # ***Base classes***
                          1: 'tree',
                          2: 'rangeland',
                          3: 'bareland',
                          4: 'agric land type 1',
                          5: 'road type 1',
                          6: 'sea, lake, & pond',
                          7: 'building type 1',
                          # ***Novel classes***
                          8: '',
                          9: '',
                          10: '',
                          11: ''
                        }

      - The class names for the ***Novel classes*** depend on the data set.
        For the `valset`, the class names can be updated as:
                        {
                          8: 'road type 2',
                          9: 'river',
                          10: 'boat & ship',
                          11: 'agric land type 2'
                        }
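
As a rough sketch of how the base and novel class tables above could be combined for the `valset`, the snippet below merges the two dictionaries and guards against overlapping ids. The names `BASE_CLASSES`, `VALSET_NOVEL_CLASSES`, and `build_class_map` are hypothetical and only serve this illustration. The `testset` novel names are not given here, so they are left out.

```python
# Base classes shared by the trainset, valset, and testset (copied from the README above).
BASE_CLASSES = {
    1: "tree",
    2: "rangeland",
    3: "bareland",
    4: "agric land type 1",
    5: "road type 1",
    6: "sea, lake, & pond",
    7: "building type 1",
}

# Novel classes for the valset only, as listed in the README above.
VALSET_NOVEL_CLASSES = {
    8: "road type 2",
    9: "river",
    10: "boat & ship",
    11: "agric land type 2",
}


def build_class_map(novel_classes):
    """Merge base and novel classes, refusing ids that would overwrite a base class."""
    overlap = BASE_CLASSES.keys() & novel_classes.keys()
    if overlap:
        raise ValueError(f"novel class ids collide with base classes: {sorted(overlap)}")
    return {**BASE_CLASSES, **novel_classes}


val_classes = build_class_map(VALSET_NOVEL_CLASSES)
print(len(val_classes))  # 11
print(val_classes[9])    # river
```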

yellowcap commented 1 month ago

Let's move these ideas to a discussion or a centralized place for choosing benchmarks and downstream applications.