stac-extensions / raster

Describes raster assets at band level (one or multiple) with specific information such as data type, unit, number of bits used, nodata.
Apache License 2.0

Add support for internal tiles #43

Open thomas-maschler opened 9 months ago

thomas-maschler commented 9 months ago

We are working on an update of the GDAL STACTA (tiled-assets) driver. Currently, the driver has to fetch the metadata of at least one tiled asset to construct the full metadata for the gridded dataset. This becomes an issue when working with sparse grids where the 0/0 tile is not present: the driver wouldn't know which other file to choose. The PR addresses this and makes use of raster:band and eo:band metadata to avoid this call in the first place. However, some of the required information is currently not covered by any of the extensions. In particular, the internal tile size (or block size in GDAL slang) is missing.

I'd like to propose a new field in the raster band object to support this use case.

There are various possible flavors, and I would like to seek some feedback prior to submitting a PR.

I could imagine the following:

A tuple describing the shape of the internal tiles:

{
    ...,
    "internal_tile": {
        "type": "array",
        "title": "Internal tile shape",
        "description": "A tuple (width/height), specifying the internal block/ strip/ chunk shape of the raster band.",
        "prefixItems": [
            {
                "type": "integer"
            },
            {
                "type": "integer"
            }
        ]
    }
}
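
To make the tuple flavor concrete, here is a minimal Python sketch of how a reader might consume such a field. The band contents and the helper name `parse_internal_tile` are hypothetical, and the check is deliberately stricter than the schema above (which, as written, does not constrain the array length), so treat it as an illustration under those assumptions rather than a reference implementation.

```python
import json

# Hypothetical band metadata using the proposed tuple form (width, height).
band = json.loads('{"data_type": "uint8", "internal_tile": [512, 256]}')


def parse_internal_tile(value):
    """Return (width, height) if value is a two-item list of integers."""
    if not isinstance(value, list) or len(value) != 2:
        raise ValueError("internal_tile must be a two-item array")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not all(isinstance(v, int) and not isinstance(v, bool) for v in value):
        raise ValueError("internal_tile items must be integers")
    width, height = value
    return width, height


block_width, block_height = parse_internal_tile(band["internal_tile"])
print(block_width, block_height)
```

The positional form is compact, but the meaning of each position (width first, then height) has to be documented, since nothing in the value itself says which is which.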

Or, to be more explicit, an object. This option would allow adding further properties, such as whether or not the blocks are stored sparsely.

{
    ...,
    "internal_tile": {
        "type": "object",
        "description": "This object allows to specify the internal tiling of a raster band, describing the block/ strip/ chunk shape and other storage information.",
        "required": [
            "width",
            "height"
        ],
        "properties": {
            "width": {
                "type": "integer",
                "description": "The width in pixels of the internal block/ strip/ chunk."
            },
            "height": {
                "type": "integer",
                "description": "The height in pixels of the internal block/ strip/ chunk."
            },
            "sparse": {
                "type": "boolean",
                "description": "Indicates whether or not bands are stored as sparse files. Sparse files are only compatible with the GDAL driver."
            }
        }
    }
}
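
For comparison, the object flavor could be read along these lines. This is only a sketch: the class `InternalTile` and the helper `internal_tile_from_dict` are hypothetical names, and the only rules enforced are the ones stated in the schema above (`width` and `height` required, `sparse` optional).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InternalTile:
    width: int                     # block width in pixels
    height: int                    # block height in pixels
    sparse: Optional[bool] = None  # optional in the proposed schema


def internal_tile_from_dict(obj: dict) -> InternalTile:
    """Build an InternalTile, enforcing the schema's required keys."""
    missing = [key for key in ("width", "height") if key not in obj]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    return InternalTile(obj["width"], obj["height"], obj.get("sparse"))


tile = internal_tile_from_dict({"width": 256, "height": 256, "sparse": True})
print(tile)
```

Named keys make the value self-describing and leave room for further storage properties, at the cost of a slightly more verbose payload.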

thomas-maschler commented 2 months ago

Discussed during the STAC community meeting on Jul 1st, 2024:

The preferred option is option 1, which describes everything at the extension's top level, i.e. two separate attributes: one for the block size and the other for sparse. Instead of the name internal_tiles, using block_size is better, as it aligns with GDAL jargon.

block_size should only describe 2D chunks. The raster extension is meant for rasters only. If there is a need to describe multi-dimensional arrays with n-dimensional chunks, the nd-array extension will be better suited.
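
As a rough illustration of what a 2D block_size buys a reader, the sketch below derives the block grid of a raster from metadata alone, without opening any tile. Several things here are assumptions rather than agreed spec: the attribute name `sparse`, the placement of both attributes on a band object, the (width, height) ordering carried over from the original tuple proposal, and the helper name `block_grid`.

```python
# Hypothetical band metadata after the meeting outcome: two separate attributes.
band = {"data_type": "uint16", "block_size": [512, 512], "sparse": True}


def block_grid(raster_width, raster_height, block_size):
    """Return (columns, rows) of blocks needed to cover the raster.

    Partial blocks at the right and bottom edges count as full blocks,
    hence the ceiling division.
    """
    block_width, block_height = block_size
    if block_width <= 0 or block_height <= 0:
        raise ValueError("block dimensions must be positive")
    columns = -(-raster_width // block_width)   # integer ceiling division
    rows = -(-raster_height // block_height)
    return columns, rows


# A 1000 x 600 pixel raster with 512 x 512 blocks needs a 2 x 2 block grid.
print(block_grid(1000, 600, band["block_size"]))
```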