radiantearth / stac-spec

SpatioTemporal Asset Catalog specification - making geospatial assets openly searchable and crawlable
https://stacspec.org
Apache License 2.0
788 stars 179 forks

Mosaic / composite item type #150

Closed mojodna closed 1 year ago

mojodna commented 6 years ago

Grouped images should be representable within a STAC Catalog. These may range from multiple parts of a DigitalGlobe strip (multiple assets that are logically and effectively treated as a single asset, alongside additional assets such as metadata and thumbnails) to a curated collection of items that an entity wishes to share.

Properties of an element within such a collection of assets:

In the case of components of a DG strip, this would be included in the list of assets (ideally with some indication that it should be used in preference to individual components, perhaps using q values: "Quality factors allow the user or user agent to indicate the relative degree of preference for that media-range, using the qvalue scale from 0 to 1 (section 3.9). The default value is q=1.")
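A minimal sketch of how q values could express that preference, assuming each asset carries a hypothetical `q` field (not part of STAC) that defaults to 1.0 as in HTTP:

```python
def rank_assets(assets):
    """Return asset keys ordered from most to least preferred.

    `assets` maps an asset key to a dict. The "q" field is a hypothetical
    preference weight in [0, 1]; a missing value defaults to 1.0.
    Python's sort is stable, so ties keep their original order.
    """
    return sorted(assets, key=lambda key: -float(assets[key].get("q", 1.0)))


# Hypothetical item: the full strip is preferred over its individual parts.
assets = {
    "part-1": {"href": "part-1.tif", "q": 0.5},
    "strip": {"href": "strip.tif"},
    "part-2": {"href": "part-2.tif", "q": 0.5},
}
ranked = rank_assets(assets)
```

A client that understands `q` would try `strip` first and fall back to the parts; a client that ignores it still sees every asset.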

A curated collection of items potentially equates to a STAC Item in terms of usage, so this is something to reconcile.

Many of these things provide hints for display purposes and aren't necessarily descriptors for the item itself, so that also merits consideration.

The combination of resolution ranges and quad keys allows a tiler to know when a given asset should be included (and in what order) in a composite image.
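A rough sketch of that selection logic, assuming each source is described by hypothetical `minzoom`, `maxzoom`, and `quadkeys` fields and that list order doubles as draw order:

```python
def sources_for_tile(sources, tile_quadkey):
    """Return the names of sources to composite for one tile, in list order.

    The tile's zoom level is the length of its quadkey. A source is included
    when the zoom falls inside its range and one of its quadkeys is an
    ancestor of, equal to, or a descendant of the requested tile.
    """
    zoom = len(tile_quadkey)
    selected = []
    for source in sources:
        if not source["minzoom"] <= zoom <= source["maxzoom"]:
            continue
        if any(
            tile_quadkey.startswith(qk) or qk.startswith(tile_quadkey)
            for qk in source["quadkeys"]
        ):
            selected.append(source["name"])
    return selected


# Hypothetical sources; names and ranges are made up for illustration.
sources = [
    {"name": "a", "minzoom": 0, "maxzoom": 12, "quadkeys": ["0231"]},
    {"name": "b", "minzoom": 10, "maxzoom": 18, "quadkeys": ["0231", "0232"]},
    {"name": "c", "minzoom": 0, "maxzoom": 18, "quadkeys": ["13"]},
]
```

Here a zoom 5 tile under `0231` would pull in only `a`, while a zoom 12 tile under the same prefix would pull in `a` and then `b`.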

We envision this as a small-ish JSON file that is HTTP-accessible and can be used in place of a COG URL (or POSTed) with something like tiles.rdnt.io.

/cc @sharkinsspatial

matthewhanson commented 6 years ago

@mojodna Most of these fields look to me like they could apply to any EO data, for example NODATA and gain/offset values. This is actually a problem right now when dealing with Landsat data, since each band has different gains/offsets to get to TOA' (note: not true TOA, but TOA without sun angle correction), and that info is only available through the MTL metadata file, not in the STAC item.

simonff commented 6 years ago

Yes, none of these are mosaic-specific. We will add them to the raster extension(s) of the Dataset spec.

@mojodna : What's a sample use case for quality factors?

@matthewhanson : For reference, in EE almost all properties from the MTL file are stored in each Landsat asset's metadata, and thus TOA can be computed on the fly. In Collection 1 average sun angle can be taken into account: https://landsat.usgs.gov/using-usgs-landsat-8-product

However, it's not clear if storing them in the STAC catalog is necessary if STAC is intended just for listing/retrieving/visualizing assets and not for providing input for computations.

matthewhanson commented 6 years ago

@simonff The problem we have run into is that, in the case of Landsat, you need to apply the gains and offsets in order to visualize it. It is easier if those gains and offsets are in the STAC record rather than in the data file, because otherwise you need to read the metadata from the file headers, which is more overhead (we are reading just windowed pieces of the files remotely from S3, so the overhead of reading additional metadata is not small).
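To illustrate the point: if the per-band gain and offset were carried in the STAC record, a client could rescale a window of pixel values without any extra metadata read. This is only a sketch; the `gain`, `offset`, and `nodata` parameters and the sample numbers are assumptions, not fields defined by STAC.

```python
def apply_gain_offset(dn_values, gain, offset, nodata=0):
    """Linearly rescale raw digital numbers: value = gain * DN + offset.

    Pixels equal to `nodata` are returned as None so that they are not
    mistaken for valid measurements.
    """
    return [None if dn == nodata else gain * dn + offset for dn in dn_values]


# Illustrative values only; real gains and offsets vary per band and scene.
window = [0, 10000, 20000]
scaled = apply_gain_offset(window, gain=2.0e-5, offset=-0.1)
```

The result is TOA' in the sense used above, that is, still without the sun angle correction.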

In the case of sun angle for Landsat there are two problems: 1 - This special case means writing processing code just for Landsat, whereas if the data were already in TOA reflectance (or surface reflectance) you could use the same processing code as for Sentinel and other sensors. We're currently working with USGS and pushing for them to distribute it as such, because right now many people are using Landsat data incorrectly by not correcting it. 2 - While you can use the average sun angle (i.e. the scene center angle), it is not ideal when you visualize two adjacent rows in the same path: you will see an artifact at the scene border. The sun angle really should be calculated per pixel and applied as an array.
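A small sketch of the difference between the two approaches, assuming the usual correction of dividing TOA' by the sine of the sun elevation angle. The function names and sample angles are made up for illustration.

```python
import math


def correct_sun_angle(toa_prime, sun_elevation_deg):
    """Convert TOA' to TOA reflectance for a single sun elevation angle."""
    if sun_elevation_deg <= 0:
        raise ValueError("sun elevation must be above the horizon")
    return toa_prime / math.sin(math.radians(sun_elevation_deg))


def correct_per_pixel(toa_prime_row, elevation_row):
    """Apply the correction with one sun elevation value per pixel."""
    return [correct_sun_angle(r, e) for r, e in zip(toa_prime_row, elevation_row)]


# Two adjacent scenes with different scene-center elevations (assumed values).
# The same ground pixel on the shared border gets two different results,
# which is the visible seam described above.
border_pixel = 0.1
from_scene_north = correct_sun_angle(border_pixel, 30.0)
from_scene_south = correct_sun_angle(border_pixel, 32.0)

# With a per-pixel angle array both scenes would use the same local angle.
row = correct_per_pixel([0.1, 0.1], [31.0, 31.0])
```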

matthewhanson commented 6 years ago

@simonff Also, while some of these fields do apply to the dataset as a whole, some of them (such as gain/offset) would be per Item as they can change across scenes.

vincentsarago commented 5 years ago

👋 @mojodna @matthewhanson I'd love to see this moving.

About the proposed properties, IMO (and for my use cases) the most important is to have the zoom range and the quadkey coverage for each item.

A few comments about the proposed items:

Not sure what it means

👍 (or resolution + number of overviews, if present)

👍 Quadkey is perfect. Having the full list of quadkeys might be a bit expensive (in processing/storage/response), so maybe the list of quadkeys at the lowest resolution.

😐 I see this as optional and implementation-specific IMO.

😐 I see this as optional and implementation-specific IMO.

😐 I see this as optional and implementation-specific IMO.

😐 I see this as optional and implementation-specific IMO.

👍
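For the quadkey coverage suggested above, a minimal sketch of how a tile address maps to a quadkey and how a full list can be collapsed to a coarser zoom to keep it small. The helper names are hypothetical; the bit interleaving follows the usual quadkey scheme for XYZ tiles.

```python
def tile_to_quadkey(x, y, zoom):
    """Encode an XYZ tile address as a quadkey string of length `zoom`."""
    digits = []
    for level in range(zoom, 0, -1):
        mask = 1 << (level - 1)
        digit = 0
        if x & mask:
            digit += 1
        if y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


def coarsen(quadkeys, zoom):
    """Collapse quadkeys to their ancestors at `zoom`, dropping duplicates."""
    return sorted({qk[:zoom] for qk in quadkeys})


quadkey = tile_to_quadkey(3, 5, 3)
coverage = coarsen(["0231", "0232", "0310"], 2)
```

Storing only the coarse list trades precision for size: a tiler may occasionally open an item that turns out not to cover the requested tile.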

With our recent work on COG mosaics (https://medium.com/devseed/cog-talk-part-2-mosaics-bbbf474e66df) we use intermediate quadkey index files to link a tile request to a COG, so having quadkey info directly in the STAC metadata would make it easier to create those.
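A sketch of what building such an index from item metadata might look like, assuming each item lists its quadkeys at a single fixed index zoom. The structure and file names are assumptions for illustration, not the format used in the linked post.

```python
def build_index(items):
    """Invert {cog_url: [quadkey, ...]} into {quadkey: [cog_url, ...]}."""
    index = {}
    for url, quadkeys in items.items():
        for qk in quadkeys:
            index.setdefault(qk, []).append(url)
    return index


def cogs_for_tile(index, tile_quadkey, index_zoom):
    """Look up the COGs for a tile at or below the index zoom level.

    Requests coarser than `index_zoom` are not handled by this sketch and
    return an empty list.
    """
    return index.get(tile_quadkey[:index_zoom], [])


# Hypothetical items with quadkey coverage at zoom 4.
items = {"a.tif": ["0231", "0232"], "b.tif": ["0232"]}
index = build_index(items)
```

A zoom 6 request such as `023211` is truncated to `0232` and resolves to both files.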

palmerj commented 4 years ago

Any more interest in something like this?

I'm interested in an extension that provides for a collection of GeoTIFF files that make up a mosaic dataset. Looking at the existing data model, I think the existing STAC Collection almost provides this. I would be interested in the following additional fields:

Collections already support the following fields I need:

palmerj commented 4 years ago

I would also be interested in having a geometry for the footprint of the mosaic dataset, but understand that might not be possible with Collections not being GeoJSON features.

palmerj commented 4 years ago

Also just came across this catalog layout best practise:

Items should be stored in subdirectories of their parent catalog. This means that each item and its assets are contained in a unique subdirectory

I think that for mosaics it is best that tile items are not stored in subdirectories. I understand this practise for very large catalogues of datasets that contain one TIFF file per band.

palmerj commented 4 years ago

Anyone interested in this?

m-mohr commented 4 years ago

@palmerj From previous work on other extensions, it's often a good idea to just start a draft and put it up as PR. Afterwards, we can get people to review it, asking explicitly for help from domain experts and asking for help on Twitter. Asking here may not get you enough attention.

m-mohr commented 1 year ago

There's an extension now: https://github.com/stac-extensions/composite Please continue the discussion there.