HLS STAC metadata with renders parameters

abarciauskas-bgse commented 7 months ago

Background + Motivation

We want to have an example of how to use this UI and titiler-cmr with raster data, and HLS is a good candidate for that. This issue is to determine what the STAC metadata should look like for the following collections in CMR.

The STAC metadata only needs to be for the collection, as titiler-cmr will query CMR for granules using the collection concept ID. The STAC metadata should include all relevant metadata from the collection as published in CMR and some options for the renders extension.

Assumptions

We don't need to include the band information in the collection itself, but the renders object will refer to the bands in each granule.

Collection + Item references and examples

HLSL30

HLSL30 on Earthdata Search

Here is an example of an L30 granule's STAC metadata to get a sense of what bands are available.

HLSS30

HLSS30 on Earthdata Search

Here is an example of an S30 granule's STAC metadata.

https://nasa-impact.github.io/veda-docs/notebooks/quickstarts/hls-visualization.html gives us some existing examples:

S30 NDVI

s30_vegetation_index_expression = "(B08_b1-B04_b1)/(B08_b1+B04_b1)"
s30_vegetation_index_rescaling = "0,1"
s30_vegetation_index_colormap = "rdylgn"

L30 NDWI

l30_ndwi_expression = "(B03_b1-B05_b1)/(B03_b1+B05_b1)"
l30_ndwi_assets = ["B03", "B05"]
l30_ndwi_rescaling = "0,1"
l30_ndwi_colormap = "spectral"
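As a rough sketch of how the notebook parameters above could map onto `renders` entries, the snippet below builds one dictionary per collection. The field names (`assets`, `expression`, `rescale`, `colormap_name`, `title`) follow my reading of the render extension; the render keys (`ndvi`, `ndwi`), the titles, and the `parse_rescale` helper are hypothetical, and the S30 asset list is inferred from the expression rather than taken from the notebook.

```python
import json


def parse_rescale(value: str) -> list[list[float]]:
    """Turn a titiler-style "min,max" string into a list of [min, max] pairs."""
    parts = [float(p) for p in value.split(",")]
    return [parts[i:i + 2] for i in range(0, len(parts), 2)]


# Hypothetical `renders` object for the HLSS30 collection.
s30_renders = {
    "ndvi": {
        "title": "NDVI",  # illustrative title
        "assets": ["B08", "B04"],  # inferred from the expression below
        "expression": "(B08_b1-B04_b1)/(B08_b1+B04_b1)",
        "rescale": parse_rescale("0,1"),
        "colormap_name": "rdylgn",
    }
}

# Hypothetical `renders` object for the HLSL30 collection.
l30_renders = {
    "ndwi": {
        "title": "NDWI",  # illustrative title
        "assets": ["B03", "B05"],
        "expression": "(B03_b1-B05_b1)/(B03_b1+B05_b1)",
        "rescale": parse_rescale("0,1"),
        "colormap_name": "spectral",
    }
}

print(json.dumps({"renders": l30_renders}, indent=2))
```

Each dictionary would sit under the `renders` key of the corresponding collection document; whether titiler-cmr accepts these fields unchanged is something we would still need to confirm.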

References:

https://github.com/stac-extensions/render
https://github.com/developmentseed/stac-explorer/blob/main/notebooks/create-stac-for-cmr-collections.ipynb provides an example of extracting collection metadata from CMR into a STAC representation.

Questions we need to answer to produce `renders` metadata:

Do any of the non-B* bands need render options on their own?
Are there already default color maps and rescale values for existing bands or indices?
What indices should we include? I would guess real color, false color, NDVI and NDWI
- real color: red + green + blue
- false color: ?
- NDVI: (NIR - Red) / (NIR + Red)
- NDWI: (Green - NIR) / (Green + NIR) or (NIR - SWIR) / (NIR + SWIR)
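All of the index formulas listed above share the same normalized-difference form, so a minimal sketch can express them with one helper. The function names are hypothetical, the inputs are assumed to be single reflectance values, and returning NaN for a zero denominator is a design choice of this sketch, not something the thread specifies.

```python
def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or NaN when the denominator is zero."""
    total = a + b
    if total == 0:
        return float("nan")
    return (a - b) / total


def ndvi(nir: float, red: float) -> float:
    """NDVI: (NIR - Red) / (NIR + Red)."""
    return normalized_difference(nir, red)


def ndwi_green_nir(green: float, nir: float) -> float:
    """NDWI, first variant: (Green - NIR) / (Green + NIR)."""
    return normalized_difference(green, nir)


def ndwi_nir_swir(nir: float, swir: float) -> float:
    """NDWI, second variant: (NIR - SWIR) / (NIR + SWIR)."""
    return normalized_difference(nir, swir)


# Illustrative reflectance values only, not real HLS data.
print(ndvi(nir=0.5, red=0.1))
print(ndwi_green_nir(green=0.1, nir=0.5))
```

Going by the notebook expressions quoted earlier, the S30 NDVI uses B08 as NIR and B04 as red, and the L30 NDWI uses B03 as green and B05 as NIR, which corresponds to the first NDWI variant.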

sharkinsspatial commented 7 months ago

@abarciauskas-bgse @vincentsarago I started some investigation on this today. One initial issue that jumps out is the lack of asset differentiation in the CMR UMM-G metadata model. UMM-G treats the files that make up a granule as a list of RelatedURLs rather than a dictionary of assets with well-defined keys. earthaccess exposes the RelatedURLs portion of the UMM-G data model as data_links.

This creates an issue with how we use the render extension: we don't have a direct way of linking the asset keys used in the render object to list indices in data_links, so we don't know which link corresponds to which logical asset.

We have a few options for dealing with this at the titiler-cmr level.

1. Use naive assumptions about the RelatedURL. This is the approach that stac-cmr uses, but I don't know if this pattern (where the final "."-separated segment of the file name denotes a logical asset key) is universally applicable to data in CMR. We'll need to check with someone who knows more.
2. Include another STAC extension or property that explicitly defines the lookup pattern for values in data_links. This would be a regex or glob pattern into which the asset key can be substituted (regex{asset_key}.tiff). The pattern would be passed as a query parameter to titiler-cmr and used together with the keys in the URL's assets parameter to look up the correct link in the data_links list.

I'm not sure which option is best. If the assumption in option 1 holds and we can rely on a common pattern across CMR, it might be worth moving this functionality upstream to earthaccess so that it can optionally return a dict of links with the appropriate keys.
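To make the two options concrete, here is a minimal sketch of both lookups over a list of links. Everything in it is an assumption for illustration: the function names, the example.com URLs, the reading of "final segment" as the last dot-separated part of the file name before the extension, and the `{asset_key}` placeholder syntax. It is not how stac-cmr, earthaccess, or titiler-cmr implement this.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse


def links_by_suffix(data_links: list[str]) -> dict[str, str]:
    """Option 1: key each link by the last dot-separated segment before the extension."""
    assets = {}
    for link in data_links:
        filename = PurePosixPath(urlparse(link).path).name
        stem = filename.rsplit(".", 1)[0]  # drop the file extension
        key = stem.rsplit(".", 1)[-1]  # keep the trailing segment, e.g. "B03"
        assets[key] = link
    return assets


def find_link(data_links: list[str], pattern: str, asset_key: str):
    """Option 2: substitute the asset key into a pattern and return the first match."""
    regex = re.compile(pattern.replace("{asset_key}", re.escape(asset_key)))
    for link in data_links:
        if regex.search(link):
            return link
    return None


# Hypothetical links shaped like HLS file names; not real data URLs.
data_links = [
    "https://example.com/HLSL30/HLS.L30.T13SDV.2021001T000000.v2.0.B03.tif",
    "https://example.com/HLSL30/HLS.L30.T13SDV.2021001T000000.v2.0.B05.tif",
    "https://example.com/HLSL30/HLS.L30.T13SDV.2021001T000000.v2.0.Fmask.tif",
]

print(sorted(links_by_suffix(data_links)))
print(find_link(data_links, r"\.{asset_key}\.tif$", "B05"))
```

Option 1 needs no extra metadata but silently produces uninformative keys when file names do not follow the convention. Option 2 fails explicitly (returns `None` here) when the pattern does not match, at the cost of maintaining a pattern per collection.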

abarciauskas-bgse commented 7 months ago

Thanks @sharkinsspatial, this is a great write-up. I am asking about this in a Slack channel and have also opened an issue at https://github.com/nasa/cmr-stac/issues/326

abarciauskas-bgse commented 7 months ago

Drew Pesall responded:

The answer to your question is a bit complicated, because not all collections adhere to this schema. I can't really estimate how many collections adhere to this schema, so for some collections with relatedUrls it looks a bit goofy because the selected string to be used as the key is not particularly informative, but it still generates the key either way.

So I think this method sounds pretty unreliable.

abarciauskas-bgse commented 6 months ago

closing as completed

developmentseed / stac-explorer