developmentseed / stac-explorer

An explorer (visualizing + simple charting) for data catalogued in STAC and CMR

Define schema and logic for routing from STAC collection to titiler #2

Closed abarciauskas-bgse closed 6 months ago

abarciauskas-bgse commented 8 months ago

Need

The UI application will need a way to route tile requests to the correct titiler instance for a given collection. At this time, we assume all collections will have a STAC endpoint. Ideally, we could determine which titiler instance to use from the STAC endpoint alone, but that is limiting because we would have to pre-define which titiler instances the application routes requests to.

Being forward looking, we want to make it easy for other STAC systems to use this UI code, so the configuration and logic should allow for the addition of other STAC catalogue endpoints which, presumably, may be associated with their own dynamic tiler instance, which may or may not be titiler-pgstac.

Proposal

We define a schema for catalogues and collections which includes a few additional values to inform the application as to how to route requests.

Here is the proposed schema:

Catalog:
  type: object
  properties:
    endpoint:
      type: string
      format: uri
    isPgstac:
      type: boolean
    titilerEndpoint:
      type: string
      format: uri

Collection:
  type: object
  properties:
    catalog:
      type: string
    id:
      type: string

Catalogs:
  type: object
  additionalProperties:
    $ref: '#/Catalog'

Collections:
  type: array
  items:
    $ref: '#/Collection'

type: object
properties:
  catalogs:
    $ref: '#/Catalogs'
  collections:
    $ref: '#/Collections'

And here is an example:

catalogs:
  veda:
    endpoint: "https://staging-stac.delta-backend.com"
    titilerEndpoint: "https://staging-raster.delta-backend.com"
    isPgstac: true
  maap:
    endpoint: "https://stac.maap-project.org"
    titilerEndpoint: "https://titiler.maap-project.org"
    isPgstac: false

collections:
  - catalog: "veda"
    id: "mursst-cmr"
  - catalog: "veda"
    id: "combined_CMIP6_daily_GISS-E2-1-G_tas_kerchunk_DEMO"
  - catalog: "veda"
    id: "nightlights-500m-daily"
  - catalog: "maap"
    id: "icesat2-boreal"
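A minimal sketch of how the application might sanity-check a parsed config of this shape before using it. The function name `validateConfig` is hypothetical, and treating all three catalog fields as required is an assumption, since the proposed schema does not mark any property as required.

```javascript
// Sketch only: `validateConfig` is a hypothetical helper, and requiring
// endpoint, titilerEndpoint and isPgstac is an assumption (the proposed
// schema does not declare required properties).
function validateConfig(config) {
  const errors = [];
  const catalogs = config.catalogs ?? {};

  // Check the type of each catalog field named in the schema.
  for (const [name, catalog] of Object.entries(catalogs)) {
    if (typeof catalog.endpoint !== 'string') {
      errors.push(`catalog "${name}": endpoint must be a string`);
    }
    if (typeof catalog.titilerEndpoint !== 'string') {
      errors.push(`catalog "${name}": titilerEndpoint must be a string`);
    }
    if (typeof catalog.isPgstac !== 'boolean') {
      errors.push(`catalog "${name}": isPgstac must be a boolean`);
    }
  }

  // Every collection must point at a catalog key that is actually defined.
  for (const collection of config.collections ?? []) {
    if (!Object.prototype.hasOwnProperty.call(catalogs, collection.catalog)) {
      errors.push(`collection "${collection.id}": unknown catalog "${collection.catalog}"`);
    }
  }
  return errors;
}
```

An empty array means the config is internally consistent; anything else lists the entries that would fail at routing time.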

There will also be two other titiler instances: one for CMR (referred to below as TITILER_CMR) and one titiler-xarray instance. At this time we anticipate only one of each, so they may be configured as environment variables.

The logic to determine which titiler endpoint to use will be as follows:

  1. Make a request to the collection endpoint:
     // assumes the js-yaml package and an async context
     const yaml = require('js-yaml')
     const fs = require('fs')
     const data = yaml.load(fs.readFileSync('file.yaml', 'utf8'))
     const catalogs = data['catalogs']
     const collections = data['collections']
     // say selected.collection = { id: "mursst-cmr", catalog: "veda" }
     const response = await fetch(`${catalogs[selected.collection.catalog].endpoint}/collections/${selected.collection.id}`)
     const collectionData = await response.json()
  2. If the collection data includes a collection concept id --> make a tile request to TITILER_CMR.
  3. If the collection data includes an asset of type "zarr" --> make a request to titiler-xarray. (Note: additional logic will be needed when the collection data includes multiple assets for different groups, as in the case of pyramids.)
  4. If the collection data includes neither of those, make a request to the titiler instance associated with the STAC catalog.
     a. If the titiler is not a pgSTAC instance, a request to the STAC's /items endpoint will be required to determine which assets to send to the titiler instance.
     b. If the titiler is a pgSTAC instance, we can use the mosaic register endpoint with the search parameters, as in the current VEDA UI.
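The decision logic described above could be sketched roughly as follows. This is illustrative only: the function name `chooseTiler`, the `collection_concept_id` property, the check for "zarr" in an asset's `type`, the `TITILER_XARRAY` variable name, and the `strategy` labels are all assumptions, since the thread does not specify how these are represented in the collection data or the environment.

```javascript
// Sketch only. Assumptions (not confirmed by the thread):
//   - a CMR-backed collection exposes a top-level `collection_concept_id`
//   - a zarr asset can be recognised by "zarr" appearing in its `type`
//   - `env.TITILER_XARRAY` is a hypothetical name; only TITILER_CMR is named above
function chooseTiler(collectionData, catalog, env) {
  // Collection concept id present: route to the shared titiler-cmr instance.
  if (collectionData.collection_concept_id) {
    return { tiler: env.TITILER_CMR, strategy: 'cmr' };
  }

  // Any zarr asset: route to titiler-xarray.
  const assets = Object.values(collectionData.assets ?? {});
  const hasZarr = assets.some((asset) =>
    String(asset.type ?? '').toLowerCase().includes('zarr')
  );
  if (hasZarr) {
    return { tiler: env.TITILER_XARRAY, strategy: 'xarray' };
  }

  // Otherwise: use the tiler configured for the catalog. A pgSTAC-backed
  // tiler can register a mosaic from search parameters; any other tiler
  // needs an /items request first to pick assets.
  return {
    tiler: catalog.titilerEndpoint,
    strategy: catalog.isPgstac ? 'pgstac-mosaic' : 'stac-items',
  };
}
```

Returning a `strategy` label alongside the endpoint keeps the follow-up request (mosaic registration versus an /items lookup) separate from the routing decision itself.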

@oliverroick @sharkinsspatial @vincentsarago let me know if 👆🏽 makes sense to you, or you have other ideas, I am open.

sharkinsspatial commented 8 months ago

@abarciauskas-bgse I think we might be able to start a bit simpler than this and then add deeper configuration as needed. As discussed in the Slack channel, we can probably avoid API interactions initially in favor of static config files and static STAC collection files to ease development friction. I think we can keep the tiler routing configuration directly in the collection mapping, so I don't think we need a Catalog type in this case. A simple config file like this would do:

{
  "gpm-imerg": {
    "collection": "http://somehost/gpm-imerg_collection.json",
    "tiler": "http://titiler-cmr"
  },
  "hls-l30": {
    "collection": "http://somehost/hls-l30_collection.json",
    "tiler": "http://titiler-cmr"
  },
  "somezarr": {
    "collection": "http://somehost/somezarr_collection.json",
    "tiler": "http://titiler-xarray"
  }
}

I'd prefer to have explicit configuration for tiler routing control rather than logic that needs to be embedded in the front end code.
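As a rough illustration of what explicit configuration buys, the front end would only need a lookup, with no inspection of the collection data. The helper name `tileRoute` is hypothetical; the config object simply mirrors the example config file above.

```javascript
// Sketch only: `tileRoute` is a hypothetical helper; the object mirrors
// the example config file, inlined here so the snippet is self-contained.
const tilerConfig = {
  'gpm-imerg': {
    collection: 'http://somehost/gpm-imerg_collection.json',
    tiler: 'http://titiler-cmr',
  },
  'hls-l30': {
    collection: 'http://somehost/hls-l30_collection.json',
    tiler: 'http://titiler-cmr',
  },
  somezarr: {
    collection: 'http://somehost/somezarr_collection.json',
    tiler: 'http://titiler-xarray',
  },
};

function tileRoute(config, collectionKey) {
  const entry = config[collectionKey];
  if (!entry) {
    // Fail loudly rather than guessing a tiler for an unconfigured collection.
    throw new Error(`No tiler configured for collection "${collectionKey}"`);
  }
  return { collectionUrl: entry.collection, tiler: entry.tiler };
}
```

Adding a collection or changing its tiler is then a config change only.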

In the case of a "zarr" that has pyramids stored in different archives, we should strive to include all pyramid groups in a single datatree accessible entrypoint so that titiler-xarray only needs to be able to open a datatree and use the agreed upon zarr pyramid convention to determine which group to use at which z-level.

Are we trying to cover use case 4.a in this application? I don't see an immediate need for this but I might be missing something.

abarciauskas-bgse commented 8 months ago

@sharkinsspatial that makes sense to me! Thanks for simplifying things. I also agree with this point:

> In the case of a "zarr" that has pyramids stored in different archives, we should strive to include all pyramid groups in a single datatree accessible entrypoint so that titiler-xarray only needs to be able to open a datatree and use the agreed upon zarr pyramid convention to determine which group to use at which z-level.

abarciauskas-bgse commented 6 months ago

closing as completed