Closed vincentsarago closed 2 years ago
@vincentsarago Can we also consider having the root mosaic endpoint return a list of the mosaics and their name and mosaicid? This might assist in discoverability for some of the VEDA / Dashboard evolution work.
@sharkinsspatial
Can we also consider having the root mosaic endpoint return a list of the mosaics and their name and mosaicid?
Sure, but only if we go ahead with a new mosaic table in the pgstac database. It might be fine for eoAPI, but I'm a bit worried. The goal of the eoAPI submodules is to connect to any pgstac db; if we introduce a mosaic table, this might close off some possibilities.
Maybe https://github.com/stac-utils/titiler-pgstac/issues/30 is a better possibility. We could require specific metadata to be present (e.g. type: Mosaic) and use it as a filter value.
cc @bitner
@bitner As we're expanding the use of titiler-pgstac at NASA, we have a few use cases where client applications will need to request information about all available mosaics in order to dynamically configure a list of available tile endpoints and their characteristics. With https://github.com/stac-utils/titiler-pgstac/pull/38 and several follow-on PRs, @vincentsarago is serializing the majority of the information we need. Is it feasible from a performance perspective to include a root mosaics endpoint which would fetch, deserialize and return all mosaic hashes, as is available with the current individual info endpoint (https://github.com/stac-utils/titiler-pgstac/blob/0f2b5b4ba50bb3458237ab21cf9a154d7b811851/titiler/pgstac/factory.py#L359-L367)? cc @anayeaye @abarciauskas-bgse
I've opened an additional PR: https://github.com/stac-utils/titiler-pgstac/pull/45
@sharkinsspatial let me know what you think!
Note, if we don't move forward with it in titiler-pgstac I'll totally add this in eoAPI anyway.
@sharkinsspatial EEEEK, I realllllly don't think you want to do that!
That endpoint lists every single search that has ever been made against the pgstac instance! If someone changes a date range, it's another record, etc.
For reference - Planetary Computer has over 4 million different records in the searches table!
I think it could be useful for something like seeing what people are searching on to debug things, but with no control over the searches that are getting recorded I don't see any possible world where it could be useful or scale to any reasonable amount as a "mosaic catalog". I'm not talking about performance here - it could perform just fine; it's more that I can't see how you would make any sense of it.
These mosaics are by design dynamic - a listing of "every dynamic thing that people can come up with" just doesn't seem right. It may be that I'm just missing something here, but I really don't see how this could be useful??? At least for the Planetary Computer, we are already seeing things in the logs where someone is setting up cron jobs that change the date range every so often and use that to grab new data -- someone could do this against a stac instance say every minute with each and every query being different, so being another record.
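To illustrate why the searches table grows this way, here is a minimal sketch. It assumes each registered search is keyed by a hash of its request body; the `search_hash` function below is a hypothetical stand-in (SHA-256 over canonical JSON), not the hash pgstac actually computes.

```python
import hashlib
import json


def search_hash(search: dict) -> str:
    """Hypothetical stand-in for the hash derived from a search body."""
    canonical = json.dumps(search, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


base = {
    "collections": ["naip"],
    "datetime": "2021-01-01T00:00:00Z/2021-01-31T23:59:59Z",
}
# Same query, but the end of the date range moved by one day.
shifted = {**base, "datetime": "2021-01-01T00:00:00Z/2021-02-01T23:59:59Z"}

# Simulated searches table keyed by hash: repeats are deduplicated,
# but any change to the body produces a new record.
searches = {}
for body in (base, shifted, base):
    searches.setdefault(search_hash(body), body)

print(len(searches))  # 2
```

A client that shifts its date range on every request would add one record per request under this model.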
@bitner I totally get your point, but mosaics are a little less dynamic and will often be more hard-coded searches (e.g. for static datasets like NAIP).
In https://github.com/stac-utils/titiler-pgstac/pull/45 what I'm proposing is that we keep only the searches that have a specific metadata value, metadata.type = "mosaic", which should narrow things down.
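As a rough sketch of that filter (not the code in the PR), assume each search record is a dict with a `hash` and an optional `metadata` object. The record shape and the `list_mosaics` helper are hypothetical.

```python
from typing import Dict, Iterable, List


def list_mosaics(searches: Iterable[Dict], mosaic_type: str = "mosaic") -> List[Dict]:
    """Keep only searches whose metadata.type matches `mosaic_type`."""
    mosaics = []
    for record in searches:
        metadata = record.get("metadata") or {}  # tolerate missing or null metadata
        if metadata.get("type") == mosaic_type:
            mosaics.append({"id": record["hash"], "name": metadata.get("name")})
    return mosaics


# Hypothetical rows from the searches table.
records = [
    {"hash": "a1", "metadata": {"type": "mosaic", "name": "naip"}},
    {"hash": "b2", "metadata": {}},
    {"hash": "c3", "metadata": None},
    {"hash": "d4", "metadata": {"type": "mosaic"}},
]

mosaics = list_mosaics(records)
print([m["id"] for m in mosaics])  # ['a1', 'd4']
```

Ad hoc searches registered without the metadata flag never show up in the listing.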
Or maybe we could use STAC directly 🤷, which means that we could create a mosaic extension and store the mosaic info in a mosaic collection.
{
"type": "Feature",
"stac_version": "1.0.0",
"stac_extensions": [
"https://stac-extensions.github.io/mosaic/v1.0.0/schema.json"
],
"id": "my search id",
"bbox": [
13.86148243891681,
36.95257399124932,
15.111074610520053,
37.94752813015372
],
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
13.876381589019879,
36.95257399124932
],
[
13.86148243891681,
37.942072015005024
],
[
15.111074610520053,
37.94752813015372
],
[
15.109620666835209,
36.95783951241028
],
[
13.876381589019879,
36.95257399124932
]
]
]
},
"properties": {
"datetime": "2021-02-21T10:00:17Z", // or null
"name": "my mosaic", // OPTIONAL: name of the mosaic
"stac_assets": ["image", "cog"] // OPTIONAL: list of assets available in each STAC record
},
"collection": "mosaics",
"assets": {
"true_color": {
"title": "True color Mosaic",
"href": "https://endpoint/{searchid}/{z}/{x}/{y}.jpeg",
"options": {
"assets": ["B4", "B3", "B2"],
"color_formula": "Gamma RGB 3.5 Saturation 1.7 Sigmoidal RGB 15 0.35"
}
},
"ndvi": {
"title": "NDVI Mosaic",
"href": "https://endpoint/{searchid}/{z}/{x}/{y}.jpeg",
"options": {
"expression": "(B4-B3)/(B4+B3)",
"rescale": "-1,1",
"colormap_name": "viridis"
}
}
},
"links": []
}
Note: if we prefer moving forward with a pure STAC solution, it means that when a user registers a search, they will also have to register a STAC item in the mosaic collection, OR we let the titiler-pgstac /register endpoint do it 🤷
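A minimal sketch of the second option, where registration also produces the mosaic item, could look like the following. The `mosaic_item` helper, its parameters, and the `mosaics` collection id are assumptions for illustration; nothing here is an existing titiler-pgstac function.

```python
from typing import Any, Dict, Optional, Sequence


def mosaic_item(
    search_id: str,
    bbox: Sequence[float],
    datetime: Optional[str] = None,
    name: Optional[str] = None,
    assets: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a STAC-item-shaped dict describing a registered search (hypothetical)."""
    west, south, east, north = bbox
    properties: Dict[str, Any] = {"datetime": datetime}
    if name:
        properties["name"] = name
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": search_id,
        "bbox": list(bbox),
        "geometry": {
            "type": "Polygon",
            # Closed ring built from the bbox corners.
            "coordinates": [[
                [west, south], [east, south], [east, north],
                [west, north], [west, south],
            ]],
        },
        "properties": properties,
        "collection": "mosaics",  # assumed collection id
        "assets": assets or {},
        "links": [],
    }


item = mosaic_item("my-search-id", [13.86, 36.95, 15.11, 37.95], name="my mosaic")
print(item["collection"], item["properties"]["name"])
```

The item would then be written to the mosaic collection in the same step that stores the search, so the two records stay in sync.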
@vincentsarago I see the point now on mosaics only being records with "mosaic" metadata. If nothing else, we would need to make sure to put an index on the searches table to make sure that the mosaics could be easily separated. That being said, I like your idea of a mosaic collection -- that further would allow us to use all the search mechanisms "for free" on any metadata that is stored as a mosaic item.
If we went the mosaic collection route, rather than having a /list endpoint it would just be /mosaics/items and would have search/filters as well as paging already in place.
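For example, a client could list mosaics with an ordinary items request. The sketch below assumes the collection is exposed at `/collections/mosaics/items` (the usual STAC API items path) and uses the standard `limit`, `bbox` and `datetime` query parameters; the host name and the `mosaic_items_url` helper are made up.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode


def mosaic_items_url(
    api_root: str,
    limit: int = 10,
    bbox: Optional[Sequence[float]] = None,
    datetime: Optional[str] = None,
) -> str:
    """Build an items request against an assumed `mosaics` collection."""
    params = {"limit": limit}
    if bbox:
        params["bbox"] = ",".join(str(v) for v in bbox)
    if datetime:
        params["datetime"] = datetime
    return f"{api_root.rstrip('/')}/collections/mosaics/items?{urlencode(params)}"


url = mosaic_items_url("https://stac.example.com", limit=5, bbox=(13.8, 36.9, 15.2, 38.0))
print(url)
```

Paging would then follow the `next` links returned with each page, as with any other collection.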
Re the STAC way: I'm just a bit worried about creating a STAC extension specific to titiler/titiler-pgstac. It seems to me that "put an index on the searches table to make sure that the mosaics could be easily separated" might just be easier 🙉
I do like the idea of modeling mosaic endpoints as STAC items (though, as @vincentsarago noted, I don't like losing the consistency of all mosaic-related requests occurring on the mosaic path, but that seems a small issue). If we do consider this approach, a few thoughts/questions:
There is significant conceptual overlap between this and the existing extension proposals: tiled assets, virtual assets and composite. Personally, I think we can avoid alignment with tiled assets, as it would be overly verbose to advertise all of a mosaic's supported TileMatrixSets, and the dynamic nature of the mosaic's item composition makes maintaining the Tile Matrix Limits difficult. It might be worth considering aligning with the processing:expression field for community consistency.
Should the mosaic asset href expose a URL template (which is not a valid href) or the link to the tilejson? How much of this information should be packaged in the asset and how much should be packaged in the tilejson? I'd lean towards packaging most of the descriptive information at the asset level and keeping the tilejson standardized and minimal.
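To make the trade-off concrete, here is a sketch of what a client has to do if the asset carries a URL template plus an options object, as in the example item earlier in the thread. The `tile_url` helper is hypothetical, and serializing list options as repeated query parameters is an assumption.

```python
from typing import Any, Dict
from urllib.parse import urlencode


def tile_url(asset: Dict[str, Any], searchid: str, z: int, x: int, y: int) -> str:
    """Expand the asset's href template and append its options as a query string."""
    base = asset["href"].format(searchid=searchid, z=z, x=x, y=y)
    options = asset.get("options") or {}
    # doseq=True turns list values into repeated parameters (assets=B4&assets=B3).
    query = urlencode(options, doseq=True)
    return f"{base}?{query}" if query else base


ndvi = {
    "title": "NDVI Mosaic",
    "href": "https://endpoint/{searchid}/{z}/{x}/{y}.jpeg",
    "options": {
        "expression": "(B4-B3)/(B4+B3)",
        "rescale": "-1,1",
        "colormap_name": "viridis",
    },
}

print(tile_url(ndvi, "abc123", 10, 550, 395))
```

With a tilejson link instead, the server would do this expansion and the client would only read the `tiles` array, at the cost of the options being less visible at the asset level.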
It would be helpful to know what model client applications currently use for mosaic endpoint discovery. I took a quick look at https://github.com/microsoft/PlanetaryComputerDataCatalog, but it would be good to know how the PC explorer is referencing the mosaics and how the application might like to leverage a discovery endpoint.
Enable retrieving mosaic by name instead of mosaicid