radiantearth / stac-spec

SpatioTemporal Asset Catalog specification - making geospatial assets openly searchable and crawlable
https://stacspec.org
Apache License 2.0

self-contained catalogs within a static catalog #1046

Closed matthewhanson closed 2 years ago

matthewhanson commented 3 years ago

The single-file-stac extension is a way to store multiple Items in a single file as an ItemCollection (aka GeoJSON FeatureCollection).

Recently I brought up the possibility of moving ItemCollection into this spec rather than keeping it in the API spec, but it seemed strange because ItemCollection isn't used in this spec.

I've been working with large static catalogs, and being able to incorporate an ItemCollection would be very useful and would help solve a difficult scaling problem. When each Item is its own file with its own links to and from parents and children, it is hard to manage millions of Items and even more difficult to save them as files.

The "items" rel type is used in an API to point to the set of Items that share the same parent. Why not allow the use of "items" in a static catalog, where the link target has type="FeatureCollection"? If I could batch up 500 Items into one file, it would greatly reduce the overhead caused by the number of files.
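A minimal sketch of the batching idea, not something defined by the spec: it splits a list of Item dicts into FeatureCollection batches and builds one "items" link per batch for the parent catalog. The helper names (`batch_items`, `items_links`), the `items-0000.json` file naming, and the placeholder Item contents are all assumptions made for illustration.

```python
from typing import Any

Item = dict[str, Any]


def batch_items(items: list[Item], batch_size: int = 500) -> list[dict[str, Any]]:
    """Group Items into GeoJSON FeatureCollections of at most batch_size each."""
    return [
        {"type": "FeatureCollection", "features": items[i:i + batch_size]}
        for i in range(0, len(items), batch_size)
    ]


def items_links(n_batches: int, prefix: str = "items") -> list[dict[str, str]]:
    """Build one rel="items" link per batch file (file naming is hypothetical)."""
    return [
        {
            "rel": "items",
            "href": f"./{prefix}-{i:04d}.json",
            "type": "application/geo+json",
        }
        for i in range(n_batches)
    ]


# Placeholder Items; real Items would carry geometry, assets, and so on.
items = [
    {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": f"item-{n}",
        "geometry": None,
        "properties": {"datetime": "2021-01-01T00:00:00Z"},
        "links": [],
        "assets": {},
    }
    for n in range(1200)
]

batches = batch_items(items)        # 1200 Items in batches of 500
links = items_links(len(batches))   # links to add to the parent catalog
```

With 1200 Items and a batch size of 500, the parent would reference three files instead of 1200.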

The single-file-stac extension needs some work before it can be published, now that it can no longer inherit from Catalog due to the new type field, but I think it can be changed so that it could be used within a static catalog as well as in its own file.

m-mohr commented 3 years ago

@matthewhanson Is this a new extension or should we move this issue to the single-file-stac issue tracker?

matthewhanson commented 3 years ago

@m-mohr Actually, I think this needs to be a new extension. It's not really single-file-stac, which is just an ItemCollection with collections. This is an ItemCollection too, but it will require additional details on how links to it work within static catalogs.

m-mohr commented 3 years ago

From a provider perspective, this sounds useful, but from a client perspective, this sounds rather complex. If I want to retrieve a single Item, it seems like a lot of overhead to get it from an arbitrarily big chunk of, e.g., 500 Items. With JSON, which always needs to be read completely, this does not seem cloud-native at all. There's no "index" that helps you read just what you want.
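To illustrate the concern, here is a rough sketch of what a client would have to do to pull one Item out of a batched file using only standard JSON tooling. The `find_item` helper and the sample batch are hypothetical; the point is only that the whole document is parsed and then scanned before a single Item can be returned.

```python
import json
from typing import Any, Optional


def find_item(batch_text: str, item_id: str) -> Optional[dict[str, Any]]:
    """Return the Item with the given id, or None if it is not in the batch."""
    # json.loads parses the entire document; there is no way to seek to one Item.
    collection = json.loads(batch_text)
    # Without an index, the lookup is a linear scan over all features.
    for feature in collection["features"]:
        if feature.get("id") == item_id:
            return feature
    return None


# Hypothetical batch of 500 minimal Items serialized as one FeatureCollection.
batch_text = json.dumps(
    {
        "type": "FeatureCollection",
        "features": [{"type": "Feature", "id": f"item-{n}"} for n in range(500)],
    }
)

wanted = find_item(batch_text, "item-499")
missing = find_item(batch_text, "item-500")
```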

m-mohr commented 3 years ago

@matthewhanson You recently said SFS is obsolete? So should we close this, too?