STAC implementation of Pangeo Catalog

Pangeo STAC Catalog

This repository contains a copy of Pangeo's cloud data catalog, formatted to follow the SpatioTemporal Asset Catalog (STAC) specification. The root STAC catalog can be found at:

https://raw.githubusercontent.com/pangeo-data/pangeo-datastore-stac/master/master/catalog.json
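Because the root catalog is plain JSON, it can be read with nothing more than an HTTP client and a JSON parser. The following is a minimal sketch in Python using only the standard library. The helper names `load_stac_json` and `child_hrefs` are illustrative and not part of this repository, and the sample catalog is hand-written with placeholder `example.com` URLs rather than copied from the real catalog.

```python
import json
from urllib.request import urlopen

# Root catalog URL as given above.
ROOT_CATALOG_URL = (
    "https://raw.githubusercontent.com/pangeo-data/"
    "pangeo-datastore-stac/master/master/catalog.json"
)


def load_stac_json(url, timeout=30):
    """Download a STAC JSON document and parse it into a dict."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def child_hrefs(stac_object):
    """Return the href of every link whose rel is "child"."""
    return [
        link["href"]
        for link in stac_object.get("links", [])
        if link.get("rel") == "child"
    ]


# Hypothetical catalog used only to illustrate the link layout; the id and
# the example.com URLs are assumptions, not values from the Pangeo catalog.
sample_catalog = {
    "stac_version": "1.0.0-beta.2",
    "id": "example-root",
    "description": "Illustrative root catalog",
    "links": [
        {"rel": "self", "href": "https://example.com/stac/catalog.json"},
        {"rel": "child", "href": "https://example.com/stac/ocean/catalog.json"},
        {"rel": "child", "href": "https://example.com/stac/atmosphere/catalog.json"},
    ],
}

print(child_hrefs(sample_catalog))
```

Against the live catalog, the equivalent call would be `child_hrefs(load_stac_json(ROOT_CATALOG_URL))`, which requires network access.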

Currently the catalogs contain:

In time they should be able to hold:

Motivations

The motivation behind this project is to have a version of the current cloud data catalog that can be searched and browsed from any programming language. At present, the YAML-based catalogs are only accessible through Python using intake. This means that any server-side code accessing these catalogs must be written in Python, which has historically shaped how we generate the website containing previews of all cataloged data:

With the introduction of intake-stac, an intake extension that allows Python users to browse STAC catalogs, there is no longer a need for the catalogs themselves to be tied to intake. Thus, a move to JSON-based STAC catalogs gives a variety of other languages (in particular JavaScript, Ruby, and PHP) access to the catalogs without leaving behind the initial Python users.

Guidelines (subject to change)

All of the Pangeo STAC catalogs use version 1.0.0-beta.2 of the STAC specification.

Currently, the Pangeo STAC catalog follows the STAC guidance for an absolute published catalog, meaning that all links are absolute URLs. Each preexisting intake catalog corresponds to a STAC catalog, while each dataset or data collection corresponds to a STAC collection. The extensions required to access the data are listed under the collection's stac_extensions field.
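These guidelines can be checked mechanically. The sketch below, in standard-library Python, tests whether a STAC object uses only absolute hrefs, guesses whether it is a collection, and reads its `stac_extensions` list. All function names are illustrative, the collection test is only a heuristic (it relies on collections carrying `extent` and `license` fields), and the sample collection, including the extension name `example-extension`, is a made-up placeholder rather than an entry from the Pangeo catalog.

```python
from urllib.parse import urlparse


def is_absolute(href):
    """True if href has both a scheme and a host/bucket, e.g. https:// or gs://."""
    parts = urlparse(href)
    return bool(parts.scheme) and bool(parts.netloc)


def uses_absolute_hrefs(stac_object):
    """Check every link href and every asset href (if any) for absoluteness."""
    hrefs = [link["href"] for link in stac_object.get("links", [])]
    hrefs += [asset["href"] for asset in stac_object.get("assets", {}).values()]
    return all(is_absolute(href) for href in hrefs)


def looks_like_collection(stac_object):
    """Heuristic: collections carry extent and license, plain catalogs do not."""
    return "extent" in stac_object and "license" in stac_object


def listed_extensions(stac_object):
    """Return the extensions declared under stac_extensions (empty if absent)."""
    return list(stac_object.get("stac_extensions", []))


# Hypothetical collection for illustration only; none of these values are
# taken from the Pangeo catalog.
sample_collection = {
    "stac_version": "1.0.0-beta.2",
    "stac_extensions": ["example-extension"],
    "id": "example-dataset",
    "description": "Illustrative dataset",
    "license": "proprietary",
    "extent": {
        "spatial": {"bbox": [[-180, -90, 180, 90]]},
        "temporal": {"interval": [[None, None]]},
    },
    "links": [
        {"rel": "self", "href": "https://example.com/stac/example-dataset/collection.json"},
        {"rel": "root", "href": "https://example.com/stac/catalog.json"},
    ],
}

print(looks_like_collection(sample_collection))
print(listed_extensions(sample_collection))
print(uses_absolute_hrefs(sample_collection))
```

A relative href such as `./child/catalog.json` has neither a scheme nor a host, so a catalog containing one would fail the `uses_absolute_hrefs` check.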

Progress

There is still a lot of work to be done before this catalog can be considered equivalent to the current cloud data catalog. In particular: