martinchristen / python-stac-api

MIT License

PyStac and swisstopo API #1

Open p1d1d1 opened 3 years ago

p1d1d1 commented 3 years ago

@martinchristen in the notebook you say that PyStac doesn't work with the swisstopo STAC API. Could you please elaborate on this?

MatusGasparik commented 3 years ago

I'm still in the process of learning about the STAC API, but here is my take on this issue.

It seems to me that the swisstopo STAC API doesn't quite follow the usual practice of organising and linking STAC objects (or at least not in the way that pystac expects).

Starting with the "landing page" - https://data.geo.admin.ch/api/stac/v0.9/ - you get a JSON object that more or less corresponds to the STAC Catalog and can be read with pystac:

from pystac import Catalog

cat = Catalog.from_file('https://data.geo.admin.ch/api/stac/v0.9/')

However, the relation-types of the linked objects don't specify any child objects:

cat.to_dict()
{'id': 'ch',
 'stac_version': '0.9.0',
 'description': 'Data Catalog of the Swiss Federal Spatial Data Infrastructure',
 'links': [{'rel': 'root',
   'href': 'https://data.geo.admin.ch/api/stac/v0.9/',
   'type': 'application/json'},
  {'rel': 'self',
   'href': 'https://data.geo.admin.ch/api/stac/v0.9/',
   'type': 'application/json',
   'title': 'This document'},
  {'rel': 'service-doc',
   'href': 'https://data.geo.admin.ch/api/stac/static/spec/v0.9/api.html',
   'type': 'text/html',
   'title': 'The API documentation'},
  {'rel': 'service-desc',
   'href': 'https://data.geo.admin.ch/api/stac/static/spec/v0.9/openapi.yaml',
   'type': 'application/vnd.oai.openapi+yaml;version=3.0',
   'title': 'The OPENAPI description of the service'},
  {'rel': 'conformance',
   'href': 'https://data.geo.admin.ch/api/stac/v0.9/conformance',
   'type': 'application/json',
   'title': 'OGC API conformance classes implemented by this server'},
  {'rel': 'data',
   'href': 'https://data.geo.admin.ch/api/stac/v0.9/collections',
   'type': 'application/json',
   'title': 'Information about the feature collections'},
  {'rel': 'search',
   'href': 'https://data.geo.admin.ch/api/stac/v0.9/search',
   'type': 'application/json',
   'title': 'Search across feature collections',
   'method': 'GET'},
  {'rel': 'search',
   'href': 'https://data.geo.admin.ch/api/stac/v0.9/search',
   'type': 'application/json',
   'title': 'Search across feature collections',
   'method': 'POST'}],
 'title': 'data.geo.admin.ch'}

Hence, the catalog is not amenable to discovery via pystac's walk() or get_children() methods:

list(cat.get_children())
[]
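To see why get_children() comes back empty, it can help to filter the landing page's links by relation type yourself. The following is a minimal sketch using only plain dictionaries; the helper name links_by_rel is hypothetical (not part of pystac), and the landing dict is a trimmed copy of the cat.to_dict() output above.

```python
def links_by_rel(stac_dict: dict, rel: str) -> list:
    """Return every link object in a STAC dict whose 'rel' equals `rel`."""
    return [link for link in stac_dict.get("links", []) if link.get("rel") == rel]


# Trimmed copy of the landing page shown above (titles and types omitted).
landing = {
    "id": "ch",
    "stac_version": "0.9.0",
    "links": [
        {"rel": "root", "href": "https://data.geo.admin.ch/api/stac/v0.9/"},
        {"rel": "self", "href": "https://data.geo.admin.ch/api/stac/v0.9/"},
        {"rel": "data", "href": "https://data.geo.admin.ch/api/stac/v0.9/collections"},
        {"rel": "search", "href": "https://data.geo.admin.ch/api/stac/v0.9/search", "method": "GET"},
        {"rel": "search", "href": "https://data.geo.admin.ch/api/stac/v0.9/search", "method": "POST"},
    ],
}

child_links = links_by_rel(landing, "child")   # nothing for pystac to traverse
data_links = links_by_rel(landing, "data")     # the /collections endpoint
search_links = links_by_rel(landing, "search")  # one entry per HTTP method
```

With a catalog loaded through pystac, the same helper could be applied to cat.to_dict().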

Looking at the link objects of cat above, the one with the "data" relation (which would ideally be a "child" relation pointing to a sub-catalog or collection) links to https://data.geo.admin.ch/api/stac/v0.9/collections, which is a JSON list of STAC collections but not itself a valid STAC object:

next(cat.get_stac_objects(rel="data"))   # KeyError: 'id'  
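One possible workaround is to skip pystac for this step and read the "data" endpoint as plain JSON. The sketch below assumes the response is an object with a "collections" array, as in OGC API Features; that structure is an assumption and not shown in this thread. The helper names fetch_json and collection_self_links are hypothetical, and sample_payload is made-up illustration data that reuses the collection URL mentioned in this comment.

```python
import json
from urllib.request import urlopen


def fetch_json(url: str, timeout: float = 30.0) -> dict:
    """Download a URL and parse the body as JSON (not called in this sketch)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def collection_self_links(payload: dict) -> dict:
    """Map each collection id to its 'self' href (None if it has no such link)."""
    result = {}
    for col in payload.get("collections", []):
        self_href = next(
            (link["href"] for link in col.get("links", []) if link.get("rel") == "self"),
            None,
        )
        result[col["id"]] = self_href
    return result


# Made-up payload in the assumed shape of the /collections response.
sample_payload = {
    "collections": [
        {
            "id": "ch.swisstopo.landeskarte-farbe-10",
            "links": [
                {
                    "rel": "self",
                    "href": "https://data.geo.admin.ch/api/stac/v0.9/collections/ch.swisstopo.landeskarte-farbe-10",
                }
            ],
        }
    ],
    "links": [],
}

index = collection_self_links(sample_payload)
```

Against the live service, the call would be collection_self_links(fetch_json("https://data.geo.admin.ch/api/stac/v0.9/collections")); each resulting URL can then be read individually.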

You can read-in individual collections (supplying the correct URLs):

from pystac import STAC_IO

col = STAC_IO.read_stac_object("https://data.geo.admin.ch/api/stac/v0.9/collections/ch.swisstopo.landeskarte-farbe-10")
print(type(col))     # pystac.collection.Collection

However, the same issues with the linking and naming of relation types persist, so neither col.get_items() nor col.get_children() actually works.

I don't know if this is an issue of the STAC spec version being 0.9.0 (I can't imagine the spec would be that much different) or if the problem really is implementation (swisstopo) versus expectation (pystac).

Maybe @martinchristen knows somebody at swisstopo who could elaborate on this?

martinchristen commented 3 years ago

Thanks @MatusGasparik this is what I noticed too.

Another "issue" is that you can't convert the output directly to a Python dictionary. I don't remember which one, but at least one STAC Python Module did the conversion directly using eval (instead of using the json module) and with the swisstopo STAC this conversion doesn't work.

While I really don't recommend using eval to convert JSON to a Python dict, this seems to work for other STAC servers...
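A small sketch of why eval is fragile for JSON: the JSON literals true, false and null are not Python names, so eval raises as soon as a document contains one, while json.loads handles them. Whether this is what actually broke with the swisstopo output is an assumption; the thread does not say. The text value below is made-up sample data.

```python
import json

# Made-up JSON document containing literals that are valid JSON but not valid Python.
text = '{"id": "ch", "published": true, "license": null}'

# json.loads maps true/false/null to True/False/None.
data = json.loads(text)

# eval treats `true` and `null` as variable names and fails with NameError.
try:
    eval(text)
    eval_error = None
except NameError as exc:
    eval_error = exc

print(data)
print(type(eval_error).__name__)
```

A server whose responses happen to contain only strings, numbers, lists and objects would pass through eval unnoticed, which may explain why the shortcut appears to work elsewhere.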

p1d1d1 commented 3 years ago

Please consider that pystac strongly relies on the presence of "child" links: it is thus mainly for "static catalogs". The swisstopo implementation is based on the STAC API (which is a superset of the OGC API Features) and not on static catalogs. This means that "child" links are not mandatory and reference to "child collections" is guaranteed by the /collections endpoint, whose "rel" is "data". The swisstopo implementation is a valid STAC implementation: it is, e.g., indexed and works nicely under STAC Index: https://stacindex.org/catalogs/datageoadminch#/
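To illustrate the API-style traversal described here, the sketch below follows a collection's "items" link and then any "next" links, instead of looking for "child" links. It assumes the OGC API Features conventions (an "items" relation on the collection, a "features" array and an optional "next" link on each page); these are assumptions and not confirmed for swisstopo in this thread. The names link_href and iter_items are hypothetical, and the example.invalid URLs and pages are made-up test data.

```python
def link_href(obj: dict, rel: str):
    """Return the href of the first link with the given rel, or None."""
    return next(
        (link["href"] for link in obj.get("links", []) if link.get("rel") == rel),
        None,
    )


def iter_items(collection: dict, fetch, max_pages: int = 1):
    """Yield item dicts by following the 'items' link, then 'next' links.

    `fetch` is any callable that takes a URL and returns the parsed JSON dict.
    """
    url = link_href(collection, "items")
    pages = 0
    while url and pages < max_pages:
        page = fetch(url)
        yield from page.get("features", [])
        url = link_href(page, "next")
        pages += 1


# Made-up pages standing in for HTTP responses.
fake_pages = {
    "https://example.invalid/collections/demo/items": {
        "features": [{"id": "a"}, {"id": "b"}],
        "links": [{"rel": "next", "href": "https://example.invalid/collections/demo/items?page=2"}],
    },
    "https://example.invalid/collections/demo/items?page=2": {
        "features": [{"id": "c"}],
        "links": [],
    },
}
demo_collection = {
    "id": "demo",
    "links": [{"rel": "items", "href": "https://example.invalid/collections/demo/items"}],
}

first_page_ids = [item["id"] for item in iter_items(demo_collection, fake_pages.__getitem__)]
all_ids = [item["id"] for item in iter_items(demo_collection, fake_pages.__getitem__, max_pages=5)]
```

Passing the fetch function in as a parameter keeps the traversal logic independent of any HTTP library; the max_pages cap is there only to avoid walking a large collection by accident.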

martinchristen commented 3 years ago

I also tried https://github.com/brazil-data-cube/stac.py and another one I don't remember at the moment, and none of them worked.

We could do some bug reports/issues there.

p1d1d1 commented 3 years ago

IMHO all these STAC tools do not yet support reading collections information from the /collections endpoint (rel='data')

p1d1d1 commented 3 years ago

Both https://github.com/stac-utils/pystac-client and https://github.com/brazil-data-cube/stac.py will soon support reading collections from the /collections endpoint (rel='data'). See:

martinchristen commented 3 years ago

That is great news, thanks. I will create some examples in this repo here.