intake / intake-stac

Intake interface to STAC data catalogs
https://intake-stac.readthedocs.io/en/latest/
BSD 2-Clause "Simplified" License
Slow creation with dynamic STAC catalogs #140

Open TomAugspurger opened 2 years ago

TomAugspurger commented 2 years ago

In https://github.com/intake/intake-stac/blob/b1451497a6ea40b939f340e397fafad082fcc47a/intake_stac/catalog.py#L123, intake-stac recurses into child objects. For large, dynamic STAC catalogs served over a STAC API like https://planetarycomputer.microsoft.com/api/stac/v1, this ends up making many HTTP requests.

I believe that all the necessary information is provided at the https://planetarycomputer.microsoft.com/api/stac/v1/collections endpoint. https://github.com/stac-utils/pystac-client handles all the logic for interacting with STAC APIs efficiently (it has subclasses for pystac.Collection, etc.). Something like

pystac_client.Client.open("https://planetarycomputer.microsoft.com/api/stac/v1/").get_collections()

should efficiently get the child collections for APIs that implement the /collections endpoint (I'm unsure about child catalogs; we don't use them).
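
To make the suggestion a bit more concrete, here is a rough sketch of how the collections could be gathered from the listing instead of by following child links one at a time. The helper names `index_collections` and `open_api` are made up for illustration and are not part of intake-stac or pystac-client; the only library calls assumed are `pystac_client.Client.open` and `get_collections()`, and the `id` attribute on each returned collection. The listing itself may still be paginated by the server, so this is not necessarily a single request.

```python
from typing import Any, Dict, Iterable, Protocol


class SupportsGetCollections(Protocol):
    """Anything exposing get_collections(), e.g. a pystac_client.Client."""

    def get_collections(self) -> Iterable[Any]: ...


def index_collections(client: SupportsGetCollections) -> Dict[str, Any]:
    """Build an id -> collection mapping from the client's collection listing.

    Hypothetical helper: it relies on the listing rather than resolving
    each child link separately.
    """
    return {collection.id: collection for collection in client.get_collections()}


def open_api(url: str) -> SupportsGetCollections:
    """Open a STAC API with pystac-client (hypothetical convenience wrapper)."""
    # Imported lazily so index_collections can be used and tested
    # without pystac-client installed.
    import pystac_client

    return pystac_client.Client.open(url)
```

Usage would then look something like `index_collections(open_api("https://planetarycomputer.microsoft.com/api/stac/v1/"))`, and intake-stac could build its entries from that mapping. How this should interact with the existing recursion for static catalogs is left open here.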

Might have some overlap with https://github.com/intake/intake-stac/issues/66.