Closed ben-epoch-blue closed 5 days ago
This is pretty easy to do with the current API:
item_collections = []
for collection in ("datasetA", "datasetB", "datasetC"):
    item_collections.append(catalog.search(collections=collection, bbox=bbox).item_collection())
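The loop above can also be wrapped so that each result stays attached to its collection ID. This is only a minimal sketch: `search_per_collection` is a hypothetical helper name, not part of pystac-client, and it assumes `catalog` is an already-opened `pystac_client.Client` exposing the same `search(...).item_collection()` calls used in the loop.

```python
def search_per_collection(catalog, collection_ids, **search_kwargs):
    """Run one search per collection and return {collection_id: result}.

    Hypothetical helper: `catalog` is assumed to be a pystac_client.Client
    (or anything with a compatible `search` method). Extra keyword
    arguments such as `bbox` are passed through to `catalog.search`.
    """
    results = {}
    for collection_id in collection_ids:
        # One request per collection, so results never get mixed together.
        search = catalog.search(collections=[collection_id], **search_kwargs)
        results[collection_id] = search.item_collection()
    return results
```

With a real client this would be called as `results = search_per_collection(catalog, ["datasetA", "datasetB", "datasetC"], bbox=bbox)`, after which `results["datasetA"]` holds only the items from that collection. The trade-off is one request per collection instead of a single combined request.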
Client.search is meant to represent a single interaction with the /search endpoint of a STAC API, and the output of pystac-client (a single flattened list) directly reflects the output of a spec-compliant STAC API server: https://github.com/radiantearth/stac-api-spec/tree/release/v1.0.0/item-search#response. Specifically:
The response to a request (GET or POST) to the search endpoint must always be an ItemCollection object - a valid GeoJSON FeatureCollection that consists entirely of STAC Item objects.
Note that the spec does not allow for a list of ItemCollections (one per collection), which is what you're describing.
This is the output of a search on multiple datasets:
source = catalog.search(collections=["nasadem", 'esa-worldcover', "jrc-gsw"], bbox=Point([..., ...]).buffer(900*30/113200, cap_style=3).bounds).item_collection()
type "FeatureCollection"
features[] 7 items
0
type "Feature"
stac_version "1.0.0"
id "ESA_WorldCover_10m_2021_v200_N03E099"
properties
geometry
links[] 5 items
assets
bbox[] 4 items
stac_extensions[] 4 items
collection "esa-worldcover"
1
type "Feature"
stac_version "1.0.0"
id "ESA_WorldCover_10m_2021_v200_N00E099"
properties
geometry
links[] 5 items
assets
bbox[] 4 items
stac_extensions[] 4 items
collection "esa-worldcover"
2
type "Feature"
stac_version "1.0.0"
id "ESA_WorldCover_10m_2020_v100_N03E099"
properties
geometry
links[] 5 items
assets
bbox[] 4 items
stac_extensions[] 4 items
collection "esa-worldcover"
...
There are 4 images for ESA WorldCover, 2 images for NASADEM, and 1 image for JRC-GSW. All 3 datasets are output into a single FeatureCollection rather than 3 distinct FeatureCollection objects - can you confirm if this is intended?
If so, is there a way to differentiate between the different datasets returned from the search?
Yes, that's intended.
You can use itertools.groupby to group by collection (the items must be sorted by the same key first):
import itertools
items = sorted(search.item_collection(), key=lambda item: item.collection_id)
by_collection = {k: list(v) for k, v in itertools.groupby(items, key=lambda item: item.collection_id)}
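To get from the grouped result to the `a, b, c = ...` unpacking asked about in this thread, the groups need to come back in the order the collections were requested, with an empty list for any collection that returned nothing. Here is a rough sketch of that idea. `group_by_collection` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for `pystac.Item` objects; the only thing assumed about a real item is its `collection_id` attribute, as used above.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_collection(items, collection_ids):
    """Return one list of items per requested collection ID, in request order."""
    groups = defaultdict(list)
    for item in items:
        groups[item.collection_id].append(item)
    # .get() avoids creating entries; missing collections yield an empty list.
    return [groups.get(collection_id, []) for collection_id in collection_ids]


# Stand-in items (assumption: real pystac Items expose `collection_id` the same way).
items = [
    SimpleNamespace(id="a1", collection_id="datasetA"),
    SimpleNamespace(id="b1", collection_id="datasetB"),
    SimpleNamespace(id="a2", collection_id="datasetA"),
    SimpleNamespace(id="c1", collection_id="datasetC"),
    SimpleNamespace(id="a3", collection_id="datasetA"),
    SimpleNamespace(id="c2", collection_id="datasetC"),
]

a, b, c = group_by_collection(items, ["datasetA", "datasetB", "datasetC"])
```

Because the helper always returns exactly one list per requested ID, the unpacking works no matter how many items each collection contributes. Unlike the `itertools.groupby` version, it does not require sorting first.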
I am trying to search multiple collections at once and assign the results to variables:
a, b, c = catalog.search(collections=["datasetA", "datasetB", "datasetC"], bbox=bbox).item_collection()
However, when bbox intersects multiple images, multiple values are returned. The output I would like is this:
a, b, c = [[a1, a2, a3], [b1], [c1, c2]]
But the current output is a flattened list which cannot be predictably unpacked:
a, b, c = a1, a2, a3, b1, c1, c2
--> Error