Open matthewhanson opened 3 years ago
I have questions. First, would this be enough to support your use case, @matthewhanson?
import pystac_client
from pystac_client import Client
client_a = Client.open("http://stac-api-a.test")
client_b = Client.open("http://stac-api-b.test")
search_a = client_a.search(collections=["foo"], datetime="2023-06-07")
search_b = client_b.search(collections=["bar"], datetime="2023-06-07")
items = search_a.item_collection()
items.extend(search_b.item_collection())
If that's enough, then we just need to add an .extend() method to ItemCollection in pystac.
If that's not enough, I'm at a bit of a loss. Each STAC API tends to be so different that it doesn't seem realistic to, e.g., use the same collection IDs across clients. If you want to re-use the same set of parameters, it's pretty trivial to do this:
query = {
    "datetime": "2023-06-07",
    "bbox": [-73.21, 43.99, -73.12, 44.05],
}
items = client_a.search(collections=["foo"], **query).item_collection()
items.extend(client_b.search(collections=["bar"], **query).item_collection())
@matthewhanson, can you sketch out what you had in mind, if it's more than what I've described?
The important thing here is that if the search specifies a sort order, the combined results should be interleaved according to that order.
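To make the difference concrete, here is a minimal sketch with made-up item dicts (the IDs and datetimes are hypothetical, not from any real API). It assumes each source already returns its results sorted newest first, and that the datetimes share one string format so they compare correctly as strings. Plain concatenation loses the ordering, while heapq.merge from the standard library interleaves the two lists.

```python
import heapq

# Hypothetical results from two APIs, each already sorted newest first.
results_a = [
    {"id": "a1", "properties": {"datetime": "2023-06-07T18:00:00Z"}},
    {"id": "a2", "properties": {"datetime": "2023-06-07T06:00:00Z"}},
]
results_b = [
    {"id": "b1", "properties": {"datetime": "2023-06-07T12:00:00Z"}},
    {"id": "b2", "properties": {"datetime": "2023-06-07T01:00:00Z"}},
]


def by_datetime(item):
    # Assumes every item has properties.datetime in the same string format.
    return item["properties"]["datetime"]


# Concatenation keeps each source's order but not the overall order.
concatenated_ids = [item["id"] for item in results_a + results_b]

# reverse=True tells heapq.merge the inputs are sorted largest first.
merged = heapq.merge(results_a, results_b, key=by_datetime, reverse=True)
merged_ids = [item["id"] for item in merged]

print(concatenated_ids)
print(merged_ids)
```

heapq.merge is lazy, so it only pulls items from each source as they are needed, which matters when the inputs are paged API responses.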
Quick and dirty proof of concept for a federated search that merges records according to their sortby settings.
from pystac_client import Client
import morecantile
import heapq
from functools import reduce, cmp_to_key
# Look up a dotted path such as "properties.datetime" in nested dicts.
dot_get = lambda p, d: reduce(dict.get, p.split('.'), d)

def ogc_sort_func(sorts, a, b, depth=0):
    # cmp-style comparison of two item dicts under a sortby list.
    if depth >= len(sorts):
        # Every sort field compared equal.
        return 0
    sort = sorts[depth]
    field = sort.get('field')
    direction = sort.get('direction', 'asc')
    desc = 1 if direction.lower()[0] == 'd' else -1
    av = dot_get(field, a)
    bv = dot_get(field, b)
    if (av is None and bv is None) or av == bv:
        # Tie on this field: fall through to the next sort field.
        return ogc_sort_func(sorts, a, b, depth=depth + 1)
    elif av is None:
        out = -1
    elif bv is None:
        out = 1
    elif av < bv:
        out = 1
    else:
        out = -1
    return desc * out

tms = morecantile.tms.get("WebMercatorQuad")
x, y, z = tms.tile(-93,45,5)
bbox = list(tms.bounds(morecantile.Tile(x, y, z)))
print(bbox)
sortby = [{"field":"properties.datetime","direction":"desc"},{"field":"id","direction":"desc"}]
datetime=["2020-10-10","2020-10-10T18:00:00Z"]
catalog = Client.open('https://planetarycomputer.microsoft.com/api/stac/v1')
results = catalog.search(
    limit=100,
    max_items=1000,
    bbox=bbox,
    collections=["naip"],
    datetime=datetime,
    sortby=sortby
)
a = results.items_as_dicts()
results = catalog.search(
    limit=100,
    max_items=1000,
    bbox=bbox,
    datetime=datetime,
    collections=["landsat-c2-l2"],
    sortby=sortby
)
b = results.items_as_dicts()
results = catalog.search(
    limit=100,
    max_items=1000,
    bbox=bbox,
    datetime=datetime,
    collections=["sentinel-2-l2a"],
    sortby=sortby
)
c = results.items_as_dicts()
keyfunc = lambda l, r: ogc_sort_func(sortby, l, r)
print('merging')
g=heapq.merge(a,b,c, key=cmp_to_key(keyfunc))
print('cycling')
for i in range(100):
    row = next(g)
    print(dot_get('properties.datetime', row), row.get('id'), row.get('collection'))
For that, I did the sorting on the items as plain dicts. If we were to actually implement this, we could work with Item classes instead and either create a new subclass or monkeypatch an __lt__ method onto Item.
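A rough sketch of the __lt__ idea, using a hypothetical SortableItem wrapper around item dicts rather than a real pystac subclass. It assumes every sort field is present on every item (no None handling) and that each input is already sorted by the same sortby. Once the wrapper defines __lt__, heapq.merge needs no key function.

```python
import heapq
from functools import reduce


def dot_get(path, item):
    # Look up a dotted path such as "properties.datetime" in nested dicts.
    return reduce(dict.get, path.split("."), item)


class SortableItem:
    """Hypothetical wrapper that orders an item dict by a sortby list."""

    def __init__(self, item, sortby):
        self.item = item
        self.sortby = sortby

    def __lt__(self, other):
        for sort in self.sortby:
            mine = dot_get(sort["field"], self.item)
            theirs = dot_get(sort["field"], other.item)
            if mine == theirs:
                continue  # tie: fall through to the next sort field
            descending = sort.get("direction", "asc").lower().startswith("d")
            # "Less than" here means "comes earlier in the results".
            return mine > theirs if descending else mine < theirs
        return False  # equal on every sort field


sortby = [
    {"field": "properties.datetime", "direction": "desc"},
    {"field": "id", "direction": "desc"},
]
# Made-up items, each list already sorted by sortby.
a = [
    {"id": "a2", "properties": {"datetime": "2020-10-10T12:00:00Z"}},
    {"id": "a1", "properties": {"datetime": "2020-10-10T08:00:00Z"}},
]
b = [
    {"id": "b1", "properties": {"datetime": "2020-10-10T12:00:00Z"}},
    {"id": "b0", "properties": {"datetime": "2020-10-10T09:00:00Z"}},
]

wrapped = [[SortableItem(item, sortby) for item in source] for source in (a, b)]
merged_ids = [w.item["id"] for w in heapq.merge(*wrapped)]
print(merged_ids)
```

The wrapper keeps the ordering logic in one place, at the cost of unwrapping the items afterwards. A subclass or a monkeypatch would avoid the unwrapping but ties the ordering to a single sortby per class.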
A big advantage of STAC is being able to use data from multiple sources. It would be a nice feature to be able to search multiple STAC endpoints and combine the results into a single FeatureCollection.
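As a starting point, combining results into one FeatureCollection could look something like the sketch below. It works on plain GeoJSON dicts, and combine_feature_collections is a hypothetical helper, not part of pystac or pystac-client. Dropping repeats by (collection, id) is my assumption about what a caller would want when two endpoints mirror the same data. It does not handle sort order.

```python
def combine_feature_collections(*feature_collections):
    """Concatenate GeoJSON FeatureCollections, keeping the first copy of each item."""
    seen = set()
    features = []
    for feature_collection in feature_collections:
        for feature in feature_collection.get("features", []):
            # Assumption: (collection, id) identifies an item across endpoints.
            key = (feature.get("collection"), feature.get("id"))
            if key in seen:
                continue
            seen.add(key)
            features.append(feature)
    return {"type": "FeatureCollection", "features": features}


# Made-up responses from two endpoints that share one item.
from_a = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "id": "item-1", "collection": "foo"},
        {"type": "Feature", "id": "item-2", "collection": "foo"},
    ],
}
from_b = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "id": "item-2", "collection": "foo"},
        {"type": "Feature", "id": "item-1", "collection": "bar"},
    ],
}

combined = combine_feature_collections(from_a, from_b)
combined_keys = [(f["collection"], f["id"]) for f in combined["features"]]
print(combined_keys)
```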