Element84 / earth-search

Earth Search information and issue tracking
https://earth-search.aws.element84.com/v1

Missing `next` token when passed `sortby` #27

Closed (gadomski closed this issue 7 months ago)

gadomski commented 7 months ago

The issue is described in detail in https://github.com/stac-utils/pystac-client/issues/629#issuecomment-1893849759, but tl;dr: it looks like paging is broken if a sortby parameter is passed. I've checked multiple sortby keys (properties.datetime and properties.eo:cloud_cover) and multiple collections (sentinel-2-l2a and sentinel-2-c1-l2a).

philvarner commented 7 months ago

https://github.com/stac-utils/stac-server/issues/608 ?

gadomski commented 7 months ago

Don't think so, because the items look like they have datetimes. Script:

from itertools import islice

from pystac_client import Client, ItemSearch

def summarize(item_search: ItemSearch) -> None:
    for i, page in enumerate(islice(item_search.pages_as_dicts(), 2)):
        print(f"Page {i}")
        next_link = next(link for link in page["links"] if link["rel"] == "next")
        next_token = next_link["body"].get("next")
        print(f"- Next token: {next_token}")
        item = page["features"][0]
        print(f"- First item datetime: {item['properties']['datetime']}")
        for item in page["features"]:
            datetime = item["properties"].get("datetime")
            if not datetime:
                print(f"Item with id={item['id']} is missing a datetime!!")
        print()

client = Client.open("https://earth-search.aws.element84.com/v1/")
intersects = {"type": "Point", "coordinates": [-105.1019, 40.1672]}
item_search_without_sortby = client.search(
    collections=["sentinel-2-l2a"],
    intersects=intersects,
)
item_search_with_sortby = client.search(
    collections=["sentinel-2-l2a"],
    intersects=intersects,
    sortby="properties.eo:cloud_cover",
)

print("=== Without sortby ===")
summarize(item_search_without_sortby)

print("\n=== With sortby ===")
summarize(item_search_with_sortby)

Output:

=== Without sortby ===
Page 0
- Next token: 2023-12-17T18:02:51.569000Z,S2A_13TDE_20231217_0_L2A,sentinel-2-l2a
- First item datetime: 2024-01-08T17:52:57.926000Z

Page 1
- Next token: 2023-11-24T17:52:54.004000Z,S2A_13TDE_20231124_0_L2A,sentinel-2-l2a
- First item datetime: 2023-12-14T17:52:54.900000Z

=== With sortby ===
Page 0
- Next token: None
- First item datetime: 2021-05-06T18:02:52.913000Z

Page 1
- Next token: None
- First item datetime: 2021-05-06T18:02:52.913000Z

philvarner commented 7 months ago

Resolved. Running the code above now gives:

Without sortby
Next token: 2023-12-29T17:53:00.297000Z,S2B_13TDE_20231229_0_L2A,sentinel-2-l2a
Next token: 2023-12-04T17:52:52.234000Z,S2A_13TDE_20231204_0_L2A,sentinel-2-l2a

With sortby
Next token: 0.001052
Next token: 0.005294
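
For anyone re-running this check offline: the script above uses `next(...)` without a default, so it would raise `StopIteration` if a page had no `next` link at all. A more defensive lookup might look like the sketch below. It assumes POST-style links that carry the token in `link["body"]["next"]`, as the script above does. The helper name `extract_next_token` and the sample page dicts are hypothetical and are not part of pystac-client or Earth Search.

```python
from typing import Any, Optional


def extract_next_token(page: dict) -> Optional[Any]:
    """Return the paging token from a page's POST-style `next` link, or None.

    None is returned both when there is no `next` link and when the link's
    body carries no `next` key (the symptom reported in this issue).
    """
    for link in page.get("links", []):
        if link.get("rel") == "next":
            # POST-style links carry the paging token in the request body.
            body = link.get("body") or {}
            return body.get("next")
    return None


# Hypothetical pages, shaped like the dicts the script above iterates over.
page_with_token = {
    "links": [
        {
            "rel": "next",
            "body": {
                "next": "2023-12-17T18:02:51.569000Z,S2A_13TDE_20231217_0_L2A,sentinel-2-l2a"
            },
        }
    ]
}
# A `next` link whose body has no token, as in the "With sortby" output.
page_without_token = {"links": [{"rel": "next", "body": {}}]}

print(extract_next_token(page_with_token))
print(extract_next_token(page_without_token))
```

A `None` result on a page that still has more items is the behavior this issue reported for `sortby` searches.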