microsoft / PlanetaryComputer

Issues, discussions, and information about the Microsoft Planetary Computer
https://planetarycomputer.microsoft.com/
MIT License

Searching Sentinel 1 RTC produces different number of results with sort parameter #301

Open PCunninghamML opened 9 months ago

PCunninghamML commented 9 months ago

I'm running a search of Sentinel 1 RTC via URL, filtered by bbox to roughly the lower 48 US states. Without sorting, I get what appears to be a correct response that can be paged via next links to completion. However, when sorting by title, the second page is presented as the end of the dataset (fewer results than the limit param, no next link). I wouldn't expect adding a sort parameter to affect the number of features/items.

Searching "ALOS PALSAR Annual Mosaic" (alos-palsar-mosaic) with the same parameters and sorting appears to produce the expected results.

Thank you for your help.

First call with sorting by title:
https://planetarycomputer.microsoft.com/api/stac/v1/search?collections=sentinel-1-rtc&limit=250&bbox=-128.787231445313%2c22.917922936146%2c-63.7481689453125%2c50.6250730634144&sortby=%2bproperties.title
{
    "type": "FeatureCollection",
    "features": [
        { "snip": "250 items here" }
    ],
    "links": [
        {
            "rel": "next",
            "type": "application/geo+json",
            "method": "GET",
            "href": "https://planetarycomputer.microsoft.com/api/stac/v1/search?collections=sentinel-1-rtc&limit=250&bbox=-128.787231445313,22.917922936146,-63.7481689453125,50.6250730634144&sortby=+properties.title&token=next:sentinel-1-rtc:S1B_IW_GRDH_1SSV_20161012T014253_20161012T014318_002467_004297_rtc"
        },
        {
            "rel": "root",
            "type": "application/json",
            "href": "https://planetarycomputer.microsoft.com/api/stac/v1/"
        },
        {
            "rel": "self",
            "type": "application/json",
            "href": "https://planetarycomputer.microsoft.com/api/stac/v1/search?collections=sentinel-1-rtc&limit=250&bbox=-128.787231445313%2c22.917922936146%2c-63.7481689453125%2c50.6250730634144&sortby=%2bproperties.title"
        }
    ]
}

Second call with sorting, URL taken verbatim from "next" link of first call:
https://planetarycomputer.microsoft.com/api/stac/v1/search?collections=sentinel-1-rtc&limit=250&bbox=-128.787231445313,22.917922936146,-63.7481689453125,50.6250730634144&sortby=+properties.title&token=next:sentinel-1-rtc:S1B_IW_GRDH_1SSV_20161012T014253_20161012T014318_002467_004297_rtc
{
    "type": "FeatureCollection",
    "features": [
        { "snip": "249 items here" }
    ],
    "links": [
        {
            "rel": "previous",
            "type": "application/geo+json",
            "method": "GET",
            "href": "https://planetarycomputer.microsoft.com/api/stac/v1/search?collections=sentinel-1-rtc&limit=250&bbox=-128.787231445313,22.917922936146,-63.7481689453125,50.6250730634144&sortby=+properties.title&token=prev:sentinel-1-rtc:S1B_IW_GRDH_1SSV_20170516T133947_20170516T134012_005624_009D99_rtc"
        },
        {
            "rel": "root",
            "type": "application/json",
            "href": "https://planetarycomputer.microsoft.com/api/stac/v1/"
        },
        {
            "rel": "self",
            "type": "application/json",
            "href": "https://planetarycomputer.microsoft.com/api/stac/v1/search?collections=sentinel-1-rtc&limit=250&bbox=-128.787231445313,22.917922936146,-63.7481689453125,50.6250730634144&sortby=+properties.title&token=next:sentinel-1-rtc:S1B_IW_GRDH_1SSV_20161012T014253_20161012T014318_002467_004297_rtc"
        }
    ]
}

Repeated test without sorting produces expected results.
https://planetarycomputer.microsoft.com/api/stac/v1/search?collections=sentinel-1-rtc&limit=250&bbox=-128.787231445313%2c22.917922936146%2c-63.7481689453125%2c50.6250730634144
{
    "type": "FeatureCollection",
    "features": [
        { "snip": "250 items here" }
    ],
    "links": [
        {
            "rel": "next",
            "type": "application/geo+json",
            "method": "GET",
            "href": "https://planetarycomputer.microsoft.com/api/stac/v1/search?collections=sentinel-1-rtc&limit=250&bbox=-128.787231445313,22.917922936146,-63.7481689453125,50.6250730634144&token=next:sentinel-1-rtc:S1A_IW_GRDH_1SDV_20231201T003639_20231201T003704_051454_0635BF_rtc"
        },
        {
            "rel": "root",
            "type": "application/json",
            "href": "https://planetarycomputer.microsoft.com/api/stac/v1/"
        },
        {
            "rel": "self",
            "type": "application/json",
            "href": "https://planetarycomputer.microsoft.com/api/stac/v1/search?collections=sentinel-1-rtc&limit=250&bbox=-128.787231445313%2c22.917922936146%2c-63.7481689453125%2c50.6250730634144"
        }
    ]
}

Subsequent API calls taken from "next" link of the prior response:
call 2:
https://planetarycomputer.microsoft.com/api/stac/v1/search?collections=sentinel-1-rtc&limit=250&bbox=-128.787231445313,22.917922936146,-63.7481689453125,50.6250730634144&token=next:sentinel-1-rtc:S1A_IW_GRDH_1SDV_20231201T003639_20231201T003704_051454_0635BF_rtc
{
    "type": "FeatureCollection",
    "features": [
        { "snip": "250 items here" }
    ],
    "links": [
        {
            "rel": "next",
            "type": "application/geo+json",
            "method": "GET",
            "href": "https://planetarycomputer.microsoft.com/api/stac/v1/search?collections=sentinel-1-rtc&limit=250&bbox=-128.787231445313,22.917922936146,-63.7481689453125,50.6250730634144&token=next:sentinel-1-rtc:S1A_IW_GRDH_1SDV_20231124T221301_20231124T221326_051365_0632C4_rtc"
        },
        {
            "rel": "previous",
            "type": "application/geo+json",
            "method": "GET",
            "href": "https://planetarycomputer.microsoft.com/api/stac/v1/search?collections=sentinel-1-rtc&limit=250&bbox=-128.787231445313,22.917922936146,-63.7481689453125,50.6250730634144&token=prev:sentinel-1-rtc:S1A_IW_GRDH_1SDV_20231201T003614_20231201T003639_051454_0635BF_rtc"
        },
        {
            "rel": "root",
            "type": "application/json",
            "href": "https://planetarycomputer.microsoft.com/api/stac/v1/"
        },
        {
            "rel": "self",
            "type": "application/json",
            "href": "https://planetarycomputer.microsoft.com/api/stac/v1/search?collections=sentinel-1-rtc&limit=250&bbox=-128.787231445313,22.917922936146,-63.7481689453125,50.6250730634144&token=next:sentinel-1-rtc:S1A_IW_GRDH_1SDV_20231201T003639_20231201T003704_051454_0635BF_rtc"
        }
    ]
}

call 3:
https://planetarycomputer.microsoft.com/api/stac/v1/search?collections=sentinel-1-rtc&limit=250&bbox=-128.787231445313,22.917922936146,-63.7481689453125,50.6250730634144&token=next:sentinel-1-rtc:S1A_IW_GRDH_1SDV_20231124T221301_20231124T221326_051365_0632C4_rtc
{
    "type": "FeatureCollection",
    "features": [
        { "snip": "250 items here" }
    ],
    "links": [
        {
            "rel": "next",
            "type": "application/geo+json",
            "method": "GET",
            "href": "https://planetarycomputer.microsoft.com/api/stac/v1/search?collections=sentinel-1-rtc&limit=250&bbox=-128.787231445313,22.917922936146,-63.7481689453125,50.6250730634144&token=next:sentinel-1-rtc:S1A_IW_GRDH_1SDV_20231119T124040_20231119T124105_051286_06300A_rtc"
        },
        {
            "rel": "previous",
            "type": "application/geo+json",
            "method": "GET",
            "href": "https://planetarycomputer.microsoft.com/api/stac/v1/search?collections=sentinel-1-rtc&limit=250&bbox=-128.787231445313,22.917922936146,-63.7481689453125,50.6250730634144&token=prev:sentinel-1-rtc:S1A_IW_GRDH_1SDV_20231124T221236_20231124T221301_051365_0632C4_rtc"
        },
        {
            "rel": "root",
            "type": "application/json",
            "href": "https://planetarycomputer.microsoft.com/api/stac/v1/"
        },
        {
            "rel": "self",
            "type": "application/json",
            "href": "https://planetarycomputer.microsoft.com/api/stac/v1/search?collections=sentinel-1-rtc&limit=250&bbox=-128.787231445313,22.917922936146,-63.7481689453125,50.6250730634144&token=next:sentinel-1-rtc:S1A_IW_GRDH_1SDV_20231124T221301_20231124T221326_051365_0632C4_rtc"
        }
    ]
}

...and so on for thousands more features/items.
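
The manual paging above could be automated with a short loop that follows each "next" link and tallies the features. This is only a sketch: `fetch_json`, `count_items`, and the `max_pages` guard are hypothetical names introduced here, and it assumes every page has the `features` and `links` layout shown in the responses above.

```python
import json
import urllib.request
from typing import Callable, Optional, Tuple


def fetch_json(url: str) -> dict:
    """Fetch a URL and decode the JSON body (plain stdlib, no retries)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def count_items(
    start_url: str,
    fetch: Callable[[str], dict] = fetch_json,
    max_pages: int = 1000,  # hypothetical safety cap, not an API parameter
) -> Tuple[int, int]:
    """Follow rel="next" links from start_url; return (pages, items)."""
    url: Optional[str] = start_url
    pages = items = 0
    while url is not None and pages < max_pages:
        page = fetch(url)
        pages += 1
        items += len(page.get("features", []))
        # Stop when the response carries no link with rel == "next".
        url = next(
            (link["href"] for link in page.get("links", []) if link.get("rel") == "next"),
            None,
        )
    return pages, items
```

Calling `count_items(...)` once with the sorted search URL and once with the unsorted one should show whether the two walks end after a different number of pages, as described in this report.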

TomAugspurger commented 9 months ago

Thanks for the report, and apologies for not getting back to you earlier.

I thought this was https://github.com/stac-utils/pgstac/issues/177 (which is very similar), but we have the fix for that deployed and are still seeing this issue. We'll look into it and get back to you.

For reference, here's the equivalent reproduction using pystac-client:

import pystac_client
catalog = pystac_client.Client.open("https://planetarycomputer.microsoft.com/api/stac/v1")

a = catalog.search(
    collections="sentinel-1-rtc",
    # limit=250,
    bbox=[-128.787231445313, 22.917922936146, -63.7481689453125, 50.6250730634144],
    # sortby=["properties.title"],
    datetime="2020-01-01/2020-01-15"
)

b = catalog.search(
    collections="sentinel-1-rtc",
    # limit=250,
    bbox=[-128.787231445313, 22.917922936146, -63.7481689453125, 50.6250730634144],
    sortby=["title"],
    datetime="2020-01-01/2020-01-15"
)

aa = a.item_collection()
bb = b.item_collection()
assert len(aa) == len(bb), (len(aa), len(bb))

That gives 943 items without sorting vs. 499 with sorting.
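
To narrow down which items the sorted search drops, one could compare the item IDs of the two result sets rather than only their lengths. The helper below is a hypothetical sketch (`compare_ids` is not part of pystac-client); it assumes item IDs are unique within the collection, so any repeated ID points at a paging problem.

```python
from collections import Counter
from typing import Dict, Iterable, List


def compare_ids(unsorted_ids: Iterable[str], sorted_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Report IDs present in only one result set, plus IDs repeated in the sorted set."""
    a, b = Counter(unsorted_ids), Counter(sorted_ids)
    return {
        "missing_from_sorted": sorted(set(a) - set(b)),
        "missing_from_unsorted": sorted(set(b) - set(a)),
        "duplicated_in_sorted": sorted(i for i, n in b.items() if n > 1),
    }
```

With the collections from the snippet above, this would be called as `compare_ids((i.id for i in aa), (i.id for i in bb))`.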

gadomski commented 8 months ago

Similar/same issue reported on stac-fastapi: https://github.com/stac-utils/stac-fastapi/discussions/620#discussioncomment-7965235

PCunninghamML commented 7 months ago

Has the status on this issue changed? I noticed the linked stac-fastapi comment mentions the issue possibly originating in stac-utils/pgstac but I didn't see any open issues over there describing the same behavior.

TomAugspurger commented 7 months ago

No updates here yet. We'll make an issue on pgstac once we have a reproducer that's isolated to just it (if indeed that's where the problem is).