Healy-Hyperspatial / stac-fastapi-mongo

MongoDB backend for stac-fastapi built on the stac-fastapi-elasticsearch core api library.
MIT License

Fix pagination with mongo #1

Closed by jonhealy1 6 months ago

jonhealy1 commented 7 months ago

Some tests in test_item.py are skipped because pagination in MongoDB is not working properly.

pedro-cf commented 6 months ago

Do you have any directions on how to approach this bugfix?

jonhealy1 commented 6 months ago

I made a small fix, so pagination should have some functionality now.

If you query a route such as: http://localhost:8084/collections/test-collection/items?limit=1

You should see a "next" link in links that leads to the next page of results:

  "links": [
        {
            "rel": "next",
            "type": "application/json",
            "method": "GET",
            "href": "http://localhost:8084/collections/test-collection/items?token=UzJCXzFDQ1ZfMjAxODEyMjNfMF9MMkE="
        },
        {
            "rel": "root",
            "type": "application/json",
            "href": "http://localhost:8084/"
        },
        {
            "rel": "self",
            "type": "application/json",
            "href": "http://localhost:8084/collections/test-collection/items"
        }
    ],
    "context": {
        "returned": 10,
        "limit": 10,
        "matched": 100
    }
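
A client can pick the "next" link out of a response like this one and pull the token from its href. The sketch below is a minimal illustration, not code from the repository: the helper names find_next_link and token_from_href are hypothetical, and treating the token as a base64-encoded item id is an assumption based on what this particular token decodes to.

```python
import base64
from urllib.parse import parse_qs, urlparse

# Trimmed copy of the "links" array shown above.
response = {
    "links": [
        {
            "rel": "next",
            "type": "application/json",
            "method": "GET",
            "href": "http://localhost:8084/collections/test-collection/items"
                    "?token=UzJCXzFDQ1ZfMjAxODEyMjNfMF9MMkE=",
        },
        {"rel": "root", "type": "application/json", "href": "http://localhost:8084/"},
    ]
}


def find_next_link(body):
    """Return the first link with rel == "next", or None on the last page."""
    return next((l for l in body.get("links", []) if l.get("rel") == "next"), None)


def token_from_href(href):
    """Extract the token query parameter from a GET "next" link."""
    values = parse_qs(urlparse(href).query).get("token")
    return values[0] if values else None


link = find_next_link(response)
token = token_from_href(link["href"])
# Assumption: the token is plain base64 of an item id.
decoded = base64.b64decode(token).decode("utf-8")
print(decoded)  # S2B_1CCV_20181223_0_L2A
```
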
jonhealy1 commented 6 months ago

Something about how Elasticsearch paginates with tokens didn't match the Mongo way of doing things. I can't remember exactly what now.

jonhealy1 commented 6 months ago

If you look here: https://github.com/Healy-Hyperspatial/stac-fastapi-mongo/blob/6ea2060b7c048e04d15b774ec4d7f4a94f07ce52/stac_fastapi/mongo/database_logic.py#L520-L581

pedro-cf commented 6 months ago

Hi, I did some tests with commit e59c421c816abe69ced82b472e3b1b837bf6495f

1. Created 1 collection (test-collection):

curl --request POST --url http://localhost:8084/collections --header 'Content-Type: application/json' --data '{"id": "test-collection", "stac_extensions": ["https://stac-extensions.github.io/eo/v1.0.0/schema.json"], "type": "Collection", "description": "Landat 8 imagery radiometrically calibrated and orthorectified using gound points and Digital Elevation Model (DEM) data to correct relief displacement.", "stac_version": "1.0.0", "summaries": {"platform": ["landsat-8"], "instruments": ["oli", "tirs"], "gsd": [30]}, "extent": {"spatial": {"bbox": [[-180.0, -90.0, 180.0, 90.0]]}, "temporal": {"interval": [["2013-06-01", null]]}}, "links": [{"href": "http://localhost:8081/collections/landsat-8-l1", "rel": "self", "type": "application/json"}, {"href": "http://localhost:8081/", "rel": "parent", "type": "application/json"}, {"href": "http://localhost:8081/collections/landsat-8-l1/items", "rel": "item", "type": "application/geo+json"}, {"href": "http://localhost:8081/", "rel": "root", "type": "application/json"}], "title": "Landsat 8 L1", "keywords": ["landsat", "earth observation", "usgs"]}'

2. Created 9 Items (test-item1...9):

curl --request POST --url http://localhost:8084/collections/test-collection/items --header 'Content-Type: application/json' --data '{"type": "Feature", "id": "test-item1", "stac_version": "1.0.0", "stac_extensions": ["https://stac-extensions.github.io/eo/v1.0.0/schema.json", "https://stac-extensions.github.io/projection/v1.0.0/schema.json"], "geometry": {"coordinates": [[[152.15052873427666, -33.82243006904891], [150.1000346138806, -34.257132625788756], [149.5776607193635, -32.514709769700254], [151.6262528041627, -32.08081674221862], [152.15052873427666, -33.82243006904891]]], "type": "Polygon"}, "properties": {"datetime": "2018-02-12T12:30:22Z", "landsat:scene_id": "LC82081612020043LGN00", "landsat:row": "161", "gsd": 15, "landsat:revision": "00", "view:sun_azimuth": -148.83296771, "instrument": "OLI_TIRS", "landsat:product_id": "LC08_L1GT_208161_20200212_20200212_01_RT", "eo:cloud_cover": 0, "landsat:tier": "RT", "landsat:processing_level": "L1GT", "landsat:column": "208", "platform": "landsat-8", "proj:epsg": 32756, "view:sun_elevation": -37.30791534, "view:off_nadir": 0, "height": 2500, "width": 2500}, "bbox": [149.57574, -34.25796, 152.15194, -32.07915], "collection": "test-collection", "assets": {}, "links": [{"href": "http://localhost:8081/collections/landsat-8-l1/items/LC82081612020043", "rel": "self", "type": "application/geo+json"}, {"href": "http://localhost:8081/collections/landsat-8-l1", "rel": "parent", "type": "application/json"}, {"href": "http://localhost:8081/collections/landsat-8-l1", "rel": "collection", "type": "application/json"}, {"href": "http://localhost:8081/", "rel": "root", "type": "application/json"}]}'

curl --request POST --url http://localhost:8084/collections/test-collection/items --header 'Content-Type: application/json' --data '{"type": "Feature", "id": "test-item2", "stac_version": "1.0.0", "stac_extensions": ["https://stac-extensions.github.io/eo/v1.0.0/schema.json", "https://stac-extensions.github.io/projection/v1.0.0/schema.json"], "geometry": {"coordinates": [[[152.15052873427666, -33.82243006904891], [150.1000346138806, -34.257132625788756], [149.5776607193635, -32.514709769700254], [151.6262528041627, -32.08081674221862], [152.15052873427666, -33.82243006904891]]], "type": "Polygon"}, "properties": {"datetime": "2018-02-12T12:30:22Z", "landsat:scene_id": "LC82081612020043LGN00", "landsat:row": "161", "gsd": 15, "landsat:revision": "00", "view:sun_azimuth": -148.83296771, "instrument": "OLI_TIRS", "landsat:product_id": "LC08_L1GT_208161_20200212_20200212_01_RT", "eo:cloud_cover": 0, "landsat:tier": "RT", "landsat:processing_level": "L1GT", "landsat:column": "208", "platform": "landsat-8", "proj:epsg": 32756, "view:sun_elevation": -37.30791534, "view:off_nadir": 0, "height": 2500, "width": 2500}, "bbox": [149.57574, -34.25796, 152.15194, -32.07915], "collection": "test-collection", "assets": {}, "links": [{"href": "http://localhost:8081/collections/landsat-8-l1/items/LC82081612020043", "rel": "self", "type": "application/geo+json"}, {"href": "http://localhost:8081/collections/landsat-8-l1", "rel": "parent", "type": "application/json"}, {"href": "http://localhost:8081/collections/landsat-8-l1", "rel": "collection", "type": "application/json"}, {"href": "http://localhost:8081/", "rel": "root", "type": "application/json"}]}'

curl --request POST --url http://localhost:8084/collections/test-collection/items --header 'Content-Type: application/json' --data '{"type": "Feature", "id": "test-item3", "stac_version": "1.0.0", "stac_extensions": ["https://stac-extensions.github.io/eo/v1.0.0/schema.json", "https://stac-extensions.github.io/projection/v1.0.0/schema.json"], "geometry": {"coordinates": [[[152.15052873427666, -33.82243006904891], [150.1000346138806, -34.257132625788756], [149.5776607193635, -32.514709769700254], [151.6262528041627, -32.08081674221862], [152.15052873427666, -33.82243006904891]]], "type": "Polygon"}, "properties": {"datetime": "2018-02-12T12:30:22Z", "landsat:scene_id": "LC82081612020043LGN00", "landsat:row": "161", "gsd": 15, "landsat:revision": "00", "view:sun_azimuth": -148.83296771, "instrument": "OLI_TIRS", "landsat:product_id": "LC08_L1GT_208161_20200212_20200212_01_RT", "eo:cloud_cover": 0, "landsat:tier": "RT", "landsat:processing_level": "L1GT", "landsat:column": "208", "platform": "landsat-8", "proj:epsg": 32756, "view:sun_elevation": -37.30791534, "view:off_nadir": 0, "height": 2500, "width": 2500}, "bbox": [149.57574, -34.25796, 152.15194, -32.07915], "collection": "test-collection", "assets": {}, "links": [{"href": "http://localhost:8081/collections/landsat-8-l1/items/LC82081612020043", "rel": "self", "type": "application/geo+json"}, {"href": "http://localhost:8081/collections/landsat-8-l1", "rel": "parent", "type": "application/json"}, {"href": "http://localhost:8081/collections/landsat-8-l1", "rel": "collection", "type": "application/json"}, {"href": "http://localhost:8081/", "rel": "root", "type": "application/json"}]}'

curl --request POST --url http://localhost:8084/collections/test-collection/items --header 'Content-Type: application/json' --data '{"type": "Feature", "id": "test-item4", "stac_version": "1.0.0", "stac_extensions": ["https://stac-extensions.github.io/eo/v1.0.0/schema.json", "https://stac-extensions.github.io/projection/v1.0.0/schema.json"], "geometry": {"coordinates": [[[152.15052873427666, -33.82243006904891], [150.1000346138806, -34.257132625788756], [149.5776607193635, -32.514709769700254], [151.6262528041627, -32.08081674221862], [152.15052873427666, -33.82243006904891]]], "type": "Polygon"}, "properties": {"datetime": "2018-02-12T12:30:22Z", "landsat:scene_id": "LC82081612020043LGN00", "landsat:row": "161", "gsd": 15, "landsat:revision": "00", "view:sun_azimuth": -148.83296771, "instrument": "OLI_TIRS", "landsat:product_id": "LC08_L1GT_208161_20200212_20200212_01_RT", "eo:cloud_cover": 0, "landsat:tier": "RT", "landsat:processing_level": "L1GT", "landsat:column": "208", "platform": "landsat-8", "proj:epsg": 32756, "view:sun_elevation": -37.30791534, "view:off_nadir": 0, "height": 2500, "width": 2500}, "bbox": [149.57574, -34.25796, 152.15194, -32.07915], "collection": "test-collection", "assets": {}, "links": [{"href": "http://localhost:8081/collections/landsat-8-l1/items/LC82081612020043", "rel": "self", "type": "application/geo+json"}, {"href": "http://localhost:8081/collections/landsat-8-l1", "rel": "parent", "type": "application/json"}, {"href": "http://localhost:8081/collections/landsat-8-l1", "rel": "collection", "type": "application/json"}, {"href": "http://localhost:8081/", "rel": "root", "type": "application/json"}]}'

curl --request POST --url http://localhost:8084/collections/test-collection/items --header 'Content-Type: application/json' --data '{"type": "Feature", "id": "test-item5", "stac_version": "1.0.0", "stac_extensions": ["https://stac-extensions.github.io/eo/v1.0.0/schema.json", "https://stac-extensions.github.io/projection/v1.0.0/schema.json"], "geometry": {"coordinates": [[[152.15052873427666, -33.82243006904891], [150.1000346138806, -34.257132625788756], [149.5776607193635, -32.514709769700254], [151.6262528041627, -32.08081674221862], [152.15052873427666, -33.82243006904891]]], "type": "Polygon"}, "properties": {"datetime": "2018-02-12T12:30:22Z", "landsat:scene_id": "LC82081612020043LGN00", "landsat:row": "161", "gsd": 15, "landsat:revision": "00", "view:sun_azimuth": -148.83296771, "instrument": "OLI_TIRS", "landsat:product_id": "LC08_L1GT_208161_20200212_20200212_01_RT", "eo:cloud_cover": 0, "landsat:tier": "RT", "landsat:processing_level": "L1GT", "landsat:column": "208", "platform": "landsat-8", "proj:epsg": 32756, "view:sun_elevation": -37.30791534, "view:off_nadir": 0, "height": 2500, "width": 2500}, "bbox": [149.57574, -34.25796, 152.15194, -32.07915], "collection": "test-collection", "assets": {}, "links": [{"href": "http://localhost:8081/collections/landsat-8-l1/items/LC82081612020043", "rel": "self", "type": "application/geo+json"}, {"href": "http://localhost:8081/collections/landsat-8-l1", "rel": "parent", "type": "application/json"}, {"href": "http://localhost:8081/collections/landsat-8-l1", "rel": "collection", "type": "application/json"}, {"href": "http://localhost:8081/", "rel": "root", "type": "application/json"}]}'

curl --request POST --url http://localhost:8084/collections/test-collection/items --header 'Content-Type: application/json' --data '{"type": "Feature", "id": "test-item6", "stac_version": "1.0.0", "stac_extensions": ["https://stac-extensions.github.io/eo/v1.0.0/schema.json", "https://stac-extensions.github.io/projection/v1.0.0/schema.json"], "geometry": {"coordinates": [[[152.15052873427666, -33.82243006904891], [150.1000346138806, -34.257132625788756], [149.5776607193635, -32.514709769700254], [151.6262528041627, -32.08081674221862], [152.15052873427666, -33.82243006904891]]], "type": "Polygon"}, "properties": {"datetime": "2018-02-12T12:30:22Z", "landsat:scene_id": "LC82081612020043LGN00", "landsat:row": "161", "gsd": 15, "landsat:revision": "00", "view:sun_azimuth": -148.83296771, "instrument": "OLI_TIRS", "landsat:product_id": "LC08_L1GT_208161_20200212_20200212_01_RT", "eo:cloud_cover": 0, "landsat:tier": "RT", "landsat:processing_level": "L1GT", "landsat:column": "208", "platform": "landsat-8", "proj:epsg": 32756, "view:sun_elevation": -37.30791534, "view:off_nadir": 0, "height": 2500, "width": 2500}, "bbox": [149.57574, -34.25796, 152.15194, -32.07915], "collection": "test-collection", "assets": {}, "links": [{"href": "http://localhost:8081/collections/landsat-8-l1/items/LC82081612020043", "rel": "self", "type": "application/geo+json"}, {"href": "http://localhost:8081/collections/landsat-8-l1", "rel": "parent", "type": "application/json"}, {"href": "http://localhost:8081/collections/landsat-8-l1", "rel": "collection", "type": "application/json"}, {"href": "http://localhost:8081/", "rel": "root", "type": "application/json"}]}'

curl --request POST --url http://localhost:8084/collections/test-collection/items --header 'Content-Type: application/json' --data '{"type": "Feature", "id": "test-item7", "stac_version": "1.0.0", "stac_extensions": ["https://stac-extensions.github.io/eo/v1.0.0/schema.json", "https://stac-extensions.github.io/projection/v1.0.0/schema.json"], "geometry": {"coordinates": [[[152.15052873427666, -33.82243006904891], [150.1000346138806, -34.257132625788756], [149.5776607193635, -32.514709769700254], [151.6262528041627, -32.08081674221862], [152.15052873427666, -33.82243006904891]]], "type": "Polygon"}, "properties": {"datetime": "2018-02-12T12:30:22Z", "landsat:scene_id": "LC82081612020043LGN00", "landsat:row": "161", "gsd": 15, "landsat:revision": "00", "view:sun_azimuth": -148.83296771, "instrument": "OLI_TIRS", "landsat:product_id": "LC08_L1GT_208161_20200212_20200212_01_RT", "eo:cloud_cover": 0, "landsat:tier": "RT", "landsat:processing_level": "L1GT", "landsat:column": "208", "platform": "landsat-8", "proj:epsg": 32756, "view:sun_elevation": -37.30791534, "view:off_nadir": 0, "height": 2500, "width": 2500}, "bbox": [149.57574, -34.25796, 152.15194, -32.07915], "collection": "test-collection", "assets": {}, "links": [{"href": "http://localhost:8081/collections/landsat-8-l1/items/LC82081612020043", "rel": "self", "type": "application/geo+json"}, {"href": "http://localhost:8081/collections/landsat-8-l1", "rel": "parent", "type": "application/json"}, {"href": "http://localhost:8081/collections/landsat-8-l1", "rel": "collection", "type": "application/json"}, {"href": "http://localhost:8081/", "rel": "root", "type": "application/json"}]}'

curl --request POST --url http://localhost:8084/collections/test-collection/items --header 'Content-Type: application/json' --data '{"type": "Feature", "id": "test-item8", "stac_version": "1.0.0", "stac_extensions": ["https://stac-extensions.github.io/eo/v1.0.0/schema.json", "https://stac-extensions.github.io/projection/v1.0.0/schema.json"], "geometry": {"coordinates": [[[152.15052873427666, -33.82243006904891], [150.1000346138806, -34.257132625788756], [149.5776607193635, -32.514709769700254], [151.6262528041627, -32.08081674221862], [152.15052873427666, -33.82243006904891]]], "type": "Polygon"}, "properties": {"datetime": "2018-02-12T12:30:22Z", "landsat:scene_id": "LC82081612020043LGN00", "landsat:row": "161", "gsd": 15, "landsat:revision": "00", "view:sun_azimuth": -148.83296771, "instrument": "OLI_TIRS", "landsat:product_id": "LC08_L1GT_208161_20200212_20200212_01_RT", "eo:cloud_cover": 0, "landsat:tier": "RT", "landsat:processing_level": "L1GT", "landsat:column": "208", "platform": "landsat-8", "proj:epsg": 32756, "view:sun_elevation": -37.30791534, "view:off_nadir": 0, "height": 2500, "width": 2500}, "bbox": [149.57574, -34.25796, 152.15194, -32.07915], "collection": "test-collection", "assets": {}, "links": [{"href": "http://localhost:8081/collections/landsat-8-l1/items/LC82081612020043", "rel": "self", "type": "application/geo+json"}, {"href": "http://localhost:8081/collections/landsat-8-l1", "rel": "parent", "type": "application/json"}, {"href": "http://localhost:8081/collections/landsat-8-l1", "rel": "collection", "type": "application/json"}, {"href": "http://localhost:8081/", "rel": "root", "type": "application/json"}]}'

curl --request POST --url http://localhost:8084/collections/test-collection/items --header 'Content-Type: application/json' --data '{"type": "Feature", "id": "test-item9", "stac_version": "1.0.0", "stac_extensions": ["https://stac-extensions.github.io/eo/v1.0.0/schema.json", "https://stac-extensions.github.io/projection/v1.0.0/schema.json"], "geometry": {"coordinates": [[[152.15052873427666, -33.82243006904891], [150.1000346138806, -34.257132625788756], [149.5776607193635, -32.514709769700254], [151.6262528041627, -32.08081674221862], [152.15052873427666, -33.82243006904891]]], "type": "Polygon"}, "properties": {"datetime": "2018-02-12T12:30:22Z", "landsat:scene_id": "LC82081612020043LGN00", "landsat:row": "161", "gsd": 15, "landsat:revision": "00", "view:sun_azimuth": -148.83296771, "instrument": "OLI_TIRS", "landsat:product_id": "LC08_L1GT_208161_20200212_20200212_01_RT", "eo:cloud_cover": 0, "landsat:tier": "RT", "landsat:processing_level": "L1GT", "landsat:column": "208", "platform": "landsat-8", "proj:epsg": 32756, "view:sun_elevation": -37.30791534, "view:off_nadir": 0, "height": 2500, "width": 2500}, "bbox": [149.57574, -34.25796, 152.15194, -32.07915], "collection": "test-collection", "assets": {}, "links": [{"href": "http://localhost:8081/collections/landsat-8-l1/items/LC82081612020043", "rel": "self", "type": "application/geo+json"}, {"href": "http://localhost:8081/collections/landsat-8-l1", "rel": "parent", "type": "application/json"}, {"href": "http://localhost:8081/collections/landsat-8-l1", "rel": "collection", "type": "application/json"}, {"href": "http://localhost:8081/", "rel": "root", "type": "application/json"}]}'

3. Ran this search query:

curl --request POST \
  --url http://localhost:8084/search \
  --header 'Content-Type: application/json' \
  --data '{
    "limit": 2,
    "fields": {
        "exclude": ["geometry", "links", "assets", "type", "bbox", "properties", "stac_version", "collection"]
    },
    "sortby": [
        {
            "field": "id",
            "direction": "asc"
        }
    ]
}'

Response was:

{
    "type": "FeatureCollection",
    "features": [
        {
            "id": "test-item1"
        },
        {
            "id": "test-item2"
        }
    ],
    "links": [
        {
            "rel": "next",
            "type": "application/json",
            "method": "POST",
            "href": "http://localhost:8084/search",
            "body": {
                "limit": 2,
                "fields": {
                    "exclude": [
                        "geometry",
                        "links",
                        "assets",
                        "type",
                        "bbox",
                        "properties",
                        "stac_version",
                        "collection"
                    ]
                },
                "sortby": [
                    {
                        "field": "id",
                        "direction": "asc"
                    }
                ],
                "token": "dGVzdC1pdGVtMw=="
            }
        },
        {
            "rel": "root",
            "type": "application/json",
            "href": "http://localhost:8084/"
        },
        {
            "rel": "self",
            "type": "application/json",
            "href": "http://localhost:8084/search"
        }
    ],
    "context": {
        "returned": 2,
        "limit": 2,
        "matched": 9
    }
}

4. Ran the same search query with the addition of the token from the previous response:

curl --request POST \
  --url http://localhost:8084/search \
  --header 'Content-Type: application/json' \
  --data '{
    "limit": 2,
    "fields": {
        "exclude": ["geometry", "links", "assets", "type", "bbox", "properties", "stac_version", "collection"]
    },
    "sortby": [
        {
            "field": "id",
            "direction": "asc"
        }
    ],
    "token": "dGVzdC1pdGVtMw=="
}'

Response was:

{
    "type": "FeatureCollection",
    "features": [
        {
            "id": "test-item4"
        },
        {
            "id": "test-item5"
        }
    ],
    "links": [
        {
            "rel": "next",
            "type": "application/json",
            "method": "POST",
            "href": "http://localhost:8084/search",
            "body": {
                "limit": 2,
                "fields": {
                    "exclude": [
                        "geometry",
                        "links",
                        "assets",
                        "type",
                        "bbox",
                        "properties",
                        "stac_version",
                        "collection"
                    ]
                },
                "sortby": [
                    {
                        "field": "id",
                        "direction": "asc"
                    }
                ],
                "token": "dGVzdC1pdGVtNg=="
            }
        },
        {
            "rel": "root",
            "type": "application/json",
            "href": "http://localhost:8084/"
        },
        {
            "rel": "self",
            "type": "application/json",
            "href": "http://localhost:8084/search"
        }
    ],
    "context": {
        "returned": 2,
        "limit": 2
    }
}

This response skipped test-item3 and returned test-item4 and test-item5. Continuing with the next token, I got test-item7 and test-item8 (skipping test-item6), and the request after that returned nothing (skipping test-item9).

I assume something is wrong, or perhaps I'm doing something wrong?
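
Decoding the two tokens above with base64 gives test-item3 and test-item6, which are exactly the items that went missing. One explanation that fits this output (an assumption, not confirmed against database_logic.py) is that the token names the first item of the next page while the query only returns ids strictly greater than it. The in-memory sketch below reproduces the pattern; fetch_page and walk are hypothetical helpers, not the backend's functions.

```python
import base64


def decode_token(token):
    """Decode a base64 pagination token into the item id it carries."""
    return base64.urlsafe_b64decode(token).decode("utf-8")


def encode_token(item_id):
    return base64.urlsafe_b64encode(item_id.encode("utf-8")).decode("ascii")


def fetch_page(ids, limit, token=None, inclusive=False):
    """Return (page, next_token); the token names the first item NOT yet returned."""
    remaining = sorted(ids)
    if token is not None:
        start = decode_token(token)
        if inclusive:
            remaining = [i for i in remaining if i >= start]
        else:
            # Strict comparison drops the item the token points at.
            remaining = [i for i in remaining if i > start]
    page = remaining[:limit]
    next_token = encode_token(remaining[limit]) if len(remaining) > limit else None
    return page, next_token


def walk(ids, limit, inclusive):
    """Follow next tokens until none is returned, collecting every id seen."""
    seen, token = [], None
    while True:
        page, token = fetch_page(ids, limit, token, inclusive)
        seen.extend(page)
        if token is None:
            return seen


ITEM_IDS = [f"test-item{n}" for n in range(1, 10)]

print(decode_token("dGVzdC1pdGVtMw=="))       # test-item3
print(walk(ITEM_IDS, 2, inclusive=False))     # items 3, 6 and 9 are missing
print(walk(ITEM_IDS, 2, inclusive=True))      # all nine items
```

If this is indeed the cause, either the comparison needs to include the token's item or the token needs to carry the last item already returned.
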

jonhealy1 commented 6 months ago

No, you aren't doing anything wrong. I remember this issue now from when I created this. It might just be something funky here, if you want to play around with it: https://github.com/Healy-Hyperspatial/stac-fastapi-mongo/blob/e59c421c816abe69ced82b472e3b1b837bf6495f/stac_fastapi/mongo/database_logic.py#L562-L574

jonhealy1 commented 6 months ago

If we can't get this working, we should maybe look at something like this (using skip in Mongo): https://www.mongodb.com/docs/atlas/atlas-search/tutorial/divide-results-tutorial/#std-label-pagination-skip-limit-tutorial

pedro-cf commented 6 months ago

I'm not quite sure what the token's complete purpose is, but assuming it's only meant to hold the amount of "skipping" we need to perform in the database, I've written this possible solution, which seems to work with the same tests (also in descending order):

https://github.com/pedro-cf/stac-fastapi-mongo/commit/730840a98dc9e587c9bd93e0579b2e20a7ecc4c1
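
The idea described above, a token that only records how many documents to skip, can be sketched as follows. This is an in-memory illustration under that assumption and is not the code in the linked commit: encode_skip, decode_skip, and paginate are hypothetical names, and list slicing stands in for applying skip and limit to a sorted MongoDB cursor.

```python
import base64


def encode_skip(skip):
    """Pack the number of already-returned documents into an opaque token."""
    return base64.urlsafe_b64encode(str(skip).encode("ascii")).decode("ascii")


def decode_skip(token):
    """Unpack the skip count; no token means start from the first document."""
    if not token:
        return 0
    return int(base64.urlsafe_b64decode(token).decode("ascii"))


def paginate(sorted_ids, limit, token=None):
    """Return (page, next_token) for an already-sorted result set."""
    skip = decode_skip(token)
    # Stand-in for a database query with skip and limit applied after sorting.
    page = sorted_ids[skip:skip + limit]
    next_skip = skip + limit
    next_token = encode_skip(next_skip) if next_skip < len(sorted_ids) else None
    return page, next_token


def collect_all(sorted_ids, limit):
    """Follow next tokens to the end and return every id in page order."""
    seen, token = [], None
    while True:
        page, token = paginate(sorted_ids, limit, token)
        seen.extend(page)
        if token is None:
            return seen


ascending = [f"test-item{n}" for n in range(1, 10)]
descending = sorted(ascending, reverse=True)

print(collect_all(ascending, 2) == ascending)    # True
print(collect_all(descending, 2) == descending)  # True
```

Because the token depends only on position, the same logic works for any sort field or direction. The trade-off is that the database still has to walk past the skipped documents, so deep pages get slower as the offset grows.
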

jonhealy1 commented 6 months ago

Looks really good. A few pagination tests are currently skipped in tests/resources/test_item.py, for example here: https://github.com/Healy-Hyperspatial/stac-fastapi-mongo/blob/e59c421c816abe69ced82b472e3b1b837bf6495f/stac_fastapi/tests/resources/test_item.py#L556-L662