stac-utils / stac-fastapi-elasticsearch-opensearch

Elasticsearch backend for stac-fastapi with Opensearch support.
https://stac-utils.github.io/stac-fastapi-elasticsearch-opensearch
MIT License

Simple search returns items with only the properties.datetime field #217

Open iliion opened 3 months ago

iliion commented 3 months ago

A simple search, http://localhost:8000/search?limit=10, returns items with only the properties.datetime field.

In particular, at this line: https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/blob/987f0b924b45e3f63302758b6e63d38ba504964f/stac_fastapi/core/stac_fastapi/core/core.py#L593C12-L593C64

filter_kwargs = search_request.fields.filter_fields

filter_fields is equal to

{'exclude': {}, 'include': {'assets': Ellipsis, 'bbox': Ellipsis, 'collection': Ellipsis, 'geometry': Ellipsis, 'id': Ellipsis, 'links': Ellipsis, 'properties': {'datetime'}, 'stac_version': Ellipsis, 'type': Ellipsis}}

As a response I get this FeatureCollection:

{
  "type": "FeatureCollection",
  "features": [ {
    "type": "Feature",
    "properties": {
      "datetime": "2015-07-04T10:10:06.027000+00:00"
    },
    . . .
  ]
  . . .
}

Shouldn't filter_kwargs be equal to

{'exclude': {}, 'include': {'assets': Ellipsis, 'bbox': Ellipsis, 'collection': Ellipsis, 'geometry': Ellipsis, 'id': Ellipsis, 'links': Ellipsis, 'properties': Ellipsis, 'stac_version': Ellipsis, 'type': Ellipsis}}
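
To make the difference between the two dictionaries concrete, here is a minimal sketch of how an include specification of this shape prunes an item. It assumes the convention where Ellipsis keeps a whole field and a set of names keeps only those sub-keys; `apply_include` is a hypothetical helper written for illustration, not the function the backend actually calls, and the item is a cut-down version of the one in this issue.

```python
from typing import Any


def apply_include(data: dict[str, Any], include: dict[str, Any]) -> dict[str, Any]:
    """Keep only the keys named in `include` (hypothetical helper, for illustration)."""
    result: dict[str, Any] = {}
    for key, rule in include.items():
        if key not in data:
            continue
        if rule is Ellipsis:
            # Ellipsis: keep the whole field untouched.
            result[key] = data[key]
        elif isinstance(rule, dict):
            # Nested mapping: apply the same rules one level down.
            result[key] = apply_include(data[key], rule)
        else:
            # Set of names: keep only those sub-keys.
            result[key] = {k: v for k, v in data[key].items() if k in rule}
    return result


item = {
    "type": "Feature",
    "id": "example-item",  # made-up id, for illustration only
    "properties": {
        "datetime": "2015-07-04T10:10:06.027Z",
        "platform": "Sentinel-2A",
        "eo:cloud_cover": 84.456044,
    },
}

# Shape currently produced: only properties.datetime survives.
current = {"id": ..., "type": ..., "properties": {"datetime"}}
# Shape proposed in this issue: all properties survive.
proposed = {"id": ..., "type": ..., "properties": ...}

only_datetime = apply_include(item, current)
all_properties = apply_include(item, proposed)

print(only_datetime["properties"])
print(sorted(all_properties["properties"]))
```

With the first specification the other properties are dropped; with the second they are all returned.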

FYI, the item has the following properties:

"properties": {
  "sat:relative_orbit": 22,
  "start_datetime": "2015-07-04T10:10:06.027Z",
  "end_datetime": "2015-07-04T10:10:06.027Z",
  "processing:facility": "EPA_",
  "title": "S2A_MSIL2A_20150704T101006_N0204_R022_T30NYK_20150704T102420.SAFE",
  "platform": "Sentinel-2A",
  "view:sun_elevation": 30.130819756471,
  "datetime": "2015-07-04T10:10:06.027Z",
  "instruments": [],
  "constellation": "sentinel-2",
  "sat:orbit_state": "descending",
  "eo:cloud_cover": 84.456044,
  "grid:code": "MGRS-30NYK",
  "processing:level": "L2A",
  "view:incidence_angle": 9.82920248878181,
  "created": "2019-08-05T14:29:54Z",
  "sentinel:product_id": "S2A_MSIL2A_20150704T101006_N0204_R022_T30NYK_20150704T102420",
  "view:sun_azimuth": 49.058168068692,
  "sentinel:datastrip_id": "S2A_USER_MSI_L2A_DS_EPA__20160808T224207_S20150704T102420_N02.04",
  "sentinel:processing_baseline": "02.04",
  "proj:bbox": [],
  "proj:epsg": 32630,
  "bdap:additional_attributes": {},
  "processed": "2019-08-05T14:53:58.579614Z",
  "mission": "Sentinel-2",
  "view:azimuth": 100.76592332834346,
  "gsd": 10,
  "sat:absolute_orbit": 162,
  "sentinel:acquisition_station": "EPA_"
}
jonhealy1 commented 3 months ago

By default, the FieldsExtension in the stac-fastapi parent library includes only one property, datetime. https://github.com/stac-utils/stac-fastapi/blob/4fb10ec6ba758f28ddef749ea7bac6fbb5963f9b/stac_fastapi/extensions/stac_fastapi/extensions/core/fields/fields.py#L46
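
As a rough sketch of how a default like this could end up as the nested filter_fields value shown at the top of the issue: a flat set of dotted paths is folded into a mapping in which whole fields map to Ellipsis and dotted paths map to a set of sub-keys. The set below is reconstructed from the filter_fields output earlier in this thread, not copied from the library, and `to_include_spec` is a hypothetical helper.

```python
# Assumed default set, reconstructed from the filter_fields output in this thread.
DEFAULT_INCLUDES = {
    "assets", "bbox", "collection", "geometry", "id", "links",
    "properties.datetime", "stac_version", "type",
}


def to_include_spec(paths: set[str]) -> dict:
    """Fold dotted paths into a nested include mapping (hypothetical helper)."""
    spec: dict = {}
    # Sorting puts "properties" before "properties.datetime", so a request
    # for the whole field is seen first and wins over its sub-keys.
    for path in sorted(paths):
        head, _, rest = path.partition(".")
        if not rest:
            spec[head] = ...
        elif spec.get(head) is not ...:
            spec.setdefault(head, set()).add(rest)
    return spec


default_spec = to_include_spec(DEFAULT_INCLUDES)
with_all_properties = to_include_spec(DEFAULT_INCLUDES | {"properties"})

print(default_spec["properties"])
print(with_all_properties["properties"])
```

Under these assumptions, the default yields a set containing only datetime for properties, while explicitly adding properties turns it into Ellipsis, meaning the whole field.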

jonhealy1 commented 3 months ago

To get all your properties do this: http://localhost:8080/search?fields+properties&limit=10
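
For anyone building this request from code, here is a small sketch that assembles the query string with the standard library. It assumes the fields extension's GET syntax of a `fields` parameter holding comma-separated names with an optional + or - prefix, and it percent-encodes the +, since a literal + in a query string is commonly decoded as a space. `build_search_url` is a hypothetical helper; the host and port are taken from the comment above.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8080/search"  # host and port from the comment above


def build_search_url(fields: list[str], limit: int = 10) -> str:
    """Build a GET /search URL with a fields parameter (hypothetical helper)."""
    # urlencode percent-encodes "+" as %2B so it is not read as a space.
    query = urlencode({"fields": ",".join(fields), "limit": limit})
    return f"{BASE_URL}?{query}"


url = build_search_url(["+properties"])
print(url)  # http://localhost:8080/search?fields=%2Bproperties&limit=10
```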

iliion commented 3 months ago

Yes, indeed, what you wrote is true. Nevertheless, is this the correct behavior according to the STAC API spec? Will a simple search always return only one property?

In https://api.stacspec.org/v1.0.0/item-search/ the response contains all the property fields.

jonhealy1 commented 3 months ago

I am guessing they only include one property because there are many very large STAC datasets online, and returning the minimum amount of information makes a lot of sense, especially if you're only displaying items on a map. I would raise this as an issue in the stac-fastapi parent repo. With a small limit and pagination, hiding most of the properties doesn't really offer any performance benefit.

jonhealy1 commented 3 months ago

They should be following the stac api spec though ideally.

iliion commented 3 months ago

Ideally, yes. I would add another use case where all the properties are needed. In my case I have more than 20 fields, and the only option I have is to pass the 20+ arguments using the fields extension. It makes more sense to return everything and filter only if needed.

jonhealy1 commented 3 months ago

You can just do +properties to show all the properties like this http://localhost:8080/search?fields+properties&limit=10

jonhealy1 commented 3 months ago

It would be interesting to see what people think over there

iliion commented 3 months ago

I think it would be best to follow the STAC API specs https://api.stacspec.org/v1.0.0/item-search/

There are also implementations like STAC Browser that use the search endpoint in order to return items. As a user I would expect to see all the properties of an Item.

jonhealy1 commented 3 months ago

Would you mind adding the issue here? This is where the relevant issue comes from. https://github.com/stac-utils/stac-fastapi/issues

vincentsarago commented 2 weeks ago

🤔 it's interesting, there is a different behaviour between -elasticsearch and -pgstac.

PgSTAC does not use PostFieldsExtension.filter_fields the way Elasticsearch does in https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/blob/987f0b924b45e3f63302758b6e63d38ba504964f/stac_fastapi/core/stac_fastapi/core/core.py#L593C12-L593C64

and in fact does not rely on default_includes 😅 (is this a BUG or a feature?).

POST

$ curl --header "Content-Type: application/json" --request POST --data '{"fields":{},"limit":1}' https://stac.eoapi.dev/search | jq '.features[0].properties'
{
  "created": "2023-12-19T07:04:49.631129Z",
  "license": "CC-BY-4.0",
  "mission": "umbra-sample-data",
  "updated": "2023-12-19T07:04:49.631129Z",
  "datetime": "2023-12-01T19:20:52Z",
  "platform": "UMBRA_08",
  "providers": [
    {
      "url": "https://umbra.space",
      "name": "umbra",
      "roles": [
        "host",
        "processor",
        "producer",
        "licensor"
      ]
    }
  ],
  "end_datetime": "2023-12-01T19:20:53.654917Z",
  "constellation": "umbra",
  "start_datetime": "2023-12-01T19:20:52Z",
  "sar:looks_range": 1,
  "sar:product_type": "GEC",
  "sar:looks_azimuth": 1,
  "sar:polarizations": [
    "VV"
  ],
  "sar:frequency_band": "X",
  "sar:instrument_mode": "SPOTLIGHT",
  "sar:center_frequency": 9.600010292488152,
  "sar:resolution_range": 0.8346077999881579,
  "view:incidence_angle": 50.55698046197228,
  "sar:resolution_azimuth": 0.852999818905946,
  "sar:observation_direction": "left"
}

$ curl --header "Content-Type: application/json" --request POST --data '{"fields":{"include":["properties.end_datetime"]},"limit":1}' https://stac.eoapi.dev/search | jq '.features[0].properties'
{
  "end_datetime": "2023-12-01T19:20:53.654917Z"
}
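
The same POST request can be sketched in Python with only the standard library. This mirrors the second curl command above (same URL, header, and JSON body); `build_search_request` is a hypothetical helper, and the sketch only builds the request without sending it.

```python
import json
from urllib.request import Request

SEARCH_URL = "https://stac.eoapi.dev/search"  # endpoint from the curl examples above


def build_search_request(include: list[str], limit: int = 1) -> Request:
    """Build a POST /search request restricted to `include` (hypothetical helper)."""
    body = {"fields": {"include": include}, "limit": limit}
    return Request(
        SEARCH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request(["properties.end_datetime"])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it, pass `request` to urllib.request.urlopen() and json.load() the response.
```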

GET

$ curl --header "Content-Type: application/json"  https://stac.eoapi.dev/search\?limit\=1 | jq '.features[0].properties'
{
  "created": "2023-12-19T07:04:49.631129Z",
  "license": "CC-BY-4.0",
  "mission": "umbra-sample-data",
  "updated": "2023-12-19T07:04:49.631129Z",
  "datetime": "2023-12-01T19:20:52Z",
  "platform": "UMBRA_08",
  "providers": [
    {
      "url": "https://umbra.space",
      "name": "umbra",
      "roles": [
        "host",
        "processor",
        "producer",
        "licensor"
      ]
    }
  ],
  "end_datetime": "2023-12-01T19:20:53.654917Z",
  "constellation": "umbra",
  "start_datetime": "2023-12-01T19:20:52Z",
  "sar:looks_range": 1,
  "sar:product_type": "GEC",
  "sar:looks_azimuth": 1,
  "sar:polarizations": [
    "VV"
  ],
  "sar:frequency_band": "X",
  "sar:instrument_mode": "SPOTLIGHT",
  "sar:center_frequency": 9.600010292488152,
  "sar:resolution_range": 0.8346077999881579,
  "view:incidence_angle": 50.55698046197228,
  "sar:resolution_azimuth": 0.852999818905946,
  "sar:observation_direction": "left"
}

$ curl --header "Content-Type: application/json"  https://stac.eoapi.dev/search\?limit\=1\&fields\=properties.end_datetime | jq '.features[0].properties'
{
  "end_datetime": "2023-12-01T19:20:53.654917Z"
}
vincentsarago commented 2 weeks ago

I think I'm going to propose that we deprecate the default_includes, because: