geopython / pygeoapi

pygeoapi is a Python server implementation of the OGC API suite of standards. The project emerged as part of the next generation OGC API efforts in 2018 and provides the capability for organizations to deploy a RESTful OGC API endpoint using OpenAPI, GeoJSON, and HTML. pygeoapi is open source and released under an MIT license.
https://pygeoapi.io

Time querying in `xarray_edr.py` #1608

Open sjordan29 opened 3 months ago

sjordan29 commented 3 months ago

Description

When I make an EDR query on a dataset for a single datetime, pygeoapi returns an empty dataset. I think the following section is causing the error:

        try:
            if select_properties:
                self.fields = {k: v for k, v in self.fields.items() if k in select_properties}  # noqa
                data = self._data[[*select_properties]]
            else:
                data = self._data

            if self.time_field in query_params:
                remaining_query = {
                    key: val for key, val in query_params.items()
                    if key != self.time_field
                }
                if isinstance(query_params[self.time_field], slice):
                    time_query = {
                        self.time_field: query_params[self.time_field]
                    }
                else:
                    time_query = {
                        self.time_field: (
                                data[self.time_field].dt.date ==
                                query_params[self.time_field]
                        )
                    }
                data = data.sel(
                    time_query).sel(remaining_query, method='nearest')
            else:
                data = data.sel(query_params, method='nearest')

Specifically:

    time_query = {
        self.time_field: (
            data[self.time_field].dt.date ==
            query_params[self.time_field]
        )
    }
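
As a rough, standard-library-only analogue of the suspected mismatch (not the xarray or numpy code path itself, whose comparison semantics may differ), the sketch below compares date-only values against a full timestamp. The `timestamps` list and `query` value are hypothetical stand-ins for the dataset's time coordinate and the parsed `datetime` parameter.

```python
from datetime import datetime

# Hypothetical stand-ins: three daily timestamps and a single-instant query.
timestamps = [datetime(1979, 4, 1), datetime(1979, 4, 2), datetime(1979, 4, 3)]
query = datetime(1979, 4, 1, 0, 0, 0)

# A date-only value never compares equal to a full timestamp in plain Python,
# so this mask is False everywhere and would select nothing.
mixed_mask = [ts.date() == query for ts in timestamps]

# Reducing both sides to dates gives the intended single match.
date_mask = [ts.date() == query.date() for ts in timestamps]

# Keep only the timestamps flagged by the date-to-date mask.
selected = [ts for ts, keep in zip(timestamps, date_mask) if keep]
```

If the provider's comparison behaves similarly, an all-False mask passed to the selection step would explain the empty result.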

Steps to Reproduce

An example dataset configuration:

alaska_et_2020_ccsm4_historical_simulation:
    description:
      Gridded 20km Daily Reference Evapotranspiration for the State of
      Alaska from 1979 to 2017
    extents:
      spatial:
        bbox:
          - -179.9994335861477
          - 37.24836569956209
          - 179.99945621556017
          - 88.17225180794964
        crs: http://www.opengis.net/def/crs/OGC/1.3/CRS84
      temporal:
        begin: 1979-04-01 00:00:00+00:00
        end: 2100-09-30 00:00:00+00:00
    keywords:
      - s3
      - usgs
      - alaska_et_2020_ccsm4_historical_simulation
    providers:
      - data: s3://mdmf/gdp/alaska_et_2020_ccsm4_historical_simulation.zarr
        format:
          mimetype: application/zip
          name: zarr
        name: xarray-edr
        options:
          s3:
            anon: true
            client_kwargs:
              endpoint_url: https://usgs.osn.mghpcc.org/
        type: edr
        x_field: x
        y_field: y
    title: alaska_et_2020_ccsm4_historical_simulation
    type: collection

Here's the query:

2024-04-02 12:15:39,366 - INFO - http://localhost:5001/api/gdp/pygeoapi/collections/alaska_et_2020_ccsm4_historical_simulation/position?coords=POINT+%28-179.9994335861477+37.24836569956209%29&datetime=1979-04-01T00%3A00%3A00%2B00%3A00&parameter_names=et0
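
For readability, the percent-encoded query string in the request above can be decoded with the standard library. This is only a convenience sketch for inspecting the parameters; the URL is copied from the log line and nothing here calls pygeoapi.

```python
from urllib.parse import parse_qs, urlsplit

url = (
    "http://localhost:5001/api/gdp/pygeoapi/collections/"
    "alaska_et_2020_ccsm4_historical_simulation/position"
    "?coords=POINT+%28-179.9994335861477+37.24836569956209%29"
    "&datetime=1979-04-01T00%3A00%3A00%2B00%3A00"
    "&parameter_names=et0"
)

# parse_qs returns a list per key; each key occurs once in this request.
decoded = parse_qs(urlsplit(url).query)
params = {key: values[0] for key, values in decoded.items()}

for key, value in params.items():
    print(f"{key} = {value}")
```

The decoded `datetime` value is a single instant (`1979-04-01T00:00:00+00:00`), not an interval, which is the case that reaches the non-slice branch shown earlier.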

Expected behavior

I expected to get a CoverageJSON response with a single timestep at a single point in my dataset. Instead, I got a 204 (No Content) response. Logs are below (specifically `ERROR - None` and `ERROR - No data found`).

Screenshots/Tracebacks

2024-04-02 12:15:39 [2024-04-02T13:15:39Z] {/mambaforge/lib/python3.11/site-packages/pygeoapi/provider/xarray_edr.py:108} DEBUG - query parameters: {'x': -179.9994335861477, 'y': 37.24836569956209, 'time': numpy.datetime64('1979-04-01T00:00:00')}
2024-04-02 12:15:39 /mambaforge/lib/python3.11/site-packages/pygeoapi/provider/xarray_edr.py:141: FutureWarning: The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.
2024-04-02 12:15:39   height = data.dims[self.y_field]
2024-04-02 12:15:39 /mambaforge/lib/python3.11/site-packages/pygeoapi/provider/xarray_edr.py:145: FutureWarning: The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.
2024-04-02 12:15:39   width = data.dims[self.x_field]
2024-04-02 12:15:39 [2024-04-02T13:15:39Z] {/mambaforge/lib/python3.11/site-packages/pygeoapi/api.py:3813} ERROR - None
2024-04-02 12:15:39 [2024-04-02T13:15:39Z] {/mambaforge/lib/python3.11/site-packages/pygeoapi/api.py:4012} ERROR - No data found
2024-04-02 12:15:39 [2024-04-02 13:15:39 -0400] [70] [ERROR] Exception in ASGI application
2024-04-02 12:15:39 Traceback (most recent call last):
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 407, in run_asgi
2024-04-02 12:15:39     result = await app(  # type: ignore[func-returns-value]
2024-04-02 12:15:39              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 69, in __call__
2024-04-02 12:15:39     return await self.app(scope, receive, send)
2024-04-02 12:15:39            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/starlette/applications.py", line 123, in __call__
2024-04-02 12:15:39     await self.middleware_stack(scope, receive, send)
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
2024-04-02 12:15:39     raise exc
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
2024-04-02 12:15:39     await self.app(scope, receive, _send)
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/starlette/middleware/cors.py", line 85, in __call__
2024-04-02 12:15:39     await self.app(scope, receive, send)
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
2024-04-02 12:15:39     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
2024-04-02 12:15:39     raise exc
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
2024-04-02 12:15:39     await app(scope, receive, sender)
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/starlette/routing.py", line 756, in __call__
2024-04-02 12:15:39     await self.middleware_stack(scope, receive, send)
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/starlette/routing.py", line 776, in app
2024-04-02 12:15:39     await route.handle(scope, receive, send)
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/starlette/routing.py", line 485, in handle
2024-04-02 12:15:39     await self.app(scope, receive, send)
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/starlette/routing.py", line 756, in __call__
2024-04-02 12:15:39     await self.middleware_stack(scope, receive, send)
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/starlette/routing.py", line 776, in app
2024-04-02 12:15:39     await route.handle(scope, receive, send)
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/starlette/routing.py", line 297, in handle
2024-04-02 12:15:39     await self.app(scope, receive, send)
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/starlette/routing.py", line 77, in app
2024-04-02 12:15:39     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
2024-04-02 12:15:39     raise exc
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
2024-04-02 12:15:39     await app(scope, receive, sender)
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/starlette/routing.py", line 75, in app
2024-04-02 12:15:39     await response(scope, receive, send)
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/starlette/responses.py", line 159, in __call__
2024-04-02 12:15:39     await send({"type": prefix + "http.response.body", "body": self.body})
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/starlette/_exception_handler.py", line 50, in sender
2024-04-02 12:15:39     await send(message)
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/starlette/_exception_handler.py", line 50, in sender
2024-04-02 12:15:39     await send(message)
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/starlette/middleware/errors.py", line 161, in _send
2024-04-02 12:15:39     await send(message)
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 504, in send
2024-04-02 12:15:39     output = self.conn.send(event=h11.Data(data=data))
2024-04-02 12:15:39              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/h11/_connection.py", line 512, in send
2024-04-02 12:15:39     data_list = self.send_with_data_passthrough(event)
2024-04-02 12:15:39                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/h11/_connection.py", line 545, in send_with_data_passthrough
2024-04-02 12:15:39     writer(event, data_list.append)
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/h11/_writers.py", line 65, in __call__
2024-04-02 12:15:39     self.send_data(event.data, write)
2024-04-02 12:15:39   File "/mambaforge/lib/python3.11/site-packages/h11/_writers.py", line 91, in send_data
2024-04-02 12:15:39     raise LocalProtocolError("Too much data for declared Content-Length")
2024-04-02 12:15:39 h11._util.LocalProtocolError: Too much data for declared Content-Length

Environment

Additional context

I tested this by opening the dataset outside of pygeoapi and applying the same logic to the time part of the query. The lines of code highlighted above were the problem: the comparison selected no timesteps, so there was no data left to query. This was the original solution I had implemented a while back:

            if (datetime_ is not None and
                    isinstance(query_params[self.time_field], slice)):  # noqa
                # separate query into spatial and temporal components
                LOGGER.debug('Separating temporal query')
                time_query = {self.time_field:
                              query_params[self.time_field]}
                remaining_query = {key: val for key,
                                   val in query_params.items()
                                   if key != self.time_field}
                data = data.sel(time_query).sel(remaining_query,
                                                method='nearest')
            else:
                data = data.sel(query_params, method='nearest')
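
The splitting step in that earlier solution can be sketched on plain dictionaries, without xarray. The helper name `split_time_query` and the sample parameter values are hypothetical; only the rule (separate the time entry when it is a `slice`, otherwise leave everything together for nearest-neighbour selection) comes from the snippet above.

```python
def split_time_query(query_params, time_field):
    """Split query parameters into (time_query, remaining_query).

    The time entry is separated only when it is a slice (an interval);
    a single instant stays with the other parameters.
    """
    time_query = {}
    if isinstance(query_params.get(time_field), slice):
        time_query = {time_field: query_params[time_field]}
    remaining_query = {
        key: val for key, val in query_params.items()
        if key not in time_query
    }
    return time_query, remaining_query


# Hypothetical inputs: one interval query and one single-instant query.
interval = {"x": -179.99, "y": 37.25,
            "time": slice("1979-04-01", "1979-04-30")}
instant = {"x": -179.99, "y": 37.25, "time": "1979-04-01T00:00:00"}

interval_split = split_time_query(interval, "time")
instant_split = split_time_query(instant, "time")
```

With an interval, the time entry ends up in its own dictionary; with a single instant, the first dictionary is empty and all parameters stay together, matching the `else` branch of the snippet.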

github-actions[bot] commented 11 hours ago

This Issue has been inactive for 90 days. As per RFC4, in order to manage maintenance burden, it will be automatically closed in 7 days.