Open jerstlouis opened 2 years ago
We should consider use cases where we want to aggregate / return results differently for different dimensions, for example:
A) Return a 0D value including derived "minimum NDVI" and "maximum NDVI" values, each aggregated over the time dimension locally at every point in space, then averaged over the spatial dimensions.
B) Support aggregating to a time series at a coarser resolution rather than to a single value over a dimension, e.g. computing the monthly minimum, maximum or average for each month of a year.
What could that look like syntactically? Possibly an additional parameter to the aggregation function to select the dimensions over which to aggregate, e.g. `time`, `space`, `spacetime`, `[latitude, longitude, datetime]`.
A) Aggregate minimum of spatially local values over time, then aggregate average over space (a single cell is returned with a minimum and a maximum value)
properties=
minNDVI:Avg(
Min((B5-B4)/(B5+B4), time),
space),
maxNDVI:Avg(
Max((B5-B4)/(B5+B4), time),
space)
&subset=datetime("2020-01-01":"2021-12-31"),Lat(45.0:45.1),Lon(-75.1:-75.0)
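To make the intended order of operations in example A concrete, here is a minimal Python sketch of the semantics only; it is not an implementation of any OGC API. The `cube` layout, the band values and the helper names are assumptions made up for illustration.

```python
from statistics import mean

def ndvi(b4, b5):
    """Normalized difference of the two bands, as in (B5-B4)/(B5+B4)."""
    return (b5 - b4) / (b5 + b4)

# Hypothetical cube: cube[t][cell] = (B4, B5); 3 time steps, 2 spatial cells.
cube = [
    [(1.0, 3.0), (2.0, 2.0)],
    [(1.0, 1.0), (1.0, 3.0)],
    [(3.0, 1.0), (1.0, 3.0)],
]

def aggregate_time_then_space(cube, over_time, over_space):
    """Reduce each cell over time first, then reduce the per-cell results over space."""
    n_cells = len(cube[0])
    per_cell = [over_time(ndvi(*step[c]) for step in cube) for c in range(n_cells)]
    return over_space(per_cell)

# Mirrors Avg(Min(..., time), space) and Avg(Max(..., time), space).
result = {
    "minNDVI": aggregate_time_then_space(cube, min, mean),
    "maxNDVI": aggregate_time_then_space(cube, max, mean),
}
print(result)
```

The result is a single record with two fields, matching the "single cell with a minimum and a maximum value" described for example A.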
With an additional option to specify aggregating to a coarser resolution, as opposed to a single value? e.g., `time:month`, `Lat:0.005`.
B) Aggregate minimum of spatially local values over time for each given month, then aggregate sum over space. The result would be a 1D time series with 12 cells (data records / features), each with a single value in this case (the sum for each of the monthly minimums and maximums, over all subsetted space).
properties=
minNDVI:Sum(
Min((B5-B4)/(B5+B4), time:month),
space),
maxNDVI:Sum(
Max((B5-B4)/(B5+B4), time:month),
space)
&subset=datetime("2020-01-01":"2020-12-31"),Lat(45.0:45.1),Lon(-75.1:-75.0)
A special `month` resolution is proposed in the example here to accommodate commonly used temporal units of uneven length. A number corresponding to units (e.g., in seconds or meters or degrees) could also be used to qualify the dimension over which aggregation is performed.
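The semantics of example B (aggregate over time per month and per cell, then sum over space) can be sketched in Python as follows. This is only an illustration under assumptions: the `observations` records, cell identifiers and NDVI values are invented, and calendar months are derived from `datetime.date` rather than from any proposed API.

```python
from collections import defaultdict
from datetime import date

# Hypothetical observations: (acquisition date, cell id, NDVI value).
observations = [
    (date(2020, 1, 5), "a", 0.25), (date(2020, 1, 20), "a", 0.5),
    (date(2020, 1, 5), "b", 0.75), (date(2020, 1, 20), "b", 0.5),
    (date(2020, 2, 10), "a", 0.125), (date(2020, 2, 10), "b", 0.25),
]

def monthly_then_space(observations, over_time, over_space):
    """Aggregate per (month, cell) over time, then per month over space."""
    groups = defaultdict(list)
    for day, cell, value in observations:
        groups[(day.year, day.month), cell].append(value)
    per_month = defaultdict(list)
    for (month, _cell), values in groups.items():
        per_month[month].append(over_time(values))
    return {month: over_space(vals) for month, vals in sorted(per_month.items())}

# Mirrors Sum(Min(..., time:month), space) and Sum(Max(..., time:month), space).
min_ndvi = monthly_then_space(observations, min, sum)
max_ndvi = monthly_then_space(observations, max, sum)
print(min_ndvi, max_ndvi)
```

Grouping on `(year, month)` is what handles the uneven length of months; a fixed numeric resolution would instead bin on something like `value // step`.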
DAPA had some similar ideas for its `aggregate` query parameter, but more so for the different aggregation processes (`area:aggregate-space`, `area:aggregate-space-time`, `area:aggregate-time`, `grid:aggregate-time`, `position:aggregate-time`).
To compare aggregating a gridded coverage with the Features search extension, cells are akin to the features in that their set of properties have given values. Aggregation is essentially creating a new collection of cells (equivalent to a new feature collection) with different dimensionality and/or resolution across some dimension(s).
Note that if aggregation is expressed simply as functions used in derived field properties, then the resulting dimensionality may differ when the returned properties use different kinds of aggregation. That could mean that fields which are not aggregated over some dimension or resolution would have their values duplicated.
Another use case for `sortby` might be to specify more explicitly the sparse-data behavior of subset slicing discussed in #105, e.g., if the time dimension is included as a sortable. That could be combined with other sortable keys, including derived fields using aggregation, e.g., `Avg()` over space (but not time), to sort scenes as a whole without mixing them up.
sortby=
-Avg((B5-B4)/(B5+B4), space),
+time
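A small Python sketch of what this `sortby` would mean: scenes are ordered by descending spatial average NDVI, with ascending time breaking ties. The `scenes` records and their values are hypothetical and only illustrate the ordering.

```python
from datetime import date
from statistics import mean

# Hypothetical scenes: each has an acquisition date and per-cell NDVI values.
scenes = [
    {"time": date(2020, 7, 1), "ndvi": [0.25, 0.75]},
    {"time": date(2020, 7, 11), "ndvi": [0.5, 1.0]},
    {"time": date(2020, 7, 21), "ndvi": [0.5, 0.5]},
    {"time": date(2020, 7, 6), "ndvi": [0.75, 0.25]},
]

# -Avg(..., space): descending spatial average; +time: ascending time as tie-breaker.
ordered = sorted(scenes, key=lambda s: (-mean(s["ndvi"]), s["time"]))
order = [s["time"].isoformat() for s in ordered]
print(order)
```

Because the average is taken over space only, each scene keeps a single sort key and its cells stay together in the output.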
It would be great to have a list or tree like the one at https://github.com/cportele/ogcapi-building-blocks
This would help to visualise what the building blocks are.
@ghobona I tried at the top of this issue to organize them in a bullet list.
Most of these building blocks are query parameters:
Analytics Query parameters:
properties
filter
sortby
collections (`FROM <tables>` in SQL). The fields can then be prefixed by `{collectionId}.` to disambiguate them.
Aggregating functions:
Min()
Max()
Sum()
Avg()
StdDev()
Spatiotemporal Subsetting Query parameters:
subset
datetime
bbox
Thanks @jerstlouis !
Cc: @doublebyte1
See also https://github.com/opengeospatial/ogcapi-features/issues/927#issuecomment-2179414229 regarding suggestion to use deepObject style:
properties=temp_c&alias[temp_c]=(temp_f - 32) * 5/9.
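As a rough sketch of how a server might separate the selected properties from the `alias[...]` definitions in such a query string, using only Python's standard library. The function name `split_aliases` is hypothetical, and the expression is kept as an opaque string rather than evaluated.

```python
import re
from urllib.parse import parse_qsl

query = "properties=temp_c&alias[temp_c]=(temp_f - 32) * 5/9"

def split_aliases(query):
    """Return (selected property names, {alias name: expression string})."""
    properties, aliases = [], {}
    for key, value in parse_qsl(query):
        match = re.fullmatch(r"alias\[(.+)\]", key)
        if match:
            aliases[match.group(1)] = value
        elif key == "properties":
            properties.extend(value.split(","))
    return properties, aliases

properties, aliases = split_aliases(query)
print(properties, aliases)
```

Keeping the expressions under a separate key means `properties` remains a plain comma-separated list of names, which is the attraction of the deepObject style.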
Suggesting that we plan for a separate part enabling basic analytics capabilities, including conformance classes for:
properties=
filter=
sortby=
`Max()`, `Min()`, `Avg()`, `StdDev()`, `Sum()` ... used within `properties=`, `filter=`, `sortby=` expressions. The dimensions over which data is aggregated could also leverage `subset`, `bbox`, `datetime`, but a distinction mechanism would still be needed to know whether a series should be returned for a particular dimension, or aggregation should be performed.
collections=
This would be informed by the work from DAPA and Testbed-17 GeoDataCube API, and ideally be consistent with the OGC API - Features Search extension as well as with OGC API - DGGS and OGC API - EDR. We plan to explore this in the upcoming May 2022 Code Sprint.
Example proposed syntax:
properties=NDVI:Max((B5-B4)/(B5+B4))&subset=("2020-07-01":"2020-07-31")
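One practical detail when sending such an expression in a URL: a literal `+` in a query string is commonly decoded as a space, so the expression needs percent-encoding on the client side. The sketch below uses only Python's standard library and assumes form-style decoding on the server; no endpoint is implied.

```python
from urllib.parse import parse_qs, urlencode

params = {
    "properties": "NDVI:Max((B5-B4)/(B5+B4))",
    "subset": '("2020-07-01":"2020-07-31")',
}

# urlencode percent-encodes reserved characters, including '+' as %2B.
encoded = urlencode(params)

# Decoding the encoded form recovers the original expression.
decoded = parse_qs(encoded)

# Decoding the raw, unencoded expression turns '+' into a space.
raw_decoded = parse_qs("properties=NDVI:Max((B5-B4)/(B5+B4))")

print(encoded)
print(raw_decoded["properties"][0])
```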