Open-EO / openeo-processes

Interoperable processes for openEO's big Earth observation cloud processing.
https://processes.openeo.org
Apache License 2.0

result_meta dimension in aggregate_spatial #356

Closed soxofaan closed 1 year ago

soxofaan commented 2 years ago

The devtelco of this afternoon led me to this "total_count and valid_count" feature of aggregate_spatial that I was not aware of.

https://github.com/Open-EO/openeo-processes/blob/b81e96b454b808e32aa94128ba71daf5b06daa19/aggregate_spatial.json#L81

The computation also stores information about the total count of pixels (valid + invalid pixels) and the number of valid pixels for each geometry. These values are added as a new dimension with a dimension name derived from target_dimension by adding the suffix _meta. The new dimension has the dimension labels total_count and valid_count.

I'm not sure this makes sense: you cannot just "add a new dimension" to append metadata.

For example, assume the input is a 2D (x, y) raster cube.

Purely looking at the aggregation results, the output is also a 2D vector cube with a vector dimension (with a label for each geometry) and a "result" dimension (with just a single item, e.g. label "aggregation"). Attempt to visualize:

         "aggregation"
geom1    1.23
geom2    4.56
...      ...

First observation: the "result" dimension is actually not necessary, the result could just be a 1D vector cube.

Now about the result_meta dimension and its labels total_count and valid_count. If you simply "add" this dimension, then you get a 3D vector cube with the vector dimension, the "result" dimension and the new "result_meta" dimension. But your aggregations are lost here: at what coordinates in this cube is the original aggregation stored? This result cube has only metadata.
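To make the addressing problem concrete, here is a minimal sketch in plain Python. It only enumerates cell coordinates and is not openEO code; the dimension names and labels ("geometry", "result", "result_meta") are assumptions taken from this discussion.

```python
from itertools import product

# Hypothetical dimension labels, following the discussion above.
dimensions = {
    "geometry": ["geom1", "geom2"],
    "result": ["aggregation"],
    "result_meta": ["total_count", "valid_count"],
}

# Every cell of a cube is addressed by exactly one label from each dimension.
cells = list(product(*dimensions.values()))
for cell in cells:
    print(cell)

# Cells whose coordinates contain no result_meta label would be the only
# place to store the plain aggregation value. There are none.
meta_labels = set(dimensions["result_meta"])
cells_without_meta_label = [c for c in cells if not meta_labels & set(c)]
print(cells_without_meta_label)
```

Every coordinate carries either total_count or valid_count, so the cube has no slot left for the aggregated value itself.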

I think what is intended is that total_count and valid_count should be new labels in the "result" dimension. So the "result" dimension would have labels "aggregation", "total_count", "valid_count"
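A rough sketch of that layout in plain Python, assuming a mean aggregation and NaN as the marker for invalid pixels. The pixel values and the helper names `aggregate_with_counts` and `lookup` are made up for illustration and are not part of any openEO API.

```python
import math

# Hypothetical pixel values per geometry; NaN marks an invalid (no-data) pixel.
pixels_per_geometry = {
    "geom1": [1.0, 2.0, float("nan"), 3.0],
    "geom2": [4.0, float("nan"), float("nan"), 6.0],
}

# Labels of the single "result" dimension, as proposed above.
result_labels = ["aggregation", "total_count", "valid_count"]


def aggregate_with_counts(pixels):
    """Return [mean of valid pixels, total pixel count, valid pixel count]."""
    valid = [p for p in pixels if not math.isnan(p)]
    mean = sum(valid) / len(valid) if valid else float("nan")
    return [mean, len(pixels), len(valid)]


# 2D vector cube: geometry dimension x "result" dimension.
cube = {geom: aggregate_with_counts(px) for geom, px in pixels_per_geometry.items()}


def lookup(geom, label):
    """Address one cell by its geometry label and its "result" label."""
    return cube[geom][result_labels.index(label)]


for geom in cube:
    print(geom, [lookup(geom, label) for label in result_labels])
```

Here the aggregation and both counts sit side by side under the same dimension, so each value has an unambiguous coordinate.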

soxofaan commented 2 years ago

related to https://github.com/openEOPlatform/architecture-docs/issues/84

soxofaan commented 2 years ago

also related to the discussion around https://github.com/Open-EO/openeo-processes/issues/341#issuecomment-1067886942

m-mohr commented 2 years ago

Yes, I also stumbled across this recently and it needs to be rephrased based also on the vector cube definition later. One potential solution is indeed what you proposed and is also given as an example in the mentioned discussion: https://github.com/Open-EO/openeo-processes/issues/341#issuecomment-1068179876

soxofaan commented 2 years ago

> is also given as an example in the mentioned discussion: https://github.com/Open-EO/openeo-processes/issues/341#issuecomment-1068179876

With the difference, I think, that the "result" and "band" dimensions should be separate, instead of flattened as in that example.

m-mohr commented 2 years ago

How should that work? You did argue above that this doesn't work if I understood it correctly?

soxofaan commented 2 years ago

> How should that work?

It's like my original example, but with one additional dimension:

start from 3D raster cube: (x, y, bands)
output should be (I think): 3D vector cube: