Open-EO / openeo-processes

Interoperable processes for openEO's big Earth observation cloud processing.
https://processes.openeo.org
Apache License 2.0

`merge_cubes` description #379

Closed: soxofaan closed this 1 year ago

soxofaan commented 2 years ago

(here we are again, discussing merge_cubes' description :smile: )

A couple of notes on the current description of merge_cubes:

> The data cubes have to be compatible

I don't think we properly defined what "compatible" means. I also think the first sentence of the description here should be a more verbose version of the summary and at least mention the verb "merge".

> A merge operation without overlap should be reversible with (a set of) filter operations for each of the two cubes.

I think it's a bit strange to have this as the second sentence of the description. We haven't properly defined "overlap" yet. I guess the sentence tries to define "merge_cubes" as the inverse of "filter_*", but I don't think that is very clear at the moment.
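
To make the "inverse of filter" reading concrete, here is a minimal sketch in plain Python. It is a toy model, not the openEO API: a cube is assumed to be a dict keyed by (time, band) label tuples, and `merge_without_overlap` and `filter_by_time` are hypothetical helper names introduced only for illustration.

```python
# Toy model (assumption): a cube is a dict mapping (t, band) label tuples
# to values. This is NOT the openEO API, only an illustration.

def merge_without_overlap(cube1, cube2):
    """Merge two toy cubes that have no cell (label tuple) in common."""
    common = cube1.keys() & cube2.keys()
    if common:
        raise ValueError(f"cubes overlap at {sorted(common)}")
    return {**cube1, **cube2}


def filter_by_time(cube, t_labels):
    """Keep only the cells whose temporal label is in t_labels."""
    return {key: value for key, value in cube.items() if key[0] in t_labels}


cube1 = {("2021-01", "B02"): 1.0, ("2021-01", "B03"): 2.0}
cube2 = {("2021-02", "B02"): 3.0, ("2021-02", "B03"): 4.0}

merged = merge_without_overlap(cube1, cube2)

# Filtering on the dimension whose labels differ recovers each input cube.
recovered1 = filter_by_time(merged, {"2021-01"})
recovered2 = filter_by_time(merged, {"2021-02"})
```

In this toy setting the merge is reversible only because the two cubes share no cell; as soon as a label tuple occurs in both, one filter can no longer tell which input a value came from.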

> The process performs the join on overlapping dimensions, with the same name and type.

I think the wording "join on a dimension" is not ideal, because "join on" is classic relational database terminology and is closer to "concatenate" than to "merge" (which is what is actually meant here).

> An overlapping dimension has the same name, type, reference system and resolution in both dimensions, but can have different labels.

There is something wrong here: "A ... dimension has the same name ... in both dimensions". I guess what is meant is:

> Overlapping dimensions have the same name, type, reference system and resolution, but can have different labels.

But then: isn't overlap about having the same labels, so why does this talk about different labels?

> One of the dimensions can have different labels, for all other dimensions the labels must be equal. If data overlaps, the parameter overlap_resolver must be specified to resolve the overlap.
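
As a rough illustration of that last quoted rule, the sketch below merges two toy cubes whose temporal labels differ while the band labels are equal. It is plain Python under stated assumptions, not the openEO implementation: cubes are dicts keyed by (time, band) tuples, `merge_cubes_toy` is a hypothetical name, and the resolver is assumed to receive the value from the first cube and then the value from the second.

```python
# Toy model (assumption): a cube is a dict mapping (t, band) label tuples
# to values. merge_cubes_toy is a hypothetical helper, not the real process.

def merge_cubes_toy(cube1, cube2, overlap_resolver=None):
    """Merge two toy cubes; cells present in both need an overlap_resolver."""
    merged = dict(cube1)
    for key, y in cube2.items():
        if key in merged:
            if overlap_resolver is None:
                raise ValueError(f"overlap at {key} but no overlap_resolver given")
            # x comes from cube1, y from cube2
            merged[key] = overlap_resolver(merged[key], y)
        else:
            merged[key] = y
    return merged


# Same band labels, different (but partly shared) temporal labels.
cube1 = {("2021-01", "B02"): 1.0, ("2021-02", "B02"): 2.0}
cube2 = {("2021-02", "B02"): 4.0, ("2021-03", "B02"): 5.0}

# Resolve the shared cell ("2021-02", "B02") by averaging both values.
merged = merge_cubes_toy(cube1, cube2, overlap_resolver=lambda x, y: (x + y) / 2)
```

Here only the cell at "2021-02" exists in both cubes, so the resolver is called exactly once; the other two cells are copied through unchanged.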

soxofaan commented 2 years ago

I think part of the confusion is because the current description makes it easy to mix up "dimension overlap" and "cube overlap". It should be explained how they relate to each other. And maybe it is easier to talk in terms of disjoint dimensions (a pair of corresponding dimensions with no common labels), instead of trying to explain everything in terms of overlap. Roughly:

  • if there is at least one pair of corresponding dimensions that are disjoint: the cubes are not overlapping -> overlap resolver is not necessary
  • otherwise the cubes are overlapping -> overlap resolver is necessary
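
The two bullets above can be sketched as a small check in plain Python. This is only an illustration of the proposed rule, assuming each cube is summarised as a dict from dimension name to its labels; `needs_overlap_resolver` is a hypothetical name, and "corresponding dimensions" is simplified to "dimensions with the same name".

```python
# Assumption: a cube is summarised as {dimension name: iterable of labels}.
# "Corresponding" dimensions are simplified to dimensions with equal names.

def needs_overlap_resolver(dims1, dims2):
    """Return False if any pair of corresponding dimensions is disjoint."""
    for name in dims1.keys() & dims2.keys():
        if set(dims1[name]).isdisjoint(dims2[name]):
            # At least one disjoint pair: the cubes do not overlap.
            return False
    # No disjoint pair found: the cubes overlap.
    return True


cube_a = {"t": ["2021-01", "2021-02"], "bands": ["B02"]}
cube_b = {"t": ["2021-03"], "bands": ["B02"]}             # disjoint in "t"
cube_c = {"t": ["2021-02", "2021-03"], "bands": ["B02"]}  # shares "2021-02"

a_b_needs_resolver = needs_overlap_resolver(cube_a, cube_b)
a_c_needs_resolver = needs_overlap_resolver(cube_a, cube_c)
```

Under this reading, cubes without any corresponding dimension fall into the "otherwise" branch and are treated as overlapping.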

m-mohr commented 1 year ago

I'm trying to capture all(?) points in #405.