Open-EO / openeo-python-client

Python client API for OpenEO
https://open-eo.github.io/openeo-python-client/
Apache License 2.0

Avoid index based "band math" where possible #582

Open soxofaan opened 1 week ago

soxofaan commented 1 week ago

"Band math" is an often-used feature in the openeo python client. For example, very roughly:

cube = connection.load_collection("SENTINEL2_L2A")
b2 = cube.band("B02")
b8 = cube.band("B08")
res = b2 - b8

This res cube will translate to a process graph with a reduce_dimension along the lines of:

{
  "process_id": "reduce_dimension",
  "arguments": {
    "dimension": "bands",
    "reducer": { "process_graph": {
        "arrayelement1": {
          "process_id": "array_element",
          "arguments": {"data": {"from_parameter": "data"}, "index": 1}
        },
        "arrayelement2": {
          "process_id": "array_element",
          "arguments": {"data": {"from_parameter": "data"}, "index": 7}
        },
        "subtract1": {
          "process_id": "subtract",

Note how band("B02") and band("B08") were translated to array_element(..., index=1) and array_element(..., index=7) respectively.

This index based array_element usage is quite brittle, as it depends on a predictable, consistent band order. In the context of an openeo federation, where federated collections from multiple participants are merged, it is unfortunately not straightforward to guarantee a predictable, consistent band listing (e.g. see https://github.com/Open-EO/openeo-aggregator/issues/147).
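To make the brittleness concrete, here is a minimal pure-Python sketch that does not involve the openeo client or any backend. The two band listings and the helper names (`resolve_by_index`, `resolve_by_label`) are made up for illustration. They only mimic what happens when the client resolves indices against one band order and a backend evaluates them against another.

```python
# Hypothetical band listings of the same collection, as two federation
# participants might report them (made up for illustration).
bands_provider_a = ["B01", "B02", "B03", "B04", "B05", "B06", "B07", "B08"]
bands_provider_b = ["B02", "B03", "B04", "B08", "B01", "B05", "B06", "B07"]

# The client resolves band names to indices against the listing it knows (A).
client_indices = {name: bands_provider_a.index(name) for name in ("B02", "B08")}


def resolve_by_index(band_order, index):
    """Mimic index based array_element: return whatever sits at that position."""
    return band_order[index]


def resolve_by_label(band_order, label):
    """Mimic label based array_element: look the band up by name."""
    if label not in band_order:
        raise KeyError(label)
    return label


# Same indices, evaluated against each provider's actual band order.
picked_a = [resolve_by_index(bands_provider_a, i) for i in client_indices.values()]
picked_b = [resolve_by_index(bands_provider_b, i) for i in client_indices.values()]

print(client_indices)  # {'B02': 1, 'B08': 7}
print(picked_a)        # ['B02', 'B08']
print(picked_b)        # ['B03', 'B07'] (silently the wrong bands)
print(resolve_by_label(bands_provider_b, "B08"))  # 'B08', regardless of order
```

With the index based lookup, the second listing silently yields the wrong bands without any error, while the label based lookup either finds the right band or fails loudly.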

Index based band math should be avoided as much as possible, and label based array_element should become the default. There are some reasons why it might still be useful to keep the index based approach: the backend does not support label based access (e.g. the Terrascope backend in the past), or the client might be confused about the actual band names (e.g. after multiple operations that possibly manipulate band names).
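As a rough sketch of what "label based by default, index based as fallback" could look like at the process graph level: the helper below builds an array_element node as a plain dict. `array_element_node` and its `use_label` / `band_names` parameters are hypothetical names for illustration and not part of the openeo client API. The only assumption about openEO itself is that array_element accepts a `label` argument as an alternative to `index`.

```python
def array_element_node(band, band_names=None, use_label=True):
    """Build an `array_element` process graph node for a single band.

    Hypothetical helper for illustration only (not part of the openeo client).
    """
    arguments = {"data": {"from_parameter": "data"}}
    if use_label:
        # Label based: independent of the band order on the backend side.
        arguments["label"] = band
    else:
        # Index based fallback: needs a band listing the client can trust.
        if band_names is None:
            raise ValueError("band_names is required for index based lookup")
        arguments["index"] = band_names.index(band)
    return {"process_id": "array_element", "arguments": arguments}


# Default: label based node.
label_node = array_element_node("B08")

# Explicit fallback, e.g. for a backend without label support.
s2_bands = ["B01", "B02", "B03", "B04", "B05", "B06", "B07", "B08"]
index_node = array_element_node("B08", band_names=s2_bands, use_label=False)

print(label_node["arguments"])
print(index_node["arguments"])
```

Making the fallback an explicit opt-in keeps the brittle behaviour visible in user code instead of being the silent default.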