Open-EO / openeo-geotrellis-extensions

Java/Scala extensions for Geotrellis, for use with OpenEO GeoPySpark backend.
Apache License 2.0

implement array_apply #154

Open jdries opened 1 year ago

jdries commented 1 year ago

Add support for array_apply to OpenEOProcessScriptBuilder: https://processes.openeo.org/#array_apply

EmileSonneveld commented 1 year ago

@jdries, would this be a good example of how array_apply should be used?

precipitation_dc = connection.load_collection(
    "AGERA5",
    temporal_extent=["2023-01-01", "2023-01-30"],
    bands=["precipitation-flux"],
)

precipitation_band = precipitation_dc.band("precipitation-flux")

mean_rain = precipitation_band.aggregate_spatial(
    geometries=input_geojson,
    reducer="mean",
)

mean_rain_applied = mean_rain.array_apply("cos")  # <---------- HERE

job = mean_rain_applied.execute_batch(
    title=os.path.basename(__file__),
    format="json",
)

jdries commented 1 year ago

Rather something like:

precipitation_band.apply_dimension(dimension='t', process=lambda x: x.array_apply(lambda y: cos(y)))
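
For reference, the openEO specification describes array_apply as running a callback on every element of an array, passing the element as `x` together with its `index`, its `label` (for labeled arrays) and an optional `context`. The following is a minimal pure-Python sketch of those semantics only; it does not use the openEO client or the backend, and the helper name `array_apply_sketch` is hypothetical.

```python
import math
from typing import Any, Callable, Optional, Sequence


def array_apply_sketch(
    data: Sequence[float],
    process: Callable[..., float],
    labels: Optional[Sequence[Any]] = None,
    context: Any = None,
) -> list:
    """Apply `process` to each element, mimicking the array_apply parameters."""
    result = []
    for index, x in enumerate(data):
        # Unlabeled arrays pass label=None to the callback.
        label = labels[index] if labels is not None else None
        result.append(process(x=x, index=index, label=label, context=context))
    return result


# Element-wise cosine over a (hypothetical) time series of a single pixel.
series = [0.0, math.pi]
cosines = array_apply_sketch(series, lambda x, **kwargs: math.cos(x))
```

In the apply_dimension call above, the outer callback would receive the values along 't' as the array, and the inner callback would play the role of `process` here.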

jdries commented 1 year ago

Some additional pointers:

- Function that you want to end up using: geotrellis.raster.MultibandTile#mapBands
- The array_ family of functions can provide some inspiration for the implementation: https://github.com/Open-EO/openeo-geotrellis-extensions/blob/5256279fa7f5788174a4b842ca9300296a5f77f3/openeo-geotrellis/src/main/scala/org/openeo/geotrellis/OpenEOProcessScriptBuilder.scala#L897

jdries commented 1 year ago

One note: add it to the changelog, and make sure it shows up in the list of processes supported by the backend. I think that list lives in openeo-python-driver.

EmileSonneveld commented 1 year ago

I added it to CHANGELOG.md. Oddly enough, array_apply was already listed in the documentation: [image]

EmileSonneveld commented 1 year ago

The code is not completely ready yet. As @soxofaan noted, the sub-process graph passed to array_apply seems to be merged into the main process graph, so the two could get mixed up in unpredictable ways.

jdries commented 1 year ago

Here's a process graph that doesn't seem to work yet:

{
  "process_graph": {
    "applyneighborhood1": {
      "arguments": {
        "data": {
          "from_node": "ndvi1"
        },
        "overlap": [],
        "process": {
          "process_graph": {
            "arrayapply1": {
              "arguments": {
                "data": {
                  "from_parameter": "data"
                },
                "process": {
                  "process_graph": {
                    "add1": {
                      "arguments": {
                        "x": {
                          "from_parameter": "x"
                        },
                        "y": 20
                      },
                      "process_id": "add",
                      "result": true
                    }
                  }
                }
              },
              "process_id": "array_apply",
              "result": true
            }
          }
        },
        "size": [
          {
            "dimension": "x",
            "unit": "px",
            "value": 1
          },
          {
            "dimension": "y",
            "unit": "px",
            "value": 1
          },
          {
            "dimension": "t",
            "value": "month"
          }
        ]
      },
      "process_id": "apply_neighborhood"
    },
    "filterbbox1": {
      "arguments": {
        "data": {
          "from_node": "applyneighborhood1"
        },
        "extent": {
          "crs": "epsg:4326",
          "east": 4.5,
          "north": 51.17,
          "south": 51.16,
          "west": 4.45
        }
      },
      "process_id": "filter_bbox"
    },
    "loadcollection1": {
      "arguments": {
        "bands": [
          "B04",
          "B08"
        ],
        "id": "SENTINEL2_L2A",
        "properties": {
          "eo:cloud_cover": {
            "process_graph": {
              "lte1": {
                "arguments": {
                  "x": {
                    "from_parameter": "value"
                  },
                  "y": 85
                },
                "process_id": "lte",
                "result": true
              }
            }
          }
        },
        "spatial_extent": null,
        "temporal_extent": [
          "2022-06-04",
          "2022-08-04"
        ]
      },
      "process_id": "load_collection"
    },
    "loadcollection2": {
      "arguments": {
        "bands": [
          "SCL"
        ],
        "id": "SENTINEL2_L2A",
        "properties": {
          "eo:cloud_cover": {
            "process_graph": {
              "lte2": {
                "arguments": {
                  "x": {
                    "from_parameter": "value"
                  },
                  "y": 85
                },
                "process_id": "lte",
                "result": true
              }
            }
          }
        },
        "spatial_extent": null,
        "temporal_extent": [
          "2022-06-04",
          "2022-08-04"
        ]
      },
      "process_id": "load_collection"
    },
    "mask1": {
      "arguments": {
        "data": {
          "from_node": "loadcollection1"
        },
        "mask": {
          "from_node": "toscldilationmask1"
        }
      },
      "process_id": "mask"
    },
    "ndvi1": {
      "arguments": {
        "data": {
          "from_node": "mask1"
        },
        "nir": "B08",
        "red": "B04"
      },
      "process_id": "ndvi"
    },
    "saveresult1": {
      "arguments": {
        "data": {
          "from_node": "filterbbox1"
        },
        "format": "netCDF",
        "options": {}
      },
      "process_id": "save_result",
      "result": true
    },
    "toscldilationmask1": {
      "arguments": {
        "data": {
          "from_node": "loadcollection2"
        },
        "erosion_kernel_size": 3,
        "kernel1_size": 17,
        "kernel2_size": 77,
        "mask1_values": [
          2,
          4,
          5,
          6,
          7
        ],
        "mask2_values": [
          3,
          8,
          9,
          10,
          11
        ]
      },
      "process_id": "to_scl_dilation_mask"
    }
  }
}

jdries commented 1 year ago

OK, I didn't notice this before, but array_apply support is explicitly disabled, which is why it isn't working.

jdries commented 1 year ago

I just committed some improvements to support the index and label parameters in array_apply. What is still open is evaluating the 'process' callback in an isolated context rather than as part of the main process graph evaluation.
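
To illustrate what "isolated context" could mean here, the following is a rough Python sketch, not the Scala implementation in OpenEOProcessScriptBuilder. It evaluates the array_apply callback as its own small process graph, with a fresh node cache and a parameter scope holding only `x`, `index`, `label` and `context`. The `PROCESSES` table and the function names `evaluate` and `array_apply` are hypothetical and cover just enough to run the `add1` callback from the process graph above.

```python
from typing import Any, Dict, List, Optional

# Hypothetical subset of processes; a real backend supports many more.
PROCESSES = {
    "add": lambda args: args["x"] + args["y"],
    "multiply": lambda args: args["x"] * args["y"],
}


def evaluate(graph: Dict[str, dict], params: Dict[str, Any]) -> Any:
    """Evaluate a flat process graph using only its own parameter scope."""
    cache: Dict[str, Any] = {}  # fresh per call, so nothing leaks between graphs

    def resolve(value: Any) -> Any:
        if isinstance(value, dict) and "from_parameter" in value:
            return params[value["from_parameter"]]
        if isinstance(value, dict) and "from_node" in value:
            return run(value["from_node"])
        return value  # plain constant

    def run(node_id: str) -> Any:
        if node_id not in cache:
            node = graph[node_id]
            args = {k: resolve(v) for k, v in node["arguments"].items()}
            cache[node_id] = PROCESSES[node["process_id"]](args)
        return cache[node_id]

    # The node flagged with "result": true produces the graph's output.
    result_id = next(k for k, n in graph.items() if n.get("result"))
    return run(result_id)


def array_apply(data: List[Any], callback: Dict[str, dict],
                labels: Optional[List[Any]] = None, context: Any = None) -> List[Any]:
    out = []
    for index, x in enumerate(data):
        # The callback sees only these four parameters, never the parent's.
        scope = {"x": x, "index": index,
                 "label": labels[index] if labels else None, "context": context}
        out.append(evaluate(callback, scope))
    return out


# The inner callback of "arrayapply1" from the process graph above.
callback = {
    "add1": {
        "process_id": "add",
        "arguments": {"x": {"from_parameter": "x"}, "y": 20},
        "result": True,
    }
}
shifted = array_apply([1, 2, 3], callback)
```

Because each callback invocation gets its own cache and scope, a node id or parameter name reused in the parent graph cannot be picked up by accident inside the callback.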