Open-EO / openeo-geotrellis-extensions

Java/Scala extensions for Geotrellis, for use with OpenEO GeoPySpark backend.
Apache License 2.0

Add array_apply. #158

Closed EmileSonneveld closed 1 year ago

EmileSonneveld commented 1 year ago

It was easier for me to test this on openeo-dev.vito.be. Using array_apply produces the same netCDF result as calling the function directly:

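import os

import openeo
from openeo.processes import cos

# Assumption: the connection setup is not part of the original snippet;
# it targets the dev backend mentioned above.
connection = openeo.connect("openeo-dev.vito.be").authenticate_oidc()

# The returned datacube has multiple bands and multiple timestamps.
datacube = connection.load_collection(
    "TERRASCOPE_S2_TOC_V2",
    spatial_extent={"west": 5.07, "south": 51.215, "east": 5.08, "north": 51.22},
    temporal_extent=["2020-03-01", "2020-03-30"],
    bands=["B01", "B02", "B03"],
)
# process = lambda x: cos(x)  # <- baseline
process = lambda x: x.array_apply(cos)
# Apply the callback along the time dimension and run it as a batch job.
datacube_applied = datacube.apply_dimension(dimension="t", process=process)
job = datacube_applied.execute_batch(
    title=os.path.basename(__file__),
    format="netCDF",
)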
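
The equivalence claimed above can be illustrated locally, without a backend. The following is a minimal sketch, assuming the openEO process definition in which the array_apply callback receives `x`, `index`, `label` and `context`. The `array_apply` function here is a hypothetical pure-Python stand-in, not the Geotrellis or GeoPySpark implementation.

```python
import math
from typing import Any, Callable, List, Optional, Sequence


def array_apply(
    data: Sequence[float],
    process: Callable[[float, int, Any, Any], float],
    labels: Optional[Sequence[Any]] = None,
    context: Any = None,
) -> List[float]:
    """Hypothetical local model of array_apply: call `process` once per element."""
    if labels is None:
        # Unlabeled arrays pass None as the label of every element.
        labels = [None] * len(data)
    return [
        process(x, index, label, context)
        for index, (x, label) in enumerate(zip(data, labels))
    ]


# A made-up pixel time series (values are illustrative only).
series = [0.0, 0.5, 1.0]

# Baseline: apply cos directly to every value.
baseline = [math.cos(x) for x in series]

# Same computation routed through the array_apply stand-in.
applied = array_apply(series, lambda x, index, label, context: math.cos(x))

assert applied == baseline
```

The callback ignores `index`, `label` and `context` here, which is why both paths agree; a callback that uses the index would have no direct elementwise equivalent.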

@jdries Is this enough for a test? [image]

jdries commented 1 year ago

Yes, I assume we also have unit tests, right? Then we can merge, and we can follow up with a more advanced case using rank composites.

EmileSonneveld commented 1 year ago

@jdries I added unit tests on both the JVM side and the Python side: https://github.com/Open-EO/openeo-geotrellis-extensions/commit/1876e806c73871505212e4add44712c687a7ab04 and https://github.com/Open-EO/openeo-geopyspark-driver/commit/1387884b14cee8160b8776c050ecd1b1522cc558

For this ticket I worked directly on the main branches, because I only saw a way to test datacubes with multiple dates once the code was deployed.