execute_local_udf uses different datacube shape compared to Terrascope backend

Open-EO / openeo-python-client

Python client API for OpenEO

https://open-eo.github.io/openeo-python-client/

Apache License 2.0


execute_local_udf uses different datacube shape compared to Terrascope backend #479

Closed: JeroenVerstraelen closed this issue 1 year ago

JeroenVerstraelen commented 1 year ago

I noticed that execute_local_udf uses a different dimension order for the datacube than the one used in the VITO/Terrascope backend.

Local: https://github.com/Open-EO/openeo-python-client/blob/master/openeo/udf/xarraydatacube.py#L329 uses ("t", "bands", "x", "y")

Backend: https://github.com/Open-EO/openeo-geopyspark-driver/blob/master/openeogeotrellis/geopysparkdatacube.py#L481 uses ("t", "bands", "y", "x")

This makes local UDF debugging different/confusing compared to running the UDF remotely.
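To illustrate the confusion, here is a minimal sketch using xarray and a toy cube (the sizes and variable names are made up for illustration, and this is not the actual client or backend code). Code that addresses axes by position silently reduces over a different dimension depending on which of the two orders it receives:

```python
import numpy as np
import xarray as xr

# Toy cube: 2 time steps, 3 bands, 4 rows (y), 5 columns (x). Sizes are arbitrary.
data = np.arange(2 * 3 * 4 * 5, dtype="float64").reshape(2, 3, 4, 5)
backend_cube = xr.DataArray(data, dims=("t", "bands", "y", "x"))

# Same values, but in the ("t", "bands", "x", "y") order reported for local execution.
local_cube = backend_cube.transpose("t", "bands", "x", "y")

# Positional code: "average over the last axis".
backend_result = backend_cube.values.mean(axis=-1)  # averages over x
local_result = local_cube.values.mean(axis=-1)      # averages over y

print(backend_result.shape)  # (2, 3, 4)
print(local_result.shape)    # (2, 3, 5)
```

The two results have different shapes and different meanings, even though the input values are identical.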

JeroenVerstraelen commented 1 year ago

If you read an image or NetCDF file into numpy/xarray, the array will also have shape (height, width), so I believe it would make more sense to the user if we used (t, bands, y, x) wherever possible.
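As a small illustration of that convention (a hand-built numpy array with a hypothetical size, not an actual file read): a 2D raster is laid out as rows first, columns second, so it is indexed as [y, x].

```python
import numpy as np

height, width = 2, 3  # hypothetical image size
image = np.zeros((height, width))

print(image.shape)  # (2, 3): rows (y) first, columns (x) second

row, col = 1, 2
image[row, col] = 1.0  # indexed as [y, x]
```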

JeroenVerstraelen commented 1 year ago

In the last discussion it was decided that users should write code that works independently of the dimension order of the input or output datacube, so they should use named accessors as much as possible. If they return a datacube that is not in (t, bands, y, x) order, the backend will automatically transpose it to the correct order (using the dimension names as a guide).
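A minimal sketch of both ideas, assuming plain xarray DataArrays (the function names remove_temporal_mean and to_canonical_order are hypothetical, and this is not the geopyspark-driver implementation): the UDF logic refers to dimensions by name only, and a name-based transpose restores the (t, bands, y, x) order afterwards.

```python
import numpy as np
import xarray as xr

CANONICAL_ORDER = ("t", "bands", "y", "x")  # order named in the discussion


def remove_temporal_mean(cube: xr.DataArray) -> xr.DataArray:
    """Hypothetical UDF logic: uses the dimension name "t", never an axis index."""
    return cube - cube.mean(dim="t")


def to_canonical_order(cube: xr.DataArray) -> xr.DataArray:
    """Reorder dimensions by name, whatever order the cube arrives in."""
    return cube.transpose(*CANONICAL_ORDER)


# Toy data with arbitrary sizes, once in each dimension order.
data = np.arange(2 * 3 * 4 * 5, dtype="float64").reshape(2, 3, 4, 5)
backend_cube = xr.DataArray(data, dims=CANONICAL_ORDER)
local_cube = backend_cube.transpose("t", "bands", "x", "y")

result_backend = to_canonical_order(remove_temporal_mean(backend_cube))
result_local = to_canonical_order(remove_temporal_mean(local_cube))

print(result_local.dims)
print(np.array_equal(result_backend.values, result_local.values))
```

Because nothing in the function depends on axis positions, both input orders should produce the same values once the output is transposed by name.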

In the geopyspark-driver we already do this for spatiotemporal UDFs but not yet for spatial UDFs: https://github.com/Open-EO/openeo-geopyspark-driver/blob/master/openeogeotrellis/geopysparkdatacube.py#L764

I will push a commit for this soon.

soxofaan commented 1 year ago

I think this is now fixed and can be closed.