Open-EO / openeo-processes

Interoperable processes for openEO's big Earth observation cloud processing.
https://processes.openeo.org
Apache License 2.0

run_udf on vector cubes: define allowed Python representations #398

Closed jdries closed 1 year ago

jdries commented 1 year ago

We need to standardize on a Python representation for vector cubes, specifically for use with UDFs.

For rasters, we use Xarray, and to some extent this would also work for vector cubes, except that geometries are probably not supported as dimension labels. GeoPandas would be another candidate.

We currently use a simple Python dict, but this is basically an entirely custom data structure that the user needs to know.
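To make the concern about a custom dict concrete, here is a minimal sketch. The dict layout actually used by the backend is not given in this thread, so the keys (`geometries`, `t`, `values`), the WKT strings, and the helper `mean_per_geometry` below are all hypothetical and only illustrate the kind of convention a UDF author would have to learn.

```python
# Hypothetical layout: NOT the actual structure used by any openEO backend.
# Geometries are stored as WKT strings; values[i][j] belongs to
# geometry i and timestamp j.
vector_cube = {
    "geometries": ["POINT (5.0 51.2)", "POINT (4.4 50.9)"],
    "t": ["2023-01-01", "2023-02-01"],
    "values": [[1.0, 3.0], [2.0, 6.0]],
}


def mean_per_geometry(cube):
    """Average the values over time for each geometry (hypothetical helper)."""
    return {
        geom: sum(row) / len(row)
        for geom, row in zip(cube["geometries"], cube["values"])
    }


result = mean_per_geometry(vector_cube)
```

Nothing in the dict itself says which axis of `values` is the geometry axis and which is the time axis; that knowledge lives only in documentation, which is the drawback described above.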

m-mohr commented 1 year ago

I'm not sure that this is something that the processes need to define. Isn't this specific to the UDF runtime? In the R UDFs, you'd get a stars object and have native vector cube support. In Python it is not natively supported yet, but there seems to have been some recent activity (https://discourse.pangeo.io/t/vector-data-cubes/2904), which resulted in https://github.com/martinfleis/xvec. So should we move this issue to the Python UDF repo?

jdries commented 1 year ago

Interesting! The problem is that it somehow needs to be standardized, and the 'Python UDF runtime' repo is basically unmaintained. The UDF-related classes that are considered public API are now in the openEO Python client. I could move the issue there, but I wanted to specifically discuss standardization here...

Of course, given that Xarray is already supported by the current API, it's perhaps the easiest way to go, and it will allow easy migration to something like xvec.
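A minimal sketch of what an Xarray-based vector cube could look like, assuming a `geometry` dimension whose labels are WKT strings as a stand-in for real geometry objects. The dimension names (`geometry`, `t`), the variable name `ndvi`, and the sample values are assumptions made for illustration; this is not an agreed openEO convention and does not use xvec.

```python
import numpy as np
import xarray as xr

# Assumption: geometries are encoded as WKT strings so they can serve as
# ordinary dimension labels without any geometry-aware index.
wkt_labels = ["POINT (5.0 51.2)", "POINT (4.4 50.9)"]
timestamps = np.array(
    ["2023-01-01", "2023-02-01", "2023-03-01"], dtype="datetime64[ns]"
)

# One value per (geometry, timestamp) pair.
values = np.arange(6, dtype="float64").reshape(2, 3)

cube = xr.DataArray(
    values,
    dims=("geometry", "t"),
    coords={"geometry": wkt_labels, "t": timestamps},
    name="ndvi",
)

# Typical UDF-style operations: reduce over time, or select one geometry
# by its label.
mean_per_geometry = cube.mean(dim="t")
first_series = cube.sel(geometry="POINT (5.0 51.2)")
```

Because the cube is a plain `xarray.DataArray`, a UDF written against it uses the same API as a raster UDF; swapping the string labels for real geometry objects later would change the coordinate values, not the overall shape of the data structure.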

m-mohr commented 1 year ago

Go ahead with whatever you think is required. I don't think I can help a lot here. Not sure what the best place to discuss this is, but I'm relatively sure documenting it in the processes is not the right way as it would otherwise also need documentation for all other (potential) UDF runtimes. So this should really go into the UDF runtime documentation IMHO.