Open-EO / openeo-geopyspark-driver

OpenEO driver for GeoPySpark (Geotrellis)
Apache License 2.0

Automatic UDF dependency handling in synchronous processing #791

Open soxofaan opened 3 weeks ago

soxofaan commented 3 weeks ago

Spin-off from https://github.com/Open-EO/openeo-geopyspark-driver/issues/237:

https://github.com/Open-EO/openeo-geopyspark-driver/issues/237 added automatic Python UDF dependency handling for batch jobs.

Do we need/want the same for synchronous processing? Batch jobs run in isolation, so it's straightforward to do per-job dependency handling. With synchronous processing we don't (yet) have the same level of isolation, so automatic dependency handling is out of the question at the moment.

At the very least, we should give users a cleaner warning/error message when their UDF appears to expect automatic dependency handling in synchronous processing.

jdries commented 1 week ago

I would go for the error message + clear documentation. We also have to take security into account here, so better not to rush this.
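
As a rough illustration of the "clean error message" option discussed above, here is a minimal sketch of such a check. It assumes that UDF dependencies are declared as a PEP 723-style inline metadata comment block (`# /// script` ... `# ///`) at the top of the UDF source; that is an assumption, not something stated in this thread. The names `declares_dependencies`, `check_sync_udf` and `UdfDependenciesNotSupported` are hypothetical and not part of the actual driver.

```python
import re

# Regex for inline script metadata blocks, taken from the PEP 723 reference implementation.
_METADATA_BLOCK = re.compile(
    r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
)


class UdfDependenciesNotSupported(Exception):
    """Hypothetical error type, used here for illustration only."""


def declares_dependencies(udf_code: str) -> bool:
    """Return True if the UDF source has a `script` metadata block with a `dependencies` key."""
    for match in _METADATA_BLOCK.finditer(udf_code):
        if match.group("type") != "script":
            continue
        # Strip the leading "# " (or bare "#") comment prefix from each content line.
        lines = [
            line[2:] if line.startswith("# ") else line[1:]
            for line in match.group("content").splitlines()
        ]
        # Simple textual check; a real implementation would parse the block as TOML.
        if any(re.match(r"\s*dependencies\s*=", line) for line in lines):
            return True
    return False


def check_sync_udf(udf_code: str) -> None:
    """Hypothetical guard to call before running a UDF in synchronous processing."""
    if declares_dependencies(udf_code):
        raise UdfDependenciesNotSupported(
            "This UDF declares dependencies, but automatic UDF dependency handling "
            "is only available in batch jobs, not in synchronous processing. "
            "Consider running this process graph as a batch job instead."
        )
```

A guard like this only detects the declaration and fails early with an explanation; it does not install anything, so it leaves the isolation and security questions raised above untouched.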