Open-EO / openeo-geopyspark-driver

OpenEO driver for GeoPySpark (Geotrellis)

Apache License 2.0

Reduce wait time for small batch jobs by running synchronously #369

Open jdries opened 1 year ago

jdries commented 1 year ago

Batch jobs need to be scheduled on YARN/Kubernetes, which introduces an overhead that is hard to avoid. If certain conditions are met, we could instead run batch jobs in the same way as synchronous jobs:

- small input size
- no UDFs
- no custom job options that require increased memory? (Spark's stage-level scheduling might help remove this requirement)
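
As a rough illustration of the conditions listed above, the check could look something like the sketch below. This is not code from the driver: `can_run_synchronously`, the pixel threshold and the set of memory-related option names are all hypothetical, and how the input size would actually be estimated is left open. It only assumes that the process graph is the usual openEO JSON structure, in which UDFs show up as `run_udf` nodes.

```python
from typing import Any, Iterator

# Hypothetical values, chosen purely for illustration.
MAX_SYNC_INPUT_PIXELS = 10_000_000
MEMORY_OPTIONS = {"driver-memory", "executor-memory"}


def iter_process_ids(node: Any) -> Iterator[str]:
    """Yield every process_id in a process graph, including nested callbacks."""
    if isinstance(node, dict):
        if isinstance(node.get("process_id"), str):
            yield node["process_id"]
        for value in node.values():
            yield from iter_process_ids(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_process_ids(item)


def can_run_synchronously(
    process_graph: dict, job_options: dict, estimated_input_pixels: int
) -> bool:
    """Return True if a batch job looks small and simple enough to run in-process."""
    # Condition 1: small input size (the estimate is assumed to be given).
    if estimated_input_pixels > MAX_SYNC_INPUT_PIXELS:
        return False
    # Condition 2: no UDFs anywhere in the graph.
    if any(pid == "run_udf" for pid in iter_process_ids(process_graph)):
        return False
    # Condition 3: no job options that ask for custom memory settings.
    if MEMORY_OPTIONS & set(job_options):
        return False
    return True
```

A job that fails any of these checks would simply follow the normal batch scheduling path.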

soxofaan commented 1 year ago

I've also been thinking about this, and I think it was also brought up during that brainstorm day last year.

An in-between solution could be to maintain a pool of longer-lived batch job workers as applications on the cluster, which just execute batch job tasks (e.g. picked from a central queue) and stay alive in between tasks. You then save the overhead of starting a job on the cluster, while still keeping isolation from the web app.
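
To make the worker pool idea a bit more concrete, here is a minimal in-process sketch. It uses threads and a `queue.Queue` as stand-ins for long-lived cluster applications and a central queue, so it only shows the control flow, not the isolation; `worker` and `run_pool` are hypothetical names, and a real setup would need an external queue plus actual cluster workers.

```python
import queue
import threading
from typing import Callable, Dict, Tuple


def worker(tasks: queue.Queue, results: dict, name: str) -> None:
    """Long-lived worker: keeps taking tasks until it receives a None sentinel."""
    while True:
        task = tasks.get()
        if task is None:  # sentinel: shut this worker down
            tasks.task_done()
            break
        job_id, func = task
        results[job_id] = (name, func())  # record which worker ran the job
        tasks.task_done()


def run_pool(jobs: Dict[str, Callable[[], object]], num_workers: int = 2) -> Dict[str, Tuple[str, object]]:
    """Start the workers once, feed them all jobs, then shut them down."""
    tasks: queue.Queue = queue.Queue()
    results: Dict[str, Tuple[str, object]] = {}
    workers = [
        threading.Thread(target=worker, args=(tasks, results, f"worker-{i}"))
        for i in range(num_workers)
    ]
    for w in workers:
        w.start()
    for job_id, func in jobs.items():
        tasks.put((job_id, func))
    for _ in workers:
        tasks.put(None)  # one sentinel per worker
    tasks.join()
    for w in workers:
        w.join()
    return results
```

The point of the sketch is that the startup cost is paid once per worker rather than once per job: each worker stays alive and picks up the next task as soon as it finishes the previous one.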