Open-EO / openeo-gfmap

Generic framework for EO mapping applications building on openEO
Apache License 2.0

Usage of a thread pool instead of separate threads #58

Closed VictorVerhaert closed 2 months ago

VictorVerhaert commented 2 months ago

When using GFMap from a notebook, I run into issues with threads that are still running after run_jobs finishes.

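This could be resolved by using a thread pool (with a specified number of threads) and starting on_job_error and on_job_done through it, using submit on a concurrent.futures.ThreadPoolExecutor or apply_async on a multiprocessing.pool.ThreadPool. This would remove the need for the _post_job_worker.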

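The advantage is that we can wait for all tasks to finish before run_jobs returns. With a multiprocessing.pool.ThreadPool that means calling close() and then join() (close() has to come first). With a concurrent.futures.ThreadPoolExecutor, shutdown(wait=True) or simply leaving the with block does the same. Either way, no threads stay alive to cause unpredictable behavior.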
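
A minimal sketch of the multiprocessing.pool.ThreadPool variant, to show the close() then join() order. The on_job_done function, the job names and the pool size are hypothetical stand-ins, not GFMap code.

```python
import threading
from multiprocessing.pool import ThreadPool

results = []
results_lock = threading.Lock()


def on_job_done(job_id):
    # Hypothetical stand-in for the real post-job callback.
    with results_lock:
        results.append(job_id)


def run_all(job_ids, n_threads=2):
    pool = ThreadPool(processes=n_threads)
    try:
        for job_id in job_ids:
            # Queue the callback; a worker thread picks it up.
            pool.apply_async(on_job_done, (job_id,))
    finally:
        pool.close()  # stop accepting tasks; must be called before join()
        pool.join()   # block until every queued task has finished


run_all(["job-1", "job-2", "job-3"])
```

Note that apply_async does not surface exceptions from the callback unless get() is called on the returned AsyncResult, so a real implementation would probably want to keep those results around.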

example run_jobs (not tested):

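from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=self._n_threads) as pool:
    # Pass pool as an argument so jobs can be launched, or save it as a class variable.
    super().run_jobs(..., pool)
# Leaving the with block calls pool.shutdown(wait=True), which waits for all submitted tasks.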

and in _update_statuses:

pool.submit(on_job_done, self, job, row)
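
Put together, the idea could look roughly like the self-contained sketch below. SketchJobManager, its attributes and the sample jobs are made up for illustration and do not reflect the actual GFMap or openEO job manager API.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SketchJobManager:
    """Hypothetical stand-in for the GFMap job manager, not its real API."""

    def __init__(self, n_threads=2):
        self._n_threads = n_threads
        self._futures = []
        self._lock = threading.Lock()
        self.done = []

    def on_job_done(self, job, row):
        # Placeholder for the real post-processing of a finished job.
        with self._lock:
            self.done.append((job, row))

    def _update_statuses(self, pool, finished):
        # Hand each finished job to the pool instead of a dedicated worker thread.
        for job, row in finished:
            self._futures.append(pool.submit(self.on_job_done, job, row))

    def run_jobs(self, finished):
        with ThreadPoolExecutor(max_workers=self._n_threads) as pool:
            self._update_statuses(pool, finished)
        # The with block has waited for all tasks; result() re-raises
        # any exception that occurred inside a callback.
        for future in self._futures:
            future.result()


manager = SketchJobManager()
manager.run_jobs([("job-a", 0), ("job-b", 1)])
```

Keeping the futures makes errors in on_job_done visible to the caller instead of being silently dropped in a background thread.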

GriffinBabe commented 2 months ago

Fixed by #69