Closed GriffinBabe closed 1 month ago
Whenever a crash happens in the user code, the GFMAP manager loses its STAC collection progress, as the collection is only written once the manager finishes its jobs.
One temporary way of tackling that would be to simply add a try/except clause as such:
    try:
        manager.run_jobs(job_df, create_datacube_optical, tracking_df_path)
    except Exception as e:
        _pipeline_log.error("Error during the job execution: %s", e)
    finally:
        manager.create_stac(constellation='sentinel2', item_assets={"auxiliary": AUXILIARY})
This should in theory save only fully initialized STAC items (crashes are expected to come from the output_path_gen, post_job_action, and create_job user functions, all of which are called before any item is added to the collection):
self._root_collection.add_items(job_items)
@VincentVerelst However, I was thinking that it might be better to call the create_stac function automatically within the manager, so that STAC is handled automatically during a crash. The usage of a job manager could look like this:
manager = GFMAPJobManager(...)
manager.setup_stac(constellation='sentinel2', item_assets={'auxiliary': AUXILIARY})
manager.run_jobs(...)  # Will call _create_stac internally
Tell me what you think 😄
@GriffinBabe, sounds like a good idea! I don't see any benefit in the user having to call create_stac themselves. I also like the idea of having a setup_stac. Maybe we can also make this one optional? I.e. the user only needs to call it if they want to change the STAC metadata; otherwise GFMap will generate a default STAC collection based on which constellation is selected.
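The flow discussed above could be sketched roughly as follows. This is only an illustration of the control flow, not GFMap code: `SketchJobManager`, `DEFAULT_STAC_METADATA`, and the JSON file layout are hypothetical stand-ins, and the real manager would build a proper STAC collection instead of dumping a dictionary.

```python
import json
from pathlib import Path

# Hypothetical per-constellation defaults; the real GFMap defaults are not shown in this thread.
DEFAULT_STAC_METADATA = {
    "sentinel2": {"constellation": "sentinel2", "item_assets": {}},
}


class SketchJobManager:
    """Toy stand-in for GFMAPJobManager that only illustrates the control flow."""

    def __init__(self, stac_path, constellation="sentinel2"):
        self._stac_path = Path(stac_path)
        # Default metadata is used unless setup_stac() overrides it.
        self._stac_metadata = dict(DEFAULT_STAC_METADATA[constellation])
        self._items = []

    def setup_stac(self, constellation, item_assets=None):
        """Optional call: override the default STAC metadata."""
        self._stac_metadata = {
            "constellation": constellation,
            "item_assets": item_assets or {},
        }

    def run_jobs(self, jobs, post_job_action):
        try:
            for job in jobs:
                # User code runs first; if it raises, nothing is added for this job.
                job_items = post_job_action(job)
                self._items.extend(job_items)
        finally:
            # Runs on success and on a crash, so items from finished jobs are kept.
            self._create_stac()

    def _create_stac(self):
        # Write the metadata together with every item collected so far.
        payload = {**self._stac_metadata, "items": self._items}
        self._stac_path.write_text(json.dumps(payload, indent=2))
```

With this shape, a user who is happy with the defaults only calls `run_jobs`, and a crash in `post_job_action` still propagates to the caller after the items collected so far have been written.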
The GFMAPJobManager can crash, not necessarily because of errors on the gfmap side, but also because of bad user code in post-job actions.
MultiBackendJobManager behavior. At the moment, persistence is done through the job_tracking.csv file and the base logic in the MultiBackendJobManager:
https://github.com/Open-EO/openeo-python-client/blob/master/openeo/extra/job_management.py#L32