Closed VincentVerelst closed 2 months ago
Abstracting it in a separate class is the way to go. I am however wondering if we should keep the abstraction where the full dataframe is always loaded, or if the 'storage tracker' can offer more granular functionality?
For instance, methods like (please improve my bad naming):

- get_jobs_by_status(max_jobs, status: List[str]): returns jobs for the given status list
- update_job_metadata(job_row): writes updated job state back to the database
- commit(): persists modifications to file or database
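The granular interface proposed above could be sketched roughly as follows. This is only an illustration of the idea, not the actual API: the class and method names (`JobDatabaseInterface`, `InMemoryJobDatabase`) are placeholders, and plain dicts stand in for dataframe rows to keep the sketch self-contained.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class JobDatabaseInterface(ABC):
    """Hypothetical granular storage interface (names are placeholders)."""

    @abstractmethod
    def get_jobs_by_status(self, max_jobs: int, status: List[str]) -> List[Dict]:
        """Return up to max_jobs job rows whose status is in the given list."""

    @abstractmethod
    def update_job_metadata(self, job_row: Dict) -> None:
        """Write an updated job row back to the store."""

    @abstractmethod
    def commit(self) -> None:
        """Persist pending modifications to file or database."""


class InMemoryJobDatabase(JobDatabaseInterface):
    """Trivial in-memory implementation, keyed by a 'job_id' field."""

    def __init__(self, rows: List[Dict]):
        self._rows = {r["job_id"]: dict(r) for r in rows}
        self.commits = 0

    def get_jobs_by_status(self, max_jobs: int, status: List[str]) -> List[Dict]:
        hits = [r for r in self._rows.values() if r["status"] in status]
        return hits[:max_jobs]

    def update_job_metadata(self, job_row: Dict) -> None:
        self._rows[job_row["job_id"]] = dict(job_row)

    def commit(self) -> None:
        # A real implementation would write to CSV/Parquet/DB here.
        self.commits += 1
```

The point of this shape is that the job manager never needs the full job table in memory: it asks for just the rows it can act on, pushes updates back row by row, and batches persistence behind commit().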
> I am however wondering if we should keep the abstraction where the full dataframe is always loaded,
Indeed, but that involves a bigger refactor (or maybe even a full rewrite) of the multi job manager, I guess. In this first PR we want to take a first step by abstracting out just the storage aspect, without fiddling too much with the rest.
@soxofaan yes, fine to keep this one limited and then move on to the bigger story. I mainly wanted to flag it to avoid that we put a lot of work into polishing this one, only to replace it with something else afterwards.
@soxofaan, as discussed earlier, I've abstracted away the read/write helper for CSV and Parquet files and allowed the user to supply a custom read/write helper (with an eye toward adding support for URLs, for which a user might need to provide a custom request method). An interim review from you would be helpful before I continue with URL support.
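The pluggable read/write helper described above could look something like this. Everything here is a sketch under assumptions: `ReadWriteHelper`, `CsvHelper`, and `UrlHelper` are illustrative names (not the merged API), the stdlib csv module stands in for pandas to keep the example self-contained, and `request_fn` is a hypothetical user-supplied callable for the URL case.

```python
import csv
from pathlib import Path
from typing import Callable, Dict, List, Protocol


class ReadWriteHelper(Protocol):
    """Illustrative shape of a pluggable read/write helper."""

    def read(self, target) -> List[Dict]: ...
    def write(self, rows: List[Dict], target) -> None: ...


class CsvHelper:
    """Default helper: plain CSV on a local path."""

    def read(self, target: Path) -> List[Dict]:
        with open(target, newline="") as f:
            return list(csv.DictReader(f))

    def write(self, rows: List[Dict], target: Path) -> None:
        with open(target, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)


class UrlHelper:
    """Hypothetical custom helper: the user injects their own request
    function (e.g. one that adds auth headers) to fetch CSV from a URL."""

    def __init__(self, request_fn: Callable[[str, str], str]):
        self._request = request_fn  # e.g. (method, url) -> response body

    def read(self, target: str) -> List[Dict]:
        body = self._request("GET", target)
        return list(csv.DictReader(body.splitlines()))

    def write(self, rows: List[Dict], target: str) -> None:
        raise NotImplementedError("depends on the backing service")
```

The storage class only talks to the `read`/`write` pair, so swapping CSV for Parquet (or for a URL-backed helper) does not touch the job-manager logic.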
FYI: I already pushed some tweaks of my own to address my latest notes
@soxofaan, for issue #571 I created a JobTrackerStorage to manage everything job-tracker related in the MultiBackendJobManager. For now it only supports storing as CSV to a PosixPath, so no extra functionality has been added. Would be great to have your feedback on this already, before adding functionality for storing as Parquet and storing to a URL as well.
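The CSV-to-PosixPath storage described above (load the full job table, hand it to the job manager, persist it back) could be sketched like this. This is not the actual class from the PR: the name `JobTrackerStorage` matches the comment, but the `load`/`persist` methods are illustrative, and the stdlib csv module stands in for pandas so the sketch stays self-contained.

```python
import csv
from pathlib import Path
from typing import Dict, List


class JobTrackerStorage:
    """Illustrative sketch: full-table CSV storage on a local path."""

    def __init__(self, path: Path):
        self._path = Path(path)

    def load(self) -> List[Dict]:
        """Read the whole job table; empty list if nothing stored yet."""
        if not self._path.exists():
            return []
        with open(self._path, newline="") as f:
            return list(csv.DictReader(f))

    def persist(self, rows: List[Dict]) -> None:
        """Write the whole job table back to disk."""
        with open(self._path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
```

This keeps the "full dataframe is always loaded" behaviour of the current job manager intact while isolating the storage aspect, which is exactly the limited first step agreed on above.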