Job system (aka celery beat schedule in database)

The goal of this task is to build an administration interface that provides an overview of the jobs running in the system, shows their output/results, and allows starting, stopping, and scheduling jobs.

Examples of jobs could be:

Harvest GND/CORDIS/OpenAIRE
File integrity check
Metadata checker
Sitemap updater
DataCite updater
Data retention lifecycle

This differs from monitoring Celery with Flower: Flower shows every individual task, which makes it very hard to find a given task and its output, and it has no proper persistence.

Resources

CERN has allocated time to work on this during May.

UI

Job overview

List all configured jobs with the last run, who started the last run, and when the next run is scheduled.
Allow a job to be run manually (optionally at a set time and with changed settings) and stopped. Allow changing its schedule and other settings, cloning or deleting it, and creating a new job.
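
As a rough illustration of the data the list view needs, here is a minimal Python sketch. The `Job` and `Run` classes, their fields, and `overview_row` are hypothetical names, not an existing Invenio or Celery API, and the schedule is simplified to a fixed interval (a real schedule would more likely be a crontab expression).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Run:
    # One execution of a job (hypothetical fields).
    started_at: datetime
    started_by: str  # user name, or "system" for scheduled runs
    status: str      # e.g. "success", "failed", "running"


@dataclass
class Job:
    # A configured job; `interval` is a simplified stand-in for a schedule.
    name: str
    task: str                      # dotted name of the task to run
    interval: Optional[timedelta]  # None means "manual runs only"
    active: bool = True


def overview_row(job, runs):
    """Build the row shown in the job list from a job and its runs."""
    last = max(runs, key=lambda r: r.started_at, default=None)
    next_run = None
    # Assumption: the next run is counted from the start of the last run.
    if job.active and job.interval is not None and last is not None:
        next_run = last.started_at + job.interval
    return {
        "name": job.name,
        "last_run": last.started_at if last else None,
        "started_by": last.started_by if last else None,
        "next_run": next_run,
    }
```

Keeping runs separate from the job definition means cloning or rescheduling a job does not touch its history.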

Job detail

The view of an individual job should provide an overview of the last month's (or another timeframe) runs of the job, and the summary output.
Offer the same job management actions as the list view.
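
The detail view's run history could be computed along these lines. This is only a sketch: `summarize_runs` is a hypothetical helper, runs are represented as plain dictionaries with assumed `started_at` and `status` keys, and the 30-day default stands in for "the last month".

```python
from collections import Counter
from datetime import datetime, timedelta


def summarize_runs(runs, now, days=30):
    """Return runs started in the last `days` days (newest first) and status counts."""
    cutoff = now - timedelta(days=days)
    recent = sorted(
        (r for r in runs if r["started_at"] >= cutoff),
        key=lambda r: r["started_at"],
        reverse=True,
    )
    # Count how many recent runs ended in each status.
    counts = Counter(r["status"] for r in recent)
    return recent, dict(counts)
```

Passing `now` explicitly, instead of reading the clock inside the function, keeps the timeframe easy to change and the function easy to test.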

Job detail logs (out of scope)

Eventually it would be nice to be able to get the full logging output of a job, to inspect in detail what happened.

Design

Overall, the goal is to move the celery beat schedule into the database so it can be managed from there, and to provide a small wrapping layer around a celery task that takes care of output logging and similar bookkeeping for the task.
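
One possible shape for the wrapping layer is a decorator that records each run's status and output. The sketch below is an assumption-laden illustration, not the planned implementation: `logged_job` and `RUN_LOG` are hypothetical names, the in-memory list stands in for a database table, and plain functions stand in for celery tasks.

```python
import functools
import traceback
from datetime import datetime, timezone

RUN_LOG = []  # stand-in for a database table of job runs


def logged_job(job_name):
    """Wrap a callable so that every call is recorded as a run of `job_name`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            record = {
                "job": job_name,
                "started_at": datetime.now(timezone.utc),
                "status": "running",
                "output": None,
            }
            RUN_LOG.append(record)
            try:
                result = func(*args, **kwargs)
            except Exception:
                # Keep the traceback as the run's output, then re-raise.
                record["status"] = "failed"
                record["output"] = traceback.format_exc()
                raise
            else:
                record["status"] = "success"
                record["output"] = repr(result)
                return result
            finally:
                record["finished_at"] = datetime.now(timezone.utc)
        return wrapper
    return decorator
```

Because the record is appended before the wrapped function is called, a run that is still in progress already shows up with status "running".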
