Each job received by the orchestrator contains a timeout (db_models/job.py:JobDB.timeout_after_ms). We consider a job to have timed out if datetime.now() > JobDB.running_at + timedelta(milliseconds=JobDB.timeout_after_ms). Jobs may be in various states, so only jobs where JobDB.status == JobStatus.RUNNING are considered.
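A minimal sketch of the timeout check described above. It assumes running_at is a naive datetime comparable with datetime.now() and timeout_after_ms is an integer number of milliseconds. The JobStatus and JobDB classes below are stand-ins for the real ones in db_models/job.py, whose fields may differ; has_timed_out is a hypothetical helper name.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class JobStatus(Enum):  # stand-in for the real enum
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"


@dataclass
class JobDB:  # stand-in for db_models/job.py:JobDB
    status: JobStatus
    running_at: Optional[datetime]
    timeout_after_ms: int


def has_timed_out(job: JobDB, now: Optional[datetime] = None) -> bool:
    """Return True only for RUNNING jobs whose deadline has passed."""
    if job.status != JobStatus.RUNNING or job.running_at is None:
        return False
    now = now or datetime.now()
    # timedelta takes `milliseconds=`, there is no `ms=` keyword.
    return now > job.running_at + timedelta(milliseconds=job.timeout_after_ms)
```

Passing `now` explicitly keeps the check easy to unit test without patching the clock.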
Once the orchestrator and this component have started, the component should only start timing out workflows AFTER the orchestrator has been online for 60 seconds. Otherwise, jobs that were completed while the orchestrator was offline might be timed out immediately upon startup.
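One possible way to track the startup delay is a small helper that records when the component was started and reports whether the grace period has elapsed. This is only a sketch: StartupGracePeriod and its method names are hypothetical, and it uses a monotonic clock on the assumption that wall-clock jumps should not shorten or extend the delay.

```python
import time
from typing import Callable, Optional


class StartupGracePeriod:
    """Tracks whether enough time has passed since start() was called."""

    def __init__(
        self,
        grace_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._grace_seconds = grace_seconds
        self._clock = clock  # injectable so tests can fake time
        self._started_at: Optional[float] = None

    def start(self) -> None:
        """Record the moment the component came online."""
        self._started_at = self._clock()

    def is_over(self) -> bool:
        """False until start() was called and grace_seconds have elapsed."""
        if self._started_at is None:
            return False
        return self._clock() - self._started_at >= self._grace_seconds
```

The component's polling loop would skip timing out jobs for as long as is_over() returns False.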
This component should accept the PostgresInterface and Orchestrator as constructor arguments. The component MUST be isolated from other components in the orchestrator.
Please make sure to put any constants in logical places and make the 60-second timeframe configurable. We use .env files for configuration, so please extend .env-template with any new configuration parameters.
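As an illustration of making the delay configurable, the sketch below reads it from the process environment with a default of 60 seconds. The variable name JOB_TIMEOUT_STARTUP_GRACE_SECONDS is hypothetical, and the sketch assumes the project's existing .env loading has already populated the environment. The matching .env-template entry would be a line such as JOB_TIMEOUT_STARTUP_GRACE_SECONDS=60.

```python
import os
from typing import Mapping

# Hypothetical variable name; the project may prefer a different one.
GRACE_ENV_VAR = "JOB_TIMEOUT_STARTUP_GRACE_SECONDS"
DEFAULT_GRACE_SECONDS = 60.0


def load_grace_seconds(env: Mapping[str, str] = os.environ) -> float:
    """Read the startup grace period in seconds, falling back to the default."""
    raw = env.get(GRACE_ENV_VAR)
    if raw is None or raw.strip() == "":
        return DEFAULT_GRACE_SECONDS
    value = float(raw)  # raises ValueError on non-numeric input
    if value < 0:
        raise ValueError(f"{GRACE_ENV_VAR} must be non-negative, got {value}")
    return value
```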
Where to add to orchestrator:
main.py:main should initialize this component together with the other components, but should not yet START it.
main.py:Orchestrator.init should expect this component as a constructor argument.
main.py:Orchestrator.start should start this component.
main.py:Orchestrator.stop should stop this component.
Relevant functions:
postgres_interface.py:PostgresInterface.get_all_jobs may be used to retrieve all jobs.
postgres_interface.py:PostgresInterface.delete_job may be used to delete a single job.
main.py:Orchestrator.job_cancellation_handler: this function should be split in two. job_cancellation_handler should continue to accept a JobCancel message and call the new function cancel_job. The new function cancel_job should accept a job_id: uuid.UUID.
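The split described for job_cancellation_handler could look roughly like the sketch below. The JobCancel class is a stand-in, and it is an assumption that the real message exposes the job's ID as a job_id attribute. The real methods may also be async. The cancelled list is only a placeholder for the existing cancellation logic.

```python
import uuid
from dataclasses import dataclass
from typing import List


@dataclass
class JobCancel:  # stand-in for the real message type
    job_id: uuid.UUID


class Orchestrator:  # only the two relevant methods are sketched
    def __init__(self) -> None:
        # Placeholder for the real side effects of cancelling a job.
        self.cancelled: List[uuid.UUID] = []

    def job_cancellation_handler(self, message: JobCancel) -> None:
        """Unpack the message and delegate to cancel_job."""
        self.cancel_job(message.job_id)

    def cancel_job(self, job_id: uuid.UUID) -> None:
        """Hold the logic that previously lived in job_cancellation_handler."""
        self.cancelled.append(job_id)
```

With this split, the timeout component can call cancel_job(job_id) directly without constructing a JobCancel message.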