Project-OMOTES / orchestrator

GNU General Public License v3.0

Job in Orchestrator which can timeout a running workflow #39

Closed lfse-slafleur closed 3 months ago

lfse-slafleur commented 4 months ago

Each job received by the orchestrator contains a timeout (db_models/job.py:JobDB.timeout_after_ms). We consider a job to have timed out if datetime.now() > JobDB.running_at + timedelta(milliseconds=JobDB.timeout_after_ms). Jobs may be in various states, so only jobs where JobDB.status == JobStatus.RUNNING are considered.
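A minimal sketch of this check, in case it helps pin down the semantics. The `JobStatus` enum below is a stand-in with only the members needed here (the real one lives in the orchestrator), and `is_timed_out` is a hypothetical helper name. `now` is passed in rather than read from `datetime.now()` so the check is easy to test.

```python
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class JobStatus(Enum):
    # Stand-in for the orchestrator's JobStatus; values are assumptions.
    REGISTERED = "registered"
    RUNNING = "running"


def is_timed_out(
    status: JobStatus,
    running_at: Optional[datetime],
    timeout_after_ms: int,
    now: datetime,
) -> bool:
    """Return True only for RUNNING jobs whose timeout has passed."""
    # Jobs that are not running (or have no start time yet) never time out.
    if status != JobStatus.RUNNING or running_at is None:
        return False
    # Note the keyword is `milliseconds`; timedelta has no `ms` argument.
    return now > running_at + timedelta(milliseconds=timeout_after_ms)
```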

Once the orchestrator and this component have started, the component should only start timing out workflows AFTER the orchestrator has been online for 60 seconds. Otherwise, jobs that completed while the orchestrator was offline could be timed out immediately upon startup.
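One way to express the startup delay, as a rough sketch only. `StartupBuffer` and its parameter names are hypothetical and not existing orchestrator code. The idea is to record when the orchestrator came online and refuse to time anything out until the buffer has passed.

```python
from datetime import datetime, timedelta


class StartupBuffer:
    """Tracks whether the orchestrator has been online long enough."""

    def __init__(self, started_at: datetime, buffer_sec: float = 60.0) -> None:
        # Timeouts may only be enforced from this moment onwards.
        self._active_from = started_at + timedelta(seconds=buffer_sec)

    def has_elapsed(self, now: datetime) -> bool:
        # True once the orchestrator has been online for the full buffer.
        return now >= self._active_from
```

During the buffer, results for jobs that finished while the orchestrator was offline have a chance to be processed, so those jobs are no longer RUNNING by the time the first timeout check happens.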

This component should accept the PostgresInterface and Orchestrator as constructor arguments. The component MUST be isolated from other components in the orchestrator.
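A possible shape for the component, sketched under assumptions. The two `Protocol` classes stand in for whatever PostgresInterface and Orchestrator actually expose; `get_running_jobs`, `cancel_job`, `RunningJob` and `TimeoutJobHandler` are hypothetical names, not the real API. Depending only on these two narrow interfaces is one way to keep the component isolated from the rest of the orchestrator.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Protocol


@dataclass
class RunningJob:
    # Hypothetical view of a JobDB row whose status is RUNNING.
    job_id: str
    running_at: datetime
    timeout_after_ms: int


class JobStore(Protocol):
    # Stand-in for the part of PostgresInterface this component needs.
    def get_running_jobs(self) -> List[RunningJob]: ...


class JobCanceller(Protocol):
    # Stand-in for the part of Orchestrator this component needs.
    def cancel_job(self, job_id: str) -> None: ...


class TimeoutJobHandler:
    def __init__(
        self,
        postgres_if: JobStore,
        orchestrator: JobCanceller,
        started_at: datetime,
        start_buffer_sec: float = 60.0,
    ) -> None:
        self._postgres_if = postgres_if
        self._orchestrator = orchestrator
        # No job is timed out before this moment.
        self._active_from = started_at + timedelta(seconds=start_buffer_sec)

    def check_once(self, now: datetime) -> List[str]:
        """Cancel every RUNNING job past its timeout; return their IDs."""
        if now < self._active_from:
            return []
        timed_out: List[str] = []
        for job in self._postgres_if.get_running_jobs():
            deadline = job.running_at + timedelta(milliseconds=job.timeout_after_ms)
            if now > deadline:
                self._orchestrator.cancel_job(job.job_id)
                timed_out.append(job.job_id)
        return timed_out
```

`check_once` would be called periodically, for example from a background thread; how that loop is scheduled is left open here.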

Please make sure to put any constants in logical places and make the 60-second timeframe configurable. We use .env files as configuration files, so please extend .env-template with any new configuration parameters.
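For the configuration side, a small sketch of reading the buffer from the environment with a 60-second default. The variable name `TIMEOUT_JOB_HANDLER_START_BUFFER_SEC` is a placeholder chosen for this example, and it assumes the .env file has already been loaded into the process environment.

```python
import os
from typing import Mapping

# Hypothetical variable name; the matching .env-template line would be:
#   TIMEOUT_JOB_HANDLER_START_BUFFER_SEC=60
START_BUFFER_ENV_VAR = "TIMEOUT_JOB_HANDLER_START_BUFFER_SEC"
DEFAULT_START_BUFFER_SEC = 60


def load_start_buffer_sec(env: Mapping[str, str] = os.environ) -> int:
    """Read the startup buffer in seconds, falling back to the default."""
    raw = env.get(START_BUFFER_ENV_VAR)
    if raw is None or raw.strip() == "":
        return DEFAULT_START_BUFFER_SEC
    value = int(raw)  # raises ValueError on non-numeric input
    if value < 0:
        raise ValueError(f"{START_BUFFER_ENV_VAR} must be >= 0, got {value}")
    return value
```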

Where to add to orchestrator:

Relevant functions:

cwang39403 commented 3 months ago

Closed after #50 was merged.