kagkarlsson / db-scheduler

Persistent cluster-friendly scheduler for Java
Apache License 2.0

Adding support for task priority #181

Open FreCap opened 3 years ago

FreCap commented 3 years ago

Currently all tasks follow a FIFO policy (more precisely, they run in order of their scheduled time).

In a number of use cases, where the queue is larger than the available processing power, we might want to pull higher-priority tasks first (at the cost of starving lower-priority ones).

This would not be a breaking change, since by default all tasks would have the same priority and hence the ones with the earliest executionTime would always run first.

Can this feature be of any interest to the broader community?

kagkarlsson commented 3 years ago

For high-throughput use cases I can see this being useful, for example to keep a large number of one-time tasks from preventing recurring tasks from running. Recurring tasks would probably have a higher priority by default.

Currently I think a workaround would be to run multiple Scheduler instances, backed by different tables.

FreCap commented 3 years ago

This can also be helpful in low-throughput, long-queue situations.

Ideally we would have something that is not a workaround, since there could be many queues, and we would then have to decide which threads get priority over others.

To achieve this we should expose a method like runAnyDueExecutions with a few changes: it should loop over getDue and PickAndExecute until it reaches pollingLimit (or another variable such as nTaskRunLimit). In any case, this is a method I would love to expose in the scheduler (runDueExecutions(int nTaskRunLimit)); please let me know if that is OK.
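A minimal sketch of the loop described above, to make the proposal concrete. Everything here is hypothetical: `DueSource` and `InMemorySource` are stand-ins written for this example, not db-scheduler's actual repository or polling code, and the "pick" step is simulated by removing tasks from an in-memory queue.

```java
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class RunDueExecutionsSketch {

    /** Hypothetical stand-in for the task repository; not db-scheduler's real interface. */
    interface DueSource {
        /** Returns (and hands over) at most 'limit' tasks that are due at 'now'. */
        List<Runnable> getDue(Instant now, int limit);
    }

    /** In-memory source for the example: every queued task counts as due. */
    static final class InMemorySource implements DueSource {
        final Deque<Runnable> queue = new ArrayDeque<>();

        @Override
        public List<Runnable> getDue(Instant now, int limit) {
            List<Runnable> batch = new ArrayList<>();
            while (batch.size() < limit && !queue.isEmpty()) {
                batch.add(queue.poll());
            }
            return batch;
        }
    }

    /**
     * Fetches due tasks in batches of at most pollingLimit and runs them
     * until nTaskRunLimit tasks have run or nothing is due anymore.
     * Returns the number of tasks executed.
     */
    static int runDueExecutions(DueSource source, Instant now, int pollingLimit, int nTaskRunLimit) {
        int executed = 0;
        while (executed < nTaskRunLimit) {
            int batchSize = Math.min(pollingLimit, nTaskRunLimit - executed);
            List<Runnable> batch = source.getDue(now, batchSize);
            if (batch.isEmpty()) {
                break; // nothing left that is due
            }
            for (Runnable task : batch) {
                task.run();
                executed++;
            }
        }
        return executed;
    }

    public static void main(String[] args) {
        InMemorySource source = new InMemorySource();
        for (int i = 1; i <= 5; i++) {
            int id = i;
            source.queue.add(() -> System.out.println("ran task " + id));
        }
        int ran = runDueExecutions(source, Instant.now(), 2, 3);
        System.out.println(ran + " executed, " + source.queue.size() + " still queued");
    }
}
```

With five queued tasks, a pollingLimit of 2 and an nTaskRunLimit of 3, the call runs three tasks in two batches and leaves two in the queue.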

@kagkarlsson do you think the priority queue change could be merged back in? Are there specific design considerations you would like to take into account?

kagkarlsson commented 3 years ago

For adding support for task priority to the scheduler, wouldn't the best solution be to add a priority field to the database table, and have the getDue query fetch executions ordered by priority desc, execution_time asc? (Though this will be a bit tricky to index well, I think.)
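To illustrate the intended ordering, here is a small in-memory sketch of "priority desc, execution_time asc" written as a Java comparator. The `DueExecution` class and its `priority` field are assumptions made for this example; they only mirror the proposed column and are not part of the existing schema or API.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class PriorityOrderSketch {

    /** Hypothetical stand-in for a row in the scheduler table; 'priority' is the proposed new field. */
    static final class DueExecution {
        final String taskName;
        final int priority;
        final Instant executionTime;

        DueExecution(String taskName, int priority, Instant executionTime) {
            this.taskName = taskName;
            this.priority = priority;
            this.executionTime = executionTime;
        }

        int priority() { return priority; }
        Instant executionTime() { return executionTime; }
    }

    /** Mirrors "order by priority desc, execution_time asc". */
    static final Comparator<DueExecution> DUE_ORDER =
        Comparator.<DueExecution>comparingInt(DueExecution::priority).reversed()
            .thenComparing(DueExecution::executionTime);

    /** Returns task names in the order they would be picked. */
    static List<String> pickOrder(List<DueExecution> due) {
        List<DueExecution> sorted = new ArrayList<>(due);
        sorted.sort(DUE_ORDER);
        return sorted.stream().map(e -> e.taskName).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2021-01-01T00:00:00Z");
        List<DueExecution> due = List.of(
            new DueExecution("old-low", 0, t0),
            new DueExecution("new-high", 10, t0.plusSeconds(60)),
            new DueExecution("old-high", 10, t0.plusSeconds(30)),
            new DueExecution("new-low", 0, t0.plusSeconds(90)));
        // prints [old-high, new-high, old-low, new-low]
        System.out.println(pickOrder(due));
    }
}
```

When all executions share the same priority, the comparator falls back to execution time alone, which matches the current behaviour described earlier in the thread.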

> This is a method that in any case I would love to expose in the scheduler (runDueExecutions(int nTaskRunLimit)), please let me know if it is ok.

What would such a method do? Are you looking to trigger a check for due executions ad hoc, i.e. outside the normal polling interval?

FreCap commented 3 years ago

> For adding support for task priority to the scheduler, wouldn't the best solution be to add a priority field to the database table, and have the getDue query fetch executions ordered by priority desc, execution_time asc? (Though this will be a bit tricky to index well, I think.)

Perfect

Yes, runDueExecutions(Instant now, int nTaskRunLimit) would be an ad hoc trigger. RunForeverExecutor can have limitations in terms of lifecycle, e.g. if we have multiple types of tasks (with different dataClass) and want to run all tasks of dataClass A before running the ones of dataClass B, we cannot achieve it.
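As a rough sketch of that use case, the example below drains all pending tasks of one dataClass before touching the next. It is purely illustrative: `Task`, `runDueOfType`, and the idea of filtering an ad hoc trigger by dataClass are assumptions for this example, not an existing or agreed API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class PhasedRunSketch {

    /** Hypothetical pending task, tagged with the class of its task data. */
    static final class Task {
        final String name;
        final Class<?> dataClass;

        Task(String name, Class<?> dataClass) {
            this.name = name;
            this.dataClass = dataClass;
        }
    }

    /**
     * Hypothetical ad hoc trigger: "runs" up to nTaskRunLimit pending tasks of the
     * given dataClass, removes them from the pending list, and returns their names.
     */
    static List<String> runDueOfType(List<Task> pending, Class<?> dataClass, int nTaskRunLimit) {
        List<String> ran = new ArrayList<>();
        Iterator<Task> it = pending.iterator();
        while (it.hasNext() && ran.size() < nTaskRunLimit) {
            Task task = it.next();
            if (task.dataClass == dataClass) {
                it.remove();
                ran.add(task.name);
            }
        }
        return ran;
    }

    public static void main(String[] args) {
        // String plays the role of dataClass A, Integer the role of dataClass B.
        List<Task> pending = new ArrayList<>(List.of(
            new Task("a1", String.class),
            new Task("b1", Integer.class),
            new Task("a2", String.class),
            new Task("b2", Integer.class)));

        List<String> order = new ArrayList<>();
        order.addAll(runDueOfType(pending, String.class, 100));  // phase 1: all of A
        order.addAll(runDueOfType(pending, Integer.class, 100)); // phase 2: all of B
        // prints [a1, a2, b1, b2]
        System.out.println(order);
    }
}
```

The caller controls the phases explicitly here, which is the kind of lifecycle control that a continuously polling executor does not offer.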