temporalio / temporal

Temporal service
https://docs.temporal.io
MIT License

Provide priority task queues #1507

Open oskwazir opened 3 years ago

oskwazir commented 3 years ago

Is your feature request related to a problem? Please describe.

Currently there is no way to assign a priority to a task while also ensuring fairness in how workers execute tasks.

Quote from sjansen on community.temporal.io

Imagine, for example, a parent workflow that starts an expensive child workflow for each user in the account. If multiple accounts start the parent workflow near the same time, an account with a much larger number of users could monopolize all available workers by simply being first to queue up a large number of activities associated with each user. When there’s no contention, it’s desirable for an account to be able to use 100% of available workers but as soon as there’s contention between accounts it’s desirable to attempt fair scheduling in order to keep latency proportional to account or workflow size.

Obviously it would be possible to partition capacity by creating separate queues for each account, but the result is either potentially significant idle capacity or latency waiting for workers to scale out.

Describe the solution you'd like

Be able to assign a priority to a task, so that the task queue is always ordered with the highest priority first. Fair scheduling of tasks is important too, so that high-priority work doesn't consume all workers, leaving lower-priority work queued.

Describe alternatives you've considered

Multiple task queues, which don't necessarily solve this problem because they allow for idle workers or latency in scaling up workers.

Additional context

This feature request stems from a question & discussion on the community website: https://community.temporal.io/t/rate-limiting-based-on-metadata/385/
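To make the fairness scenario above concrete, here is a minimal sketch of work-conserving round-robin dispatch across accounts. It is plain Python and does not use any Temporal API; `FairQueue`, `put`, `get`, and `drain` are hypothetical names chosen for illustration, and this is only one possible reading of "fair scheduling", not a proposed design.

```python
from collections import OrderedDict, deque


class FairQueue:
    """Round-robin across accounts (hypothetical illustration, not a Temporal API)."""

    def __init__(self):
        # account -> FIFO of pending tasks; dict order drives the rotation
        self._pending = OrderedDict()

    def put(self, account, task):
        self._pending.setdefault(account, deque()).append(task)

    def get(self):
        """Return (account, task) for the next account in rotation, or None if empty."""
        if not self._pending:
            return None
        account, tasks = next(iter(self._pending.items()))
        task = tasks.popleft()
        if tasks:
            # Move the account to the back so other accounts get a turn first.
            self._pending.move_to_end(account)
        else:
            # No work left for this account; drop it from the rotation.
            del self._pending[account]
        return account, task


def drain(queue):
    """Pull tasks until the queue is empty and return them in dispatch order."""
    order = []
    while (item := queue.get()) is not None:
        order.append(item)
    return order
```

With this sketch, an account that queues many tasks first still alternates with a smaller account while both have work pending, and it gets every dispatch slot once the smaller account is done, so no capacity is left idle.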

wxing1292 commented 3 years ago

BTW, first there must be consensus on the design

the feature request will require the following:

SDK: allow workflow logic providing priority of workflow / activity

Server:

rainfd commented 2 years ago

Are there any plans to support priority taskqueue or workflow?

PenguinToast commented 2 years ago

Roughly how much work would this be to implement? We're pretty interested in having this sort of functionality and would be happy to help out with implementation here.

medihack commented 2 years ago

Also super interested. We would like to replace a Celery setup (especially Canvas workflows) with Temporal, but we make heavy use of RabbitMQ Message Priorities, which are not available in Temporal.

amajedi-a commented 1 year ago

I'm exploring a similar concept. Something I'm curious about: how would being able to assign task priority unlock fairness in execution?

dnr commented 1 year ago

Priorities and fairness are definitely features we're looking at as we evolve the matching portion of Temporal. I think it's fair to say that it will be a lot of work to develop and productionize, and it's going to involve the internals in the matching service, so it might be difficult for outside developers to work on. At this point, the most helpful thing might be to describe your requirements in some more detail, so we can keep them in mind as we do the design (since "priority" and "fairness" sometimes mean slightly different things to different people.) E.g. how do you want to specify priorities, with what granularity, do you need reserved capacity for different priorities?

A few more notes:

medihack commented 1 year ago

In our case, we are looking for a replacement for Celery which itself uses RabbitMQ queue priorities. If there is for example only one worker it will always take the task with the highest priority (even if that task is later in the queue than others with lower priority).
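The behavior described above (a worker always takes the highest-priority task, even if it was enqueued later) can be sketched with a heap. This is a minimal illustration in plain Python, not Celery, RabbitMQ, or Temporal code; `PriorityTaskQueue` is a hypothetical name, and the sketch assumes that a larger number means a higher priority and that tasks of equal priority are served in FIFO order.

```python
import heapq
import itertools


class PriorityTaskQueue:
    """Highest priority first, FIFO among equal priorities (hypothetical sketch)."""

    def __init__(self):
        self._heap = []
        # Monotonic counter used as a tie-breaker so equal priorities stay FIFO.
        self._seq = itertools.count()

    def put(self, task, priority=0):
        # heapq is a min-heap, so the priority is negated to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._seq), task))

    def get(self):
        """Return the next task, or None if the queue is empty."""
        if not self._heap:
            return None
        _, _, task = heapq.heappop(self._heap)
        return task
```

A single worker calling `get()` in a loop would pick up a late-arriving priority 9 task before any earlier priority 1 tasks that are still waiting.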

zedongh commented 1 year ago

Is there any work in progress?

medihack commented 1 year ago

Is it completed? Where can I find more information?

dnr commented 1 year ago

Sorry, the close/open was a github issues mishap. This is still planned but no ETA yet.

tarampampam commented 1 year ago

Any update on this? Lack of prioritization is not infrequently a reason for abandoning Temporal in my practice...

charlesmelby commented 1 year ago

Dealing with large work spikes from individual clients is common in our systems. This feature would greatly simplify management and help make a stronger case for adopting Temporal.

Multiply commented 1 year ago

Quote from charlesmelby

Dealing with large work spikes from individual clients is common in our systems. This feature would greatly simplify management and help make a stronger case for adopting Temporal.

Can't you run multiple task queues, and manage it by swapping which ones you want to handle?

gmintoco commented 5 months ago

this would be very useful for us too :) primarily the priority queue idea

darkms commented 1 week ago

Having priorities and fairness would definitely help to increase Temporal adoption in our company.

Our use case for priority is that we have the same operations used for real-time use cases, such as when a user clicks a button in the UI and expects an immediate answer, and reused for asynchronous batch processing at a much larger scale (a single batch can keep all workers fully utilized for minutes to hours). Ideally we'd want to reuse all the same infrastructure, avoid over-provisioning and idle resources, and just have Temporal workers execute items with higher priority before going to lower-priority ones (i.e. have the worker activity queue be sorted by priority). Currently we work around this by slightly over-provisioning our infra and reserving some capacity for real-time use cases while limiting the activity task concurrency for workers. Sadly, the reserved capacity is not always enough to process a sudden influx of real-time requests, and when there aren't any real-time requests it sits idle, leading to inefficient utilization.

Our use case for fairness is to allocate available service capacity equally to each group (user/tenant/other grouping dimension), while still allowing even a single group to consume all of the available capacity if nobody else is using the service at the moment. So far we haven't found a way to achieve this with Temporal. We're still using our bespoke external scheduler component with a centralized database that schedules items one by one, up to a pre-configured number of concurrently running jobs. I don't really like it much because it's so hard to reuse across services. We would absolutely love it if Temporal offered a similar feature out of the box. Maybe it could take the form of a job scheduler configuration where we could define a dynamic priority formula with the ability to look up the number of workflows/activities with the same workflow/activity attributes already being executed (at the time of scheduling).
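As a rough illustration of the kind of dynamic priority formula described above, the sketch below starts the next job from whichever group currently has the fewest running jobs, up to a fixed concurrency limit. It is plain Python with no Temporal API involved; `GroupFairScheduler`, `submit`, `start_next`, and `finish` are hypothetical names, and "fewest running jobs first" is just one assumed formula among many possible ones.

```python
from collections import Counter, deque


class GroupFairScheduler:
    """Start jobs from the least-loaded group first (hypothetical sketch)."""

    def __init__(self, max_concurrent):
        self.max_concurrent = max_concurrent
        self.running = Counter()  # group -> number of jobs currently executing
        self.waiting = {}         # group -> FIFO of queued jobs

    def submit(self, group, job):
        self.waiting.setdefault(group, deque()).append(job)

    def start_next(self):
        """Start one job and return (group, job), or None if nothing can start."""
        if sum(self.running.values()) >= self.max_concurrent:
            return None  # all slots are busy
        candidates = [g for g, q in self.waiting.items() if q]
        if not candidates:
            return None  # nothing is waiting
        # Dynamic priority: the group with the fewest running jobs goes first.
        # Ties go to the group that submitted work earliest.
        group = min(candidates, key=lambda g: self.running[g])
        job = self.waiting[group].popleft()
        self.running[group] += 1
        return group, job

    def finish(self, group):
        """Record that one running job of the group has completed."""
        self.running[group] -= 1
```

When two groups both have work waiting, the slots are split between them. When only one group has work, it can take every slot, which matches the "use all capacity if nobody else is using the service" requirement.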

atihkin commented 2 days ago

Thanks @darkms, that's really good input.

The team is actively looking into this feature so we welcome feedback on use cases.