PrefectHQ / prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
https://prefect.io
Apache License 2.0
17.48k stars 1.64k forks

Check concurrency limit before submitting tasks to TaskRunner #8539

Open j-tr opened 1 year ago

j-tr commented 1 year ago

Prefect Version

2.x

Describe the current behavior

Concurrency limits are checked after the task is submitted to the TaskRunner. For distributed task runners like Ray, this causes upscaling based on the total number of submitted tasks rather than the number of tasks that are runnable under the concurrency limits. For flows with a large number of tasks but low concurrency limits in particular, this causes massive overprovisioning of compute resources.
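To make the scale of the problem concrete, here is a rough back-of-the-envelope model. It assumes an autoscaler that requests one worker slot per submitted task; the function name and the numbers are illustrative only and are not taken from Prefect or Ray.

```python
def requested_workers(total_tasks: int, concurrency_limit: int, gate_before_submit: bool) -> int:
    """Worker slots an autoscaler would see as demanded (simplified model).

    Assumption: the autoscaler scales on the number of submitted tasks,
    one slot per task.
    """
    if gate_before_submit:
        # Only tasks that can actually run are ever submitted.
        return min(total_tasks, concurrency_limit)
    # Every task is submitted up front and waits inside the runner.
    return total_tasks


# Hypothetical flow: 10,000 tasks, concurrency limit of 10.
current = requested_workers(10_000, 10, gate_before_submit=False)
proposed = requested_workers(10_000, 10, gate_before_submit=True)
print(current, proposed)
```

Under these assumptions the runner is asked for 10,000 slots today, while only 10 tasks can make progress at any time.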

Describe the proposed behavior

Only submit tasks to the task runner when free concurrency slots are available, which would lead to upscaling based on the concurrency limits and not the total number of tasks.

Example Use

No response

Additional context

No response

zanieb commented 1 year ago

Unfortunately, this is a bit difficult, as scaling to large numbers of tasks requires that orchestration of the tasks occur in a distributed manner.

zanieb commented 1 year ago

@j-tr would a client-side setting that controls the number of concurrently submitting task runs suffice for you?
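For illustration, a client-side cap of this kind could look roughly like the sketch below. It uses plain `asyncio` rather than any Prefect API; `run_with_submission_cap`, `fake_task`, and `max_submitting` are hypothetical names. Note that such a cap applies per process only and is not a global limit.

```python
import asyncio


async def run_with_submission_cap(factories, max_submitting):
    """Run coroutine factories with at most `max_submitting` in flight."""
    sem = asyncio.Semaphore(max_submitting)
    in_flight = 0
    peak = 0

    async def guarded(factory):
        nonlocal in_flight, peak
        async with sem:  # wait here until a local slot is free
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await factory()
            finally:
                in_flight -= 1

    # gather() returns results in the order the factories were given.
    results = await asyncio.gather(*(guarded(f) for f in factories))
    return results, peak


async def fake_task(i):
    # Stand-in for real work, e.g. a database query.
    await asyncio.sleep(0.01)
    return i * 2


factories = [lambda i=i: fake_task(i) for i in range(20)]
results, peak = asyncio.run(run_with_submission_cap(factories, max_submitting=3))
print(peak, results[:5])
```

The semaphore keeps the number of in-flight tasks at or below the cap, so a runner that scales on submitted work would see at most `max_submitting` tasks from this process.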

j-tr commented 1 year ago

@madkinsz I'm afraid that a purely client-side approach would probably not help very much. We already considered application-level workarounds that only submit a certain number of tasks at a time, but since our main objective is to protect a database and we have no control over the number of concurrently running flows, we need some sort of global limit. However, a read-only solution that reads the number of available concurrency slots from the server and only submits a corresponding number of tasks would already reduce the overprovisioning somewhat while keeping most of the orchestration logic distributed.
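A minimal sketch of that read-only idea, assuming a server that exposes the number of free slots: `FakeServer`, `available_slots`, and `submit_by_available_slots` are hypothetical stand-ins, not Prefect's API, and a thread pool plays the role of the task runner. The limit itself is still enforced inside each task, so orchestration stays distributed; the client only uses the read to decide how many tasks to submit.

```python
import threading
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


class FakeServer:
    """Stand-in for the orchestration server holding a global limit."""

    def __init__(self, limit):
        self.limit = limit
        self.peak_active = 0
        self._active = 0
        self._lock = threading.Lock()

    def available_slots(self):
        # Read-only view used by the submitting client.
        with self._lock:
            return self.limit - self._active

    def try_acquire(self):
        with self._lock:
            if self._active >= self.limit:
                return False
            self._active += 1
            self.peak_active = max(self.peak_active, self._active)
            return True

    def release(self):
        with self._lock:
            self._active -= 1


def run_task(server, item):
    # Enforcement still happens here, after submission.
    while not server.try_acquire():
        time.sleep(0.001)
    try:
        time.sleep(0.005)  # stand-in for real work
        return item * 2
    finally:
        server.release()


def submit_by_available_slots(server, items, executor, poll_interval=0.001):
    queue = list(items)
    pending = {}  # future -> item
    results = {}
    while queue or pending:
        # The read may be stale, so it can over-report; it is only a hint.
        free = server.available_slots()
        for _ in range(min(free, len(queue))):
            item = queue.pop(0)
            pending[executor.submit(run_task, server, item)] = item
        if pending:
            done, _ = wait(list(pending), return_when=FIRST_COMPLETED)
            for fut in done:
                results[pending.pop(fut)] = fut.result()
        else:
            time.sleep(poll_interval)  # slots held elsewhere; poll again
    return results


server = FakeServer(limit=3)
with ThreadPoolExecutor(max_workers=8) as executor:
    results = submit_by_available_slots(server, range(10), executor)
print(len(results), server.peak_active)
```

Because the slot count is read without reserving anything, several clients could still oversubmit at the same moment; the approach narrows the gap between submitted and runnable tasks rather than closing it.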

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 30 days with no activity. To keep this issue open remove stale label or comment.

trahloff commented 1 year ago

Hi @billpalombi, is this capability something that is on the current roadmap?