PrefectHQ / prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
https://prefect.io
Apache License 2.0
15.95k stars 1.57k forks source link

Move tag-based concurrency management into clients #14382

Closed abrookins closed 2 months ago

abrookins commented 3 months ago

Move tag-based concurrency handling client-side, implemented with global concurrency limits. This fixes #14360 and forms part of our larger effort to move all elements of task orchestration client-side.

Limitations and future work:

Example

This PR changes tag-based task concurrency to use global concurrency limits. When a global concurrency limit exists whose name matches a tag in your task, we will apply that limit to the task when it runs. If you want to create a limit to match a tag on your task, you should create a global concurrency limit, not a task run concurrency limit. Future work will likely consolidate these concepts.

Checklist

zhen0 commented 2 months ago

@abrookins - doing a bit of maintenance as we have a lot of potentially stale PRs. Is this one that needs action? Or can it be closed?

abrookins commented 2 months ago

Still working on this one! 👍

abrookins commented 2 months ago

@zangell44 Good questions! We may need to expand the concurrency v2 API with more functionality. I'm thinking about this now. I didn't understand question #4.

codspeed-hq[bot] commented 2 months ago

CodSpeed Performance Report

Merging #14382 will not alter performance

Comparing global-concurrency-tags (69e8665) with main (0d23f58)

Summary

✅ 5 untouched benchmarks

zangell44 commented 2 months ago

I think the create_if_missing kwarg + functionality resolves questions 2 and 4. I do think 1 and 3 are still worthy of consideration.

1.) The concurrency v2 api does not track which object took a given slot. If the client crashes mid-run, we have no way of recovering a slot automatically. 3.) 3.x and 2.x task run limits will not be compatible with one another

3 may not have a solution outside of documenting the behavior.

abrookins commented 2 months ago

@zangell44 For 1), I think global concurrency limits should be able to tell you who or what is using them. I plan to add an API endpoint in a follow-up PR that looks at limit acquired and limit released events within a time window to flow or task runs currently using the limit.

For 4), the story should be a little simpler now that client-side concurrency limits will ship with client-side orchestration. That allows us to dump the version-checking code server-side because clients will only be using this new concurrency approach when they use client-side orchestration.

However, the fact remains that if you use client-side orchestration with a task whose tags you had previously created limits for, you would currently need to recreate the limits as global concurrency limits. I haven't spent much time thinking through how to smooth this for users.