dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0

Improved concurrency management with pool slots equivalent #12299

Open erdrix opened 1 year ago

erdrix commented 1 year ago

What's the use case?

We are currently on Airflow, and one of the cool things about managing job concurrency is being able to set up multiple pool slots to a task.

For example, we use [Redshift WLM](https://docs.aws.amazon.com/redshift/latest/dg/c_workload_mngmt_classification.html) to manage concurrency in Redshift. To mirror this in Airflow, we have an Airflow pool to manage access to this WLM queue.

In some cases, we know that certain SQL queries need more resources than others. So when those queries are executed, we want fewer queries running at the same time, allocating more WLM resources to them!

This is where pool slots are really useful: we can declare that a given task takes 3 slots in the pool instead of 1, which reduces concurrency and therefore gives more resources to that SQL query!
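The mechanic described above (a task claiming several slots from a shared pool to reduce how many tasks can run alongside it) can be sketched with a small weighted-slot pool. This is a minimal illustration of the idea, not Airflow's or Dagster's implementation; the `SlotPool` class and its method names are hypothetical.

```python
import threading

class SlotPool:
    """Minimal sketch of a slot pool where each task may claim
    multiple slots (hypothetical stand-in for Airflow's pool_slots)."""

    def __init__(self, total_slots: int):
        self.total_slots = total_slots
        self.available = total_slots
        self._cond = threading.Condition()

    def acquire(self, slots: int = 1) -> None:
        # Block until the requested number of slots is free.
        with self._cond:
            while self.available < slots:
                self._cond.wait()
            self.available -= slots

    def release(self, slots: int = 1) -> None:
        with self._cond:
            self.available += slots
            self._cond.notify_all()

pool = SlotPool(total_slots=4)

# A heavy query claims 3 of the 4 slots, so at most one
# light (1-slot) query can run alongside it.
pool.acquire(slots=3)   # heavy query starts
pool.acquire(slots=1)   # one light query still fits
# a further pool.acquire(slots=1) would block here until a release
pool.release(slots=3)
pool.release(slots=1)
```

With a plain (unweighted) pool, the heavy query would occupy a single slot and three light queries could still run next to it; the slot weight is what lets one task deliberately shrink the remaining concurrency.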

Ideas of implementation

Some inspiration from Airflow? Or not?

Additional information

No response

Message from the maintainers

Impacted by this issue? Give it a 👍! We factor engagement into prioritization.

alangenfeld commented 1 year ago

Effectively a dupe of https://github.com/dagster-io/dagster/issues/12470 which has more thumbs up so closing this in favor of that.

AntonMaxen commented 3 months ago

> Effectively a dupe of #12470 which has more thumbs up so closing this in favor of that.

Is it really a dupe, though? I have checked both issues, and they are asking for different things. This one is about being able to define how many slots of a pool an op will allocate, whereas the other one is about a global concurrency limit across runs using tags, without any weighting.

I have a use case where a heavy op should be the only op running, but we still want to be able to run multiple smaller ops concurrently that use fewer resources when the heavier op isn't queued.