[X] I have searched the existing issues, and I could not find an existing issue for this feature
[X] I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion
Describe the feature
Problem Statement
dbt runs models in DAG order, which is functionally correct. But there are situations^1 where it would be helpful to have more control over the relative execution order of models within a run. For example, in a run that includes a long-running model with no upstream dependencies but many downstream dependencies, it would be helpful to start the long-running model first to minimize total run time.
Proposed Solution
A new execution_order configuration which allows you to specify the relative execution order of selected resources.
At runtime, dbt would:
1. Determine the set of resources whose dependencies are satisfied (aka "run in DAG order")
2. Within that set, run the resources ordered by execution_order (nulls last), falling back to whatever is the current ordering logic
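To make the two steps above concrete, here is a rough Python simulation of the proposed behavior. It is only a sketch, not dbt's actual scheduler: `simulate_run`, the model names, the durations, and the thread count are all hypothetical, and the alphabetical fallback is an assumption based on the behavior observed in this issue.

```python
import heapq
from typing import Dict, List, Set, Tuple


def simulate_run(
    durations: Dict[str, float],
    parents: Dict[str, Set[str]],
    execution_order: Dict[str, int],
    threads: int,
) -> Tuple[List[str], float]:
    """Simulate a run and return (start order, total run time)."""

    def priority(name: str) -> Tuple[bool, int, str]:
        order = execution_order.get(name)
        # Nulls last, then fall back to alphabetical order
        # (assumed stand-in for "the current ordering logic").
        return (order is None, order if order is not None else 0, name)

    pending: Set[str] = set(durations)
    done: Set[str] = set()
    started: List[str] = []
    running: List[Tuple[float, str]] = []  # min-heap of (finish time, name)
    clock = 0.0

    while pending or running:
        # Step 1: resources whose dependencies are all satisfied.
        ready = sorted(
            (n for n in pending if parents.get(n, set()) <= done),
            key=priority,
        )
        # Step 2: fill the free threads in execution_order priority.
        for name in ready[: threads - len(running)]:
            pending.remove(name)
            started.append(name)
            heapq.heappush(running, (clock + durations[name], name))
        if not running:
            raise ValueError("cycle or unknown dependency in the graph")
        # Advance the clock to the next model that finishes.
        clock, finished = heapq.heappop(running)
        done.add(finished)

    return started, clock


# Hypothetical project: three quick models, one slow one, and a final
# model that depends on all of them.
durations = {"a_small": 1, "b_small": 1, "c_small": 1,
             "long_running_model": 10, "final": 1}
parents = {"final": {"a_small", "b_small", "c_small", "long_running_model"}}

default_start, default_total = simulate_run(durations, parents, {}, threads=2)
tuned_start, tuned_total = simulate_run(
    durations, parents, {"long_running_model": 1}, threads=2
)
print(default_start, default_total)
print(tuned_start, tuned_total)
```

With two threads, the default ordering starts the long-running model only after two of the small models have finished, while giving it an execution_order lets it start immediately and overlap with the small models, which shortens the simulated run.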
Describe alternatives you've considered
Workarounds with which I am familiar:
- dbt seems to run models in alphabetical order, so you could rename the long-running model to have an alphabetically-earlier name
  - ...but it feels fragile to rely on this undocumented behavior
- Add a -- depends_on: {{ ref('long_running_model') }} comment to all other models in the project to force the long-running model to run first
  - ...but the other models don't necessarily depend on this model, so it makes the DAG visualization misleading
Who will this benefit?
Folks with long-running models in the middle of their DAGs
Are you interested in contributing this feature?
No
Anything else?
I realize that giving developers some control over execution order is likely controversial and potentially complicated to implement, but I see this as a useful Advanced Feature™ (a la incremental predicates) for those situations where complex DAG runtime is sub-optimal.