datarootsio / prefect-dbt-flow

prefect integration for running dbt
MIT License

Support for state and defer flags #38

Open jeremy-thomas-roc opened 7 months ago

jeremy-thomas-roc commented 7 months ago

Is there any way to run commands with --state or --defer flags? These are very useful for partial builds.

nicogelders commented 7 months ago

Hey, this is not currently supported, but I can add it as a feature request. Could you provide me with a minimal example so I can implement it? Or you can create a PR yourself.

jeremy-thomas-roc commented 7 months ago

@nicogelders So when a PR is opened in our dbt repo, these are the commands we run in a Prefect flow to ensure that the PR changes do not break any related models:

dbt run --state <path/to/prod/manifest.json> --select state:modified+ --defer
dbt test --state <path/to/prod/manifest.json> --select state:modified+ --defer

It also only builds models that differ from prod, which saves compute and time, since our dbt implementation has 400+ models. When only one model changes, a many-hour build turns into a few minutes.
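For illustration, the two commands above could be assembled and run in order from Python roughly as follows. This is only a sketch: `partial_build_commands` and `run_partial_build` are hypothetical helper names, not part of prefect-dbt-flow, and the runner defaults to `subprocess.call` so it can be swapped out in tests. Note that dbt documents `--state` as taking the directory that holds the artifacts, so `state_path` here stands for wherever the flow downloads the prod manifest.

```python
import subprocess
from typing import Callable, List, Sequence


def partial_build_commands(state_path: str) -> List[List[str]]:
    """Return the `dbt run` and `dbt test` invocations for a partial build."""
    # Flags shared by both commands, taken from the issue text.
    shared = ["--state", state_path, "--select", "state:modified+", "--defer"]
    return [["dbt", "run", *shared], ["dbt", "test", *shared]]


def run_partial_build(
    state_path: str,
    runner: Callable[[Sequence[str]], int] = subprocess.call,
) -> bool:
    """Run each command in order; stop at the first non-zero exit code."""
    for command in partial_build_commands(state_path):
        if runner(command) != 0:
            return False
    return True
```

Stopping after a failed `dbt run` avoids testing models that were never built; whether that is the right behaviour for a given CI setup is a design choice.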

We have a CI/CD process that builds the manifest.json on any push to main and uploads it to cloud storage, and the flow downloads it to be able to perform the partial build.

Documentation on --defer: https://docs.getdbt.com/reference/node-selection/defer. It is similar in use case to clone, but even more efficient in most cases. dbt Cloud uses this for partial builds, afaik.

I presume --state could be implemented in the same manner as select, with the path to manifest.json as a required input parameter, but I have not had time to look into the source code to validate that assumption. Likewise, I'm not sure how a global flag like --defer would be implemented, again only because I haven't had a chance to look through the code.
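One way the idea could look, purely as a sketch: an options object that carries `state` and `defer` alongside `select` and renders them to CLI arguments. The `DbtOptions` class and its field names are assumptions made up for this example; they are not taken from the prefect-dbt-flow source, which may structure its configuration differently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DbtOptions:
    """Hypothetical container for per-invocation dbt options."""

    select: Optional[str] = None  # e.g. "state:modified+"
    state: Optional[str] = None   # path passed to --state
    defer: bool = False           # boolean flag, emitted without a value

    def to_cli_args(self) -> List[str]:
        """Render only the options that were set, in a fixed order."""
        args: List[str] = []
        if self.select is not None:
            args += ["--select", self.select]
        if self.state is not None:
            args += ["--state", self.state]
        if self.defer:
            args.append("--defer")
        return args
```

Here `--state` is handled like `--select` (a flag followed by a value), while `--defer` is a plain boolean switch, which is the distinction raised above.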