[X] I have searched the existing issues, and I could not find an existing issue for this feature
[X] I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion
Describe the feature
Requesting a new `on_schema_change` strategy for incremental models, `full_refresh`. With this strategy, if a schema change is detected, the model runs as if the `--full-refresh` flag had been appended to the `dbt run` command.
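
A minimal sketch of how the requested option might look in a model's config block. The value `full_refresh` is the proposal in this issue, not an existing `on_schema_change` option (the existing ones are `ignore`, `fail`, `append_new_columns`, and `sync_all_columns`). The model, column, and ref names are hypothetical.

```sql
-- models/document_datapoints.sql (hypothetical model name)
-- 'full_refresh' below is the PROPOSED value; dbt does not accept it today.
{{
    config(
        materialized='incremental',
        unique_key='document_id',
        on_schema_change='full_refresh'
    )
}}

select
    document_id,
    revised_at
from {{ ref('latest_revisions') }}  -- hypothetical upstream model
```

Under the proposal, adding or changing a column in this `select` would cause the next `dbt run` to rebuild the table from scratch for this model only, without passing `--full-refresh` on the command line.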
Describe alternatives you've considered
We considered two alternatives. The first is overriding the `should_full_refresh` and `is_incremental` macros in our project, which is hacky and prone to breaking when dbt-core is updated. The second is modifying our CI/CD process to automatically full refresh all modified models, which is not desirable because it would cause unnecessary refreshes of models where this use case doesn't apply.
Who will this benefit?
Anyone using incremental models who doesn't want to manually backfill new or modified columns in a model. In our use case, we're building a model that stores the latest revisions of large JSON data structures (around 4 million records). A downstream incremental model then runs a series of `json_extract` calls against the latest revision of each document to provide meaningful datapoints to our end users. However, new datapoints that may already exist in the JSON are requested all the time. When we add a new extract, we have two options. We can manually trigger a full refresh of that specific downstream model to populate existing rows. Or we can stop using an incremental model and run all extracts on every DAG invocation, which quickly increases our overhead as the number of extracts per run grows.
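
To illustrate the gap, here is a rough sketch of the downstream model using the existing `append_new_columns` strategy. All model and column names are hypothetical, and the exact JSON extraction function depends on the warehouse. With this config, a newly added extract gets a column in the target table, but rows loaded before the change are not backfilled and stay `NULL` until someone runs a full refresh by hand.

```sql
-- models/document_datapoints.sql (hypothetical model name)
{{
    config(
        materialized='incremental',
        unique_key='document_id',
        on_schema_change='append_new_columns'
    )
}}

select
    document_id,
    revised_at,
    json_extract(doc, '$.status') as status,
    -- Newly requested datapoint: the column is appended to the target table,
    -- but only rows processed from now on receive a value.
    json_extract(doc, '$.owner') as owner
from {{ ref('latest_revisions') }}  -- hypothetical upstream model

{% if is_incremental() %}
-- On incremental runs, only process revisions newer than what is already loaded.
where revised_at > (select max(revised_at) from {{ this }})
{% endif %}
```

Today the backfill requires a targeted manual run such as `dbt run --select document_datapoints --full-refresh`. The requested strategy would trigger the equivalent rebuild automatically whenever the schema change is detected.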
Are you interested in contributing this feature?
Unsure if able to, would need to talk to my superiors.
Is this your first time submitting a feature request?
Anything else?
Convo in DBT Slack - https://getdbt.slack.com/archives/C50NEBJGG/p1728498654785809