Resolves #824
Description
Implements the microbatch incremental strategy: https://docs.getdbt.com/docs/build/incremental-microbatch
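For reference, a minimal model config that opts into this strategy might look like the following sketch (model, table, and column names are illustrative; the config keys are from the dbt microbatch docs linked above):

```sql
-- models/sessions.sql (illustrative example)
{{
    config(
        materialized='incremental',
        incremental_strategy='microbatch',
        event_time='event_ts',   -- column dbt uses to slice batches
        batch_size='day',        -- one replace-where statement per day
        begin='2024-01-01',      -- earliest batch for initial/full builds
        lookback=1               -- reprocess one extra batch each run
    )
}}

select * from {{ ref('raw_sessions') }}
```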
The core idea is that dbt determines slices of time to break an insert up into multiple statements; we run a replace-where with those slices so that any old data is replaced by the newest version of that data. This makes it much easier for users to backfill and, on failure, to rerun only the slices that failed.
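Concretely, each batch produces a statement along these lines (a sketch with illustrative table and column names; the exact SQL dbt-databricks renders may differ):

```sql
-- One statement per batch; rows inside the window are atomically replaced
insert into analytics.sessions
replace where
    event_ts >= '2024-01-01 00:00:00' and event_ts < '2024-01-02 00:00:00'
select * from sessions__dbt_tmp
where event_ts >= '2024-01-01 00:00:00' and event_ts < '2024-01-02 00:00:00'
```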
I have to cast the column to TIMESTAMP because, if your event_time column is a DATE, Databricks casts the conditions to DATE and the predicate ends up as `replace where date >= X and date < X`, which matches nothing.
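With the cast applied, the rendered predicate keeps the full timestamp bounds instead of collapsing both sides to the same date (again, a sketch with an illustrative column name):

```sql
-- Without the cast, both bounds can degrade to the same DATE value,
-- leaving `date >= X and date < X`, which matches no rows.
replace where
    cast(event_date as timestamp) >= '2024-01-01 00:00:00'
    and cast(event_date as timestamp) < '2024-01-02 00:00:00'
```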
I also hit an issue with column comments that I believe was introduced in dbt-core 1.9.0b2; that fix is included here as well.
Checklist
- [x] I have run this code in development and it appears to resolve the stated issue
- [x] This PR includes tests, or tests are not required/relevant for this PR
- [x] I have updated the CHANGELOG.md and added information about my change to the "dbt-databricks next" section.