Closed: graciegoheen closed this issue 2 weeks ago
This seems like a decent deal-breaker at the moment - unless I'm facing some other config issue.
To put some numbers in perspective (on Snowflake, on an X-Small warehouse):
Hoping we can get this one prioritized 😊
Is this a new bug in dbt-core?
Current Behavior
The cartesian-join-based deletion causes data to spill to disk, which heavily degrades performance.
The delete statement looks like:
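For reference, a sketch of the shape of that generated statement (the table, column, and timestamp values here are placeholders, not the exact SQL dbt emits):

```sql
delete from my_model as DBT_INTERNAL_TARGET
using my_model__dbt_tmp
where (
    -- only the target's event_time is filtered; the temp table is never
    -- referenced here, so the `using` clause produces a cartesian join
    DBT_INTERNAL_TARGET.event_time >= to_timestamp('2024-01-01 00:00:00')
    and DBT_INTERNAL_TARGET.event_time < to_timestamp('2024-01-02 00:00:00')
);
```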
But we are not doing anything with `my_model__dbt_tmp` in the `where` clause. We can simplify this logic and improve performance by instead doing:
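A sketch of the simplified statement, again with `my_model` and `event_time` as placeholder names:

```sql
-- no `using` clause: the batch's rows are identified purely by the
-- batch's known time range, so no join (cartesian or otherwise) is needed
delete from my_model as DBT_INTERNAL_TARGET
where (
    DBT_INTERNAL_TARGET.event_time >= to_timestamp('2024-01-01 00:00:00')
    and DBT_INTERNAL_TARGET.event_time < to_timestamp('2024-01-02 00:00:00')
);
```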
One advantage of microbatch is that we know in advance the exact boundaries of every batch (a time range, cf. "static" `insert_overwrite`). In a world where we support "microbatch merge" models (i.e. updating batches by upserting on `unique_key`, rather than full batch replacement), we would want to join (`using`) on a `unique_key` match, like so:

But this shouldn't be the default assumption.
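For illustration, such a `unique_key`-based delete might look like this (a hypothetical sketch, assuming an `id` unique key and placeholder table names):

```sql
-- hypothetical "microbatch merge" variant: here the `using` join is
-- meaningful, because rows are matched on unique_key instead of being
-- replaced wholesale by time range
delete from my_model as DBT_INTERNAL_TARGET
using my_model__dbt_tmp as DBT_INTERNAL_SOURCE
where DBT_INTERNAL_TARGET.id = DBT_INTERNAL_SOURCE.id;
```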
Expected Behavior
We should delete the `using` line from the generated statement, so the delete filters only on the batch's event-time range.
Steps To Reproduce
See here.
Relevant log output
No response
Environment
Which database adapter are you using with dbt?
snowflake
Additional Context
No response