dbt-labs / dbt-bigquery

dbt-bigquery contains all of the code required to make dbt operate on a BigQuery database.
https://github.com/dbt-labs/dbt-bigquery
Apache License 2.0
222 stars 157 forks source link

feat(incremental): optimize 'insert_overwrite' strategy (#1409) #1410

Open AxelThevenot opened 1 day ago

AxelThevenot commented 1 day ago

resolves #1409 docs "N/A"

Problem

The MERGE statement is sub-optimized in BigQuery when it comes to only replace partitions in the 'insert_overwrite' strategy for incremental models

Solution

For the insert_overwrite strategy where we are looking to replace rows at the partition-level, there is a better solution and here is why:

This has been tested at Carrefour which is my company.

  • On this replacement of the MERGE statement it reduces the cost by 50.4% and the elapsed time by 35.2% (slot based and not on demand)
  • On the overall procedure it reduces the cost by 26.1% and the elapsed time by 23.1%

This is wrapped in a transaction to avoid deleting rows if any error occurs.

Checklist