Closed matthias-fbi closed 6 months ago
@matthias-fbi thanks for reporting this along with such a detailed write-up 🤩
This looks to me like the same thing reported in https://github.com/dbt-labs/dbt-bigquery/issues/602, so I'm going to close this one as a duplicate to consolidate the discussion there.
Is this a new bug in dbt-core?
Current Behavior
I have a model A that reads from a large table of web events, which receives several thousand new rows per minute. I'm using an incremental config like:
I have another model B that aggregates the data from model A. For business reasons, I need to run this model every 5 minutes for the current day to provide near-real-time KPI tracking.
I run this project with
The target/run_results.json does not contain all executions and their bytes_billed. I get the information for model_A and model_B, but the MERGE statement into/from the __dbt_tmp table is NOT included.
According to the BQ logs, each run should report the following bytes_billed in the run_results.json:
model_A: ~20 megabytes
model_B: ~100 megabytes
MERGE: ~1.29 gigabytes
but I only have:
model_A: ~20 megabytes
model_B: ~100 megabytes
The execution that generates the most bytes_billed is the one missing in the logs:
Expected Behavior
The resulting target/run_results.json file contains all executions and their bytes_billed values: the execution of model_A and model_B, plus everything that the "incremental" config does in the background with the MERGE and the __dbt_tmp table.
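To compare what run_results.json reports with the BQ logs, something like the following minimal sketch could be used. It assumes each entry under `results` carries a `unique_id` and an `adapter_response` object with a `bytes_billed` field, which is how the BigQuery adapter appears to populate the file. The helper names, the `my_project` node IDs, and the sample byte counts are hypothetical and only mirror the approximate figures above.

```python
import json
from pathlib import Path


def bytes_billed_by_node(run_results: dict) -> dict:
    """Map each result's unique_id to the bytes_billed found in its adapter_response.

    Assumption: results[].adapter_response.bytes_billed is present for BigQuery runs;
    entries without it are counted as 0.
    """
    billed = {}
    for result in run_results.get("results", []):
        response = result.get("adapter_response") or {}
        node_id = result.get("unique_id", "<unknown>")
        billed[node_id] = response.get("bytes_billed") or 0
    return billed


def load_run_results(path: str = "target/run_results.json") -> dict:
    """Load the run results artifact written by dbt (default path is an assumption)."""
    return json.loads(Path(path).read_text())


# Hypothetical sample shaped like the reported output: only the two models appear,
# and there is no separate entry for the MERGE into/from the __dbt_tmp table.
sample = {
    "results": [
        {"unique_id": "model.my_project.model_A",
         "adapter_response": {"bytes_billed": 20_000_000}},
        {"unique_id": "model.my_project.model_B",
         "adapter_response": {"bytes_billed": 100_000_000}},
    ]
}

per_node = bytes_billed_by_node(sample)
reported_total = sum(per_node.values())
```

Against a real project, `bytes_billed_by_node(load_run_results())` would give the per-node figures to set against the BQ job logs; any difference between the two totals is the part that run_results.json does not account for.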
Steps To Reproduce
Relevant log output
Environment
Which database adapter are you using with dbt?
bigquery
Additional Context
No response