databricks / dbt-databricks

A dbt adapter for Databricks.
https://databricks.com
Apache License 2.0
226 stars 119 forks

Set unique table suffix to allow parallel incremental executions #803

Closed xenera-huangxingyi closed 1 month ago

xenera-huangxingyi commented 2 months ago

Describe the feature

dbt-athena's unique_tmp_table_suffix table configuration is badly needed in dbt-databricks! (link)

When dbt-databricks creates the temporary view for an incremental model that uses replace_where, it should be able to replace the __dbt_tmp suffix with a unique UUID.

A common scenario is running the same dbt model several times at once with different --vars, so that each run inserts a different data partition into the same table, for example during data recovery. Without this feature, the concurrent dbt runs overwrite each other's temporary view, so some of the runs are effectively 'not executed'.
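To make the collision concrete, here is a minimal Python sketch of the naming problem. It is not the dbt-databricks or dbt-athena implementation: the helper tmp_view_name and the exact shape of the unique suffix are assumptions for illustration only.

```python
import uuid

DEFAULT_SUFFIX = "__dbt_tmp"


def tmp_view_name(model_name: str, unique_suffix: bool = False) -> str:
    """Hypothetical helper: build the temporary view name for a model.

    With unique_suffix=False every run of the same model gets the same
    name, so concurrent runs share (and overwrite) one temporary view.
    With unique_suffix=True the fixed suffix is replaced by a random
    UUID, so each run gets its own temporary view.
    """
    if not unique_suffix:
        return f"{model_name}{DEFAULT_SUFFIX}"
    # uuid4().hex has no hyphens, which keeps the name a plain identifier.
    # The "__<hex>" layout is an assumption, not a documented format.
    return f"{model_name}__{uuid.uuid4().hex}"


# Four concurrent runs of the same model, simulated as four calls.
shared = {tmp_view_name("fct_not_a_model") for _ in range(4)}
unique = {tmp_view_name("fct_not_a_model", unique_suffix=True) for _ in range(4)}

print(len(shared), len(unique))  # prints: 1 4
```

With the fixed suffix all four runs resolve to one view name, which is the overwrite described above. With a per-run UUID each run has its own view.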

Who will this benefit?

This benefits anyone who uses replace_where as their incremental strategy. Example use case: run the same incremental model concurrently with different --vars to insert multiple data partitions in parallel:

dbt run --select fct_not_a_model --vars '{"country": "us", "run_date": "20240101"}'
dbt run --select fct_not_a_model --vars '{"country": "uk", "run_date": "20240101"}'
dbt run --select fct_not_a_model --vars '{"country": "cn", "run_date": "20240101"}'
dbt run --select fct_not_a_model --vars '{"country": "jp", "run_date": "20240101"}'
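For reference, one way to launch these runs at the same time is a small Python wrapper. This is only a sketch: build_commands and run_parallel are hypothetical helper names, and it assumes the dbt CLI is on PATH and is invoked from the project directory.

```python
import json
import subprocess


def build_commands(model, countries, run_date):
    """Build one `dbt run` argv list per country (hypothetical helper)."""
    commands = []
    for country in countries:
        # json.dumps produces the same --vars string as the commands above.
        dbt_vars = json.dumps({"country": country, "run_date": run_date})
        commands.append(["dbt", "run", "--select", model, "--vars", dbt_vars])
    return commands


def run_parallel(commands):
    """Start every command at once, then wait for all of them.

    Returns the exit codes in the same order as the commands.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]
```

Calling run_parallel(build_commands("fct_not_a_model", ["us", "uk", "cn", "jp"], "20240101")) would start all four runs together, which is the situation where the shared __dbt_tmp view gets overwritten.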