dbt-labs / dbt-postgres

Apache License 2.0

Postgres `__dbt_backup` model names may not be unique when multithreaded #20

Closed: tweavers closed this issue 3 weeks ago

tweavers commented 7 months ago

Is this a new bug?

Current Behavior

dbt runs will occasionally fail because `__dbt_backup` table names are not unique. This happens when similarly named models (i.e. names that share their first 51+ characters) are updated at the same time in a multi-threaded run.

dbt surfaces these as database errors.

Expected Behavior

`__dbt_backup` table names should incorporate a unique identifier in the suffix to handle similarly named models. The adapter macro already has a mechanism for this: the `dstring` parameter. This parameter is set to true for temporary tables but not for backup tables.

Relevant code block

{% macro postgres__make_backup_relation(base_relation, backup_relation_type, suffix) %}
    {% set backup_relation = postgres__make_relation_with_suffix(base_relation, suffix, dstring=False) %}
    {{ return(backup_relation.incorporate(type=backup_relation_type)) }}
{% endmacro %}
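The collision mechanics can be sketched in a few lines (an illustrative sketch, not dbt's actual implementation): Postgres truncates identifiers to 63 characters by default, so the base name is trimmed to 63 - len("__dbt_backup") = 51 characters before the static suffix is appended, and any two models sharing their first 51 characters map to the same backup name:

```python
# Illustrative sketch (not dbt's actual code): with a static suffix and
# Postgres's default 63-character identifier limit, the base name is
# truncated to 63 - len(suffix) = 51 characters, so long names collide.
MAX_IDENT = 63

def backup_name(model_name: str, suffix: str = "__dbt_backup") -> str:
    return model_name[:MAX_IDENT - len(suffix)] + suffix

a = backup_name("model_with_a_very_long_name_that_we_made_for_some_reason")
b = backup_name("model_with_a_very_long_name_that_we_made_for_some_reason_v2")
print(a)       # model_with_a_very_long_name_that_we_made_for_some_r__dbt_backup
print(a == b)  # True: both models want the same backup relation name
```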

Steps To Reproduce

  1. Create a Postgres project
  2. Create either a very large lookup table or one with no data
  3. Create 2 models that reference the large table from step 2
  4. Run dbt multi-threaded: `dbt --log-level debug run --select <model_names> --threads 4`

Relevant log output

19:38:44  On model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason: /* {"app": "dbt", "dbt_version": "1.7.8", "profile_name": "postgres", "target_name": "local", "node_id": "model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason"} */

  create  table "postgres"."dbt_teddy"."model_with_a_very_long_name_that_we_made_for_some_reas__dbt_tmp"

    as

  (
    select *
from "postgres"."dbt_prod_schema"."some_table_with_nothing"
  );

19:38:44  On model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason_v2: /* {"app": "dbt", "dbt_version": "1.7.8", "profile_name": "postgres", "target_name": "local", "node_id": "model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason_v2"} */

  create  table "postgres"."dbt_teddy"."model_with_a_very_long_name_that_we_made_for_some_reas__dbt_tmp"

    as

  (
    select *
from "postgres"."dbt_prod_schema"."some_table_with_nothing"
  );

19:38:44  SQL status: SELECT 0 in 0.0 seconds
19:38:44  Using postgres connection "model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason_v2"
19:38:44  On model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason_v2: /* {"app": "dbt", "dbt_version": "1.7.8", "profile_name": "postgres", "target_name": "local", "node_id": "model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason_v2"} */
alter table "postgres"."dbt_teddy"."model_with_a_very_long_name_that_we_made_for_some_reason_v2" rename to "model_with_a_very_long_name_that_we_made_for_some_r__dbt_backup"
19:38:44  SQL status: ALTER TABLE in 0.0 seconds
19:38:44  Using postgres connection "model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason_v2"
19:38:44  On model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason_v2: /* {"app": "dbt", "dbt_version": "1.7.8", "profile_name": "postgres", "target_name": "local", "node_id": "model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason_v2"} */
alter table "postgres"."dbt_teddy"."model_with_a_very_long_name_that_we_made_for_some_reas__dbt_tmp" rename to "model_with_a_very_long_name_that_we_made_for_some_reason_v2"
19:38:44  SQL status: ALTER TABLE in 0.0 seconds
19:38:44  On model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason_v2: COMMIT
19:38:44  Using postgres connection "model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason_v2"
19:38:44  On model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason_v2: COMMIT
19:38:44  SQL status: COMMIT in 0.0 seconds
19:38:44  Applying DROP to: "postgres"."dbt_teddy"."model_with_a_very_long_name_that_we_made_for_some_r__dbt_backup"
19:38:44  SQL status: SELECT 0 in 0.0 seconds
19:38:44  Using postgres connection "model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason_v2"
19:38:44  Using postgres connection "model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason"
19:38:44  On model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason_v2: /* {"app": "dbt", "dbt_version": "1.7.8", "profile_name": "postgres", "target_name": "local", "node_id": "model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason_v2"} */
drop table if exists "postgres"."dbt_teddy"."model_with_a_very_long_name_that_we_made_for_some_r__dbt_backup" cascade
19:38:44  On model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason: /* {"app": "dbt", "dbt_version": "1.7.8", "profile_name": "postgres", "target_name": "local", "node_id": "model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason"} */
alter table "postgres"."dbt_teddy"."model_with_a_very_long_name_that_we_made_for_some_reason" rename to "model_with_a_very_long_name_that_we_made_for_some_r__dbt_backup"
19:38:44  SQL status: DROP TABLE in 0.0 seconds
19:38:44  Timing info for model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason_v2 (execute): 13:38:44.801956 => 13:38:44.907258
19:38:44  Postgres adapter: Postgres error: relation "model_with_a_very_long_name_that_we_made_for_some_r__dbt_backup" already exists

19:38:44  On model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason_v2: Close
19:38:44  On model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason: ROLLBACK
19:38:44  2 of 2 OK created sql table model dbt_teddy.model_with_a_very_long_name_that_we_made_for_some_reason_v2  [SELECT 0 in 0.12s]
19:38:44  Finished running node model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason_v2
19:38:44  Postgres adapter: Error running SQL: macro rename_relation
19:38:44  Postgres adapter: Rolling back transaction.
19:38:44  Timing info for model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason (execute): 13:38:44.828108 => 13:38:44.911448
19:38:44  On model.project_name.model_with_a_very_long_name_that_we_made_for_some_reason: Close
19:38:44  Database Error in model model_with_a_very_long_name_that_we_made_for_some_reason (models/marts/test/model_with_a_very_long_name_that_we_made_for_some_reason.sql)
  relation "model_with_a_very_long_name_that_we_made_for_some_r__dbt_backup" already exists
  compiled Code at target/run/project_name/models/marts/test/model_with_a_very_long_name_that_we_made_for_some_reason.sql
19:38:44  1 of 2 ERROR creating sql table model dbt_teddy.model_with_a_very_long_name_that_we_made_for_some_reason  [ERROR in 0.12s]

Environment

- OS: Ubuntu 20.04
- Python: 3.10.3
- dbt-postgres: 1.7.8

Additional Context

No response

dbeatty10 commented 7 months ago

Thanks for raising this issue and doing that research @tweavers !

Does your example project consistently work if you add the following macro override to your project in order to set dstring=True?

macros/overrides/make_backup_relation.sql

{% macro postgres__make_backup_relation(base_relation, backup_relation_type, suffix) %}
    {% set backup_relation = postgres__make_relation_with_suffix(base_relation, suffix, dstring=True) %}
    {{ return(backup_relation.incorporate(type=backup_relation_type)) }}
{% endmacro %}
dbeatty10 commented 7 months ago

The following setup was able to yield the same error you reported a majority of the time (but not every single time).

But when I added `dstring=True` as in https://github.com/dbt-labs/dbt-postgres/issues/20#issuecomment-1959626260, it worked consistently.

### Reprex

`models/empty_lookup_table.sql`:

```sql
{{ config(materialized="table", tags="reprex") }}

select 1 as id
where 1=0
```

Then create 4 different models that have the same content and only differ in their file names:

1. `models/model_with_a_very_long_name_that_we_made_for_some_reason_a.sql`
2. `models/model_with_a_very_long_name_that_we_made_for_some_reason_b.sql`
3. `models/model_with_a_very_long_name_that_we_made_for_some_reason_c.sql`
4. `models/model_with_a_very_long_name_that_we_made_for_some_reason_d.sql`

Here's the contents to duplicate across those 4 models:

```sql
{{ config(materialized="table", tags="reprex") }}

-- depends on:
-- {{ ref("empty_lookup_table") }}

select 1 as id
```

Run this command to get an error:

```shell
dbt run --select tag:reprex --threads 4
```
tweavers commented 7 months ago

@dbeatty10 - appreciate the quick follow-up. Confirming that the override macro is working - thank you!

We're seeing the backup tables now named like this:

rename to "model_with_a_very_long_name_that_we_mad__dbt_backup101508522444"

Interestingly, the __dbt_tmp relations aren't appended with the dstring despite the parameters in postgres__make_temp_relation. I'm surprised this doesn't throw an error that the table already exists. I've tested setting dstring to false in a similar manner and they are named the same. Example below, also captured in the logs posted earlier:

model_with_a_very_long_name_that_we_made_for_some_reas__dbt_tmp
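For what it's worth, the digits in that backup suffix read like an HHMMSS-plus-microseconds timestamp (10:15:08.522444). Here is a sketch of how a datetime-derived suffix keeps concurrent backup names distinct; the strftime format is an assumption inferred from the observed name, not dbt's verbatim code:

```python
from datetime import datetime

MAX_IDENT = 63  # Postgres default identifier limit

def backup_name_with_dstring(model_name: str, suffix: str = "__dbt_backup") -> str:
    # Assumed format: HHMMSS plus microseconds (12 digits), consistent
    # with the observed "...__dbt_backup101508522444".
    dstring = datetime.now().strftime("%H%M%S%f")
    full_suffix = suffix + dstring
    return model_name[:MAX_IDENT - len(full_suffix)] + full_suffix

name = backup_name_with_dstring("model_with_a_very_long_name_that_we_made_for_some_reason")
print(len(name))  # 63: base truncated to 39 chars + 24-char suffix
```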

dbeatty10 commented 6 months ago

Here's the only difference between dstring=False and dstring=True:

21c21
< alter table "postgres"."dbt_dbeatty"."model_with_a_very_long_name_that_we_made_for_some_reason_a" rename to "model_with_a_very_long_name_that_we_made_for_some_r__dbt_backup"
---
> alter table "postgres"."dbt_dbeatty"."model_with_a_very_long_name_that_we_made_for_some_reason_a" rename to "model_with_a_very_long_name_that_we_mad__dbt_backup135949342860"
26c26
< drop table if exists "postgres"."dbt_dbeatty"."model_with_a_very_long_name_that_we_made_for_some_r__dbt_backup" cascade
---
> drop table if exists "postgres"."dbt_dbeatty"."model_with_a_very_long_name_that_we_mad__dbt_backup135949342860" cascade

When dstring=False, it looks like the ...__dbt_backup table is created in one transaction and then dropped in a different one. So that could explain a collision between the transactions in the different threads.

Here is a lightly formatted version of the queries for dstring=False from the script below:

BEGIN
create  table "postgres"."dbt_dbeatty"."model_with_a_very_long_name_that_we_made_for_some_reas__dbt_tmp" as  (
    -- depends on:
    --   "postgres"."dbt_dbeatty"."empty_lookup_table"
    select 'a' as id
);
alter table "postgres"."dbt_dbeatty"."model_with_a_very_long_name_that_we_made_for_some_reason_a" rename to "model_with_a_very_long_name_that_we_made_for_some_r__dbt_backup"
alter table "postgres"."dbt_dbeatty"."model_with_a_very_long_name_that_we_made_for_some_reas__dbt_tmp" rename to "model_with_a_very_long_name_that_we_made_for_some_reason_a"
COMMIT
drop table if exists "postgres"."dbt_dbeatty"."model_with_a_very_long_name_that_we_made_for_some_r__dbt_backup" cascade
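That window can be reproduced with a toy simulation (the schema modelled as a plain set of relation names, with shortened hypothetical names; this is not dbt code, just an illustration of the interleaving above):

```python
# Two "threads" racing over one shared, truncated backup name.
schema = {"model_reason", "model_reason_v2"}  # hypothetical short names
BACKUP = "model_r__dbt_backup"                # shared truncated backup name

def rename(old: str, new: str) -> None:
    # Mimics ALTER TABLE ... RENAME TO: fails if the target name exists.
    if new in schema:
        raise RuntimeError(f'relation "{new}" already exists')
    schema.remove(old)
    schema.add(new)

rename("model_reason_v2", BACKUP)   # thread v2 renames and commits
try:
    rename("model_reason", BACKUP)  # thread v1 renames before v2's DROP
except RuntimeError as err:
    print(err)                      # relation "model_r__dbt_backup" already exists
schema.discard(BACKUP)              # thread v2's deferred DROP runs too late
```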
### Script to show diff

Assuming [`jq`](https://jqlang.github.io/jq/download/) is installed:

```shell
mkdir -p macros/overrides/
cat << EOF > macros/overrides/make_backup_relation.sql
{% macro postgres__make_backup_relation(base_relation, backup_relation_type, suffix) %}
    {% set backup_relation = postgres__make_relation_with_suffix(base_relation, suffix, dstring=False) %}
    {{ return(backup_relation.incorporate(type=backup_relation_type)) }}
{% endmacro %}
EOF

rm -rf logs
dbt --log-format-file json run --select tag:reprex --threads 4
jq -r -c '.data | select(.node_info.resource_type == "model" and .node_info.node_name == "model_with_a_very_long_name_that_we_made_for_some_reason_a") | select(.sql != null) | .sql' logs/dbt.log | while IFS= read -r line; do echo $line; done > dbt.log.model_with_a_very_long_name_that_we_made_for_some_reason_a.false.sql

mkdir -p macros/overrides/
cat << EOF > macros/overrides/make_backup_relation.sql
{% macro postgres__make_backup_relation(base_relation, backup_relation_type, suffix) %}
    {% set backup_relation = postgres__make_relation_with_suffix(base_relation, suffix, dstring=True) %}
    {{ return(backup_relation.incorporate(type=backup_relation_type)) }}
{% endmacro %}
EOF

rm -rf logs
dbt --log-format-file json run --select tag:reprex --threads 4
jq -r -c '.data | select(.node_info.resource_type == "model" and .node_info.node_name == "model_with_a_very_long_name_that_we_made_for_some_reason_a") | select(.sql != null) | .sql' logs/dbt.log | while IFS= read -r line; do echo $line; done > dbt.log.model_with_a_very_long_name_that_we_made_for_some_reason_a.true.sql
```

```shell
diff dbt.log.model_with_a_very_long_name_that_we_made_for_some_reason_a.false.sql dbt.log.model_with_a_very_long_name_that_we_made_for_some_reason_a.true.sql
```
github-actions[bot] commented 4 weeks ago

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions[bot] commented 3 weeks ago

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.