dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
https://getdbt.com
Apache License 2.0
9.25k stars 1.53k forks source link

[Bug] Incorrect column data type with incremental contracted model and varchar data_type #10362

Open gshank opened 3 days ago

gshank commented 3 days ago

Current Behavior

A temporary table is created for incremental models when the table already exists. When this temporary table is created and one of the column contracted data_types is "character varying(1)" (size is not significant) the string size is lost in the creation of the temporary relation and instead the column is altered after the temporary table is created like: alter table "xxx"."xxxx"."xxx" add column "__dbt_alter" character varying(256);

The column is created with the wrong string size.

Expected Behavior

The column is created with the string size that's specified in the contract.

Steps to recreate

Create an incremental model, update it with additional rows (causing the creation of a temporary table) and observe the the wrong string size is used.

Additional Context

Reported internally by Lee Bond-Kennedy.

lbk-fishtown commented 3 days ago

I was able to reproduce this for both varchar(1) and character varying(1). In both cases running dbt-postgres 1.6.16 the following occurs on the incremental run.

Both models are the same SQL

with source_data as (

    select 1 as id, 'a' as vchar

)

select *
from source_data

yml

models:
  - name: my_first_dbt_model
    config:
      materialized: incremental
      on_schema_change: append_new_columns
      contract:
        enforced: true
    description: "A starter dbt model"
    columns:
      - name: id
        description: "The primary key for this table"
        data_type: int
      - name: vchar
        description: "Test varying char contract"
        data_type: character varying(1)

  - name: my_other_model
    config:
      materialized: incremental
      on_schema_change: append_new_columns
      contract:
        enforced: true
    columns:
      - name: id
        description: "The primary key for this table"
        data_type: int
      - name: vchar
        description: "Test varying char contract"
        data_type: varchar(1)

logs

^[[0m10:17:46.720110 [debug] [Thread-1 (]: SQL status: SELECT 2 in 0.0 seconds
^[[0m10:17:46.720852 [debug] [Thread-1 (]: Changing col type from character varying(1) to character varying(256) in table database: "lee"
schema: "lee"
identifier: "my_first_dbt_model"

^[[0m10:17:46.724173 [debug] [Thread-1 (]: Using postgres connection "model.pgtest.my_first_dbt_model"
^[[0m10:17:46.724366 [debug] [Thread-1 (]: On model.pgtest.my_first_dbt_model: /* {"app": "dbt", "dbt_version": "1.6.16", "profile_name": "pgtest", "target_name": "dev", "node_id": "model.pgtest.my_first_dbt_model"} */

    alter table "lee"."lee"."my_first_dbt_model" add column "vchar__dbt_alter" character varying(256);
    update "lee"."lee"."my_first_dbt_model" set "vchar__dbt_alter" = "vchar";
    alter table "lee"."lee"."my_first_dbt_model" drop column "vchar" cascade;
    alter table "lee"."lee"."my_first_dbt_model" rename column "vchar__dbt_alter" to "vchar"
^[[0m10:21:12.614573 [debug] [Thread-1 (]: SQL status: SELECT 2 in 0.0 seconds
^[[0m10:21:12.615230 [debug] [Thread-1 (]: Changing col type from character varying(1) to character varying(256) in table database: "lee"
schema: "lee"
identifier: "my_other_model"

^[[0m10:21:12.618429 [debug] [Thread-1 (]: Using postgres connection "model.pgtest.my_other_model"
^[[0m10:21:12.618613 [debug] [Thread-1 (]: On model.pgtest.my_other_model: /* {"app": "dbt", "dbt_version": "1.6.16", "profile_name": "pgtest", "target_name": "dev", "node_id": "model.pgtest.my_other_model"} */

    alter table "lee"."lee"."my_other_model" add column "vchar__dbt_alter" character varying(256);
    update "lee"."lee"."my_other_model" set "vchar__dbt_alter" = "vchar";
    alter table "lee"."lee"."my_other_model" drop column "vchar" cascade;
    alter table "lee"."lee"."my_other_model" rename column "vchar__dbt_alter" to "vchar"