dbt-labs / dbt-postgres

Apache License 2.0
31 stars 12 forks source link

[CT-2957] [Bug] Cannot specify column contract of type UUID in PostgreSQL with DBT 1.5.3 #54

Closed epgui closed 1 year ago

epgui commented 1 year ago

Is this a new bug in dbt-core?

Current Behavior

I have a source table with the following DDL, populated with three records:

CREATE TABLE src_onestream_raw_table.confirm_rule (
    conf_group_key uuid NOT NULL,
    display_order integer NOT NULL,
    display_type integer NOT NULL,
    failure_action integer NOT NULL,
    failure_message character varying NOT NULL,
    rule_formula character varying NOT NULL,
    rule_frequency integer NOT NULL,
    rule_name character varying NOT NULL,
    rule_text character varying NOT NULL,
    time_filter_for_reqt_freq character varying NOT NULL,
    unique_id uuid PRIMARY KEY,
    warning_message character varying NOT NULL,
    xml_data character varying NOT NULL
);
CREATE UNIQUE INDEX confirm_rule_pkey ON src_onestream_raw_table.confirm_rule(unique_id uuid_ops);

I have created the following staging model:

WITH

src AS (
    SELECT * FROM {{ source('onestream_raw_table', 'confirm_rule') }}
),

final AS (
    SELECT
        conf_group_key,
        display_order,
        display_type,
        failure_action,
        failure_message::TEXT,
        rule_formula::TEXT,
        rule_frequency,
        rule_name::TEXT,
        rule_text::TEXT,
        time_filter_for_reqt_freq::TEXT,
        unique_id,
        warning_message::TEXT,
        xml_data::TEXT
    FROM src
)

SELECT * FROM final

I have configured it this way:

version: 2

models:
  # ...
  - name: stg_onestream_raw_table__confirm_rule
    constraints:
      - type: primary_key
        columns: [unique_id]
    columns:
      - name: conf_group_key
        data_type: UUID
      - name: display_order
        data_type: INTEGER
      - name: display_type
        data_type: INTEGER
      - name: failure_action
        data_type: INTEGER
      - name: failure_message
        data_type: TEXT
      - name: rule_formula
        data_type: TEXT
      - name: rule_frequency
        data_type: INTEGER
      - name: rule_name
        data_type: TEXT
      - name: rule_text
        data_type: TEXT
      - name: time_filter_for_reqt_freq
        data_type: TEXT
      - name: unique_id
        data_type: UUID
      - name: warning_message
        data_type: TEXT
      - name: xml_data
        data_type: TEXT
    config:
      contract:
        enforced: true
    # ...

When I run DBT, I get the following error, with no extra context and no other information:

2023-08-10 12:57:36 15:57:36  Unhandled error while executing 
2023-08-10 12:57:36 2950

It looks like 2950 is the oid of the UUID data type in postgres: https://github.com/postgres/postgres/blob/5e0c761d0a13c7b4f7c5de618ac38560d74d74d0/src/include/catalog/pg_type.dat#L403

If I cast all my UUIDs to TEXT, and change my contracts to TEXT instead of UUID, the error disappears. Without the contracts, the model runs just fine and the UUID type works as expected.

Expected Behavior

See previous section

Steps To Reproduce

See previous section

Relevant log output

2023-08-10 12:57:36 15:57:36  Unhandled error while executing 
2023-08-10 12:57:36 2950

No further context or any other information is output with this error. It looks like 2950 is the oid of the UUID data type in postgres: https://github.com/postgres/postgres/blob/5e0c761d0a13c7b4f7c5de618ac38560d74d74d0/src/include/catalog/pg_type.dat#L403

Environment

- Docker image "python:3.10.8" running on a mac
- Python: 3.10.8
- dbt: 1.5.3

Which database adapter are you using with dbt?

postgres

Additional Context

No response

gmontanola commented 1 year ago

I just faced this error! I'll submit a PR ASAP

dbeatty10 commented 1 year ago

Thank you for reporting this @epgui and for raising a PR @gmontanola 🙌

jenna-jordan commented 1 year ago

Just wanted to note that I just ran into this exact error - "Unhandled error while executing" for enforcing a model contract with uuid columns. Example yaml:

    columns:
      - name: payment_batch_id
        data_type: uuid
        description: ""

I also ran into this issue with the money datatype, which is not mentioned in this issue or the PR addressing it (I think)

bruint commented 1 year ago

This PR is important as it solves an error with the snowplow_web package.

With the current adapter, you will receive a 2950 Runtime Error.

Can confirm that this PR resolves the problem.

rlh1994 commented 1 year ago

@dbeatty10 do you know when the PR for this is likely to be merged please? This is an open issue for us and I'd prefer not to have to alter all our packages to put a workaround in for this.

For the record we don't currently use contracts or even specify the column type in the table that is erroring, but just having a uuid type column causes this error.

EDIT: We're getting the issue from the get_column_schema_from_query macro, without having to enforce contracts at all.

dbeatty10 commented 1 year ago

@rlh1994 Thanks for opening https://github.com/dbt-labs/dbt-core/issues/8877! It does a nice job of explaining the general case of unknown data types and laying out options. Since it is more broad in impact, we are likely to prioritize https://github.com/dbt-labs/dbt-core/issues/8877 over https://github.com/dbt-labs/dbt-postgres/issues/54. Not to say that both can't be done, just that we'd likely approach them in that order.