dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
https://getdbt.com
Apache License 2.0
9.92k stars 1.63k forks source link

[Bug] dbt unit testing does not supported quoted columns #10748

Closed cboethigtrellance closed 1 month ago

cboethigtrellance commented 1 month ago

Is this a new bug in dbt-core?

Current Behavior

While running my dbt unit tests for one of my models, it fails to compile. I am running this test on a table that has quoted columns (Bad practice I know). As shown in the output, it does not enforce quoted column names in my models.yml file, therefore trying to select non quoted columns that have spaces and special characters.

I have searched the documentation for any properties relating to quotes but could not find any here: https://docs.getdbt.com/reference/resource-properties/unit-tests

Here is my unit test:

unit_tests:
  - name: load_current_branches
    description: loads branches with the current row flag of 'Y'
    model: cu_branch
    overrides:
      macros:
        is_incremental: true
    given:
      - input: ref('branch')
        rows:
          - { branch_id: 1, branch_address_line_1: '123 st', branch_address_line_2: '', branch_name: 'branch 1', branch_postal_code: '1234', current_row_flag: 'N' }
          - { branch_id: 1, branch_address_line_1: '456 st', branch_address_line_2: '', branch_name: 'branch 1', branch_postal_code: '1234', current_row_flag: 'Y' }
    expect:
      format: sql
      rows: |
        select 
        '1' as "Branch Id",
        '456 st' as "Branch Address Line 1",
        '' as "Branch Address Line 2",
        'branch 1' as "Branch Name",
        '1234' as "Branch Postal Code"

Here is the models.yml file:

version: 2
models:
  - name: cu_branch
    config:
      contract:
        enforced: true
    columns:
      - name: Branch Id
        quote: true
        data_type: varchar(50)
        constraints:
          - type: not_null
      - name: Branch Address Line 1
        quote: true
        data_type: varchar(250)
      - name: Branch Address Line 2
        quote: true
        data_type: varchar(250)
      - name: Branch Name
        quote: true
        data_type: varchar(250)
      - name: Branch Postal Code
        quote: true
        data_type: varchar(15)

Expected Behavior

The test should pass however it won't even compile correctly

Steps To Reproduce

  1. run: dbt test -s load_current_branches

Relevant log output

CLI error:

An error occurred during execution of unit test 'load_current_branches'. There may be an error in the unit test definition: check the data types.
   Database Error
    001003 (42000): SQL compilation error:
    syntax error line 3 at position 29 unexpected 'Line'.
    syntax error line 3 at position 43 unexpected 'Address'.
    syntax error line 3 at position 65 unexpected 'Name'.
    syntax error line 3 at position 77 unexpected 'Postal'.
    syntax error line 3 at position 96 unexpected 'System'.

Compiled sql:

-- Build actual result given inputs
with dbt_internal_unit_test_actual as (
  select
    Branch Id,Branch Address Line 1,Branch Address Line 2,Branch Name,Branch Postal Code, 'actual' as "actual_or_expected"
  from (

    with  __dbt__cte__branch as (

-- Fixture for branch
select 

        try_cast('1' as NUMBER(38,0))
     as branch_id, try_cast(null as DATE) as branch_row_begin_date, 

        try_cast('branch 1' as character varying(16777216))
     as branch_name, try_cast(null as DATE) as row_end_date, 

        try_cast('123 st' as character varying(16777216))
     as branch_address_line_1, 

        try_cast('' as character varying(16777216))
     as branch_address_line_2, 

        try_cast('1234' as character varying(16777216))
     as branch_postal_code, try_cast(null as NUMBER(38,0)), 

        try_cast('N' as character varying(1))
     as current_row_flag
union all
select 

        try_cast('1' as NUMBER(38,0))
     as branch_id, try_cast(null as DATE) as branch_row_begin_date, 

        try_cast('branch 1' as character varying(16777216))
     as branch_name, try_cast(null as DATE) as row_end_date, 

        try_cast('456 st' as character varying(16777216))
     as branch_address_line_1, 

        try_cast('' as character varying(16777216))
     as branch_address_line_2, 

        try_cast('1234' as character varying(16777216))
     as branch_postal_code, 

        try_cast('Y' as character varying(1))
     as current_row_flag
), branches as (
        select
            branch_id::varchar as branch_id,
            branch_address_line_1,
            branch_address_line_2,
            branch_name,
            branch_postal_code,
            source_system_code,
            tenant_id
        from __dbt__cte__branch
        where current_row_flag = 'Y'
    )

-- noqa: disable=L057
select distinct
    branch_id as "Branch Id",
    branch_address_line_1 as "Branch Address Line 1",
    branch_address_line_2 as "Branch Address Line 2",
    branch_name as "Branch Name",
    branch_postal_code as "Branch Postal Code",
from branches
  ) _dbt_internal_unit_test_actual
),
-- Build expected result
dbt_internal_unit_test_expected as (
  select
    Branch Id, Branch Address Line 1, Branch Address Line 2, Branch Name, Branch Postal Code, 'expected' as "actual_or_expected"
  from (
    select 
'1' as "Branch Id",
'456 st' as "Branch Address Line 1",
'' as "Branch Address Line 2",
'branch 1' as "Branch Name",
'1234' as "Branch Postal Code"
  ) _dbt_internal_unit_test_expected
)
-- Union actual and expected results
select * from dbt_internal_unit_test_actual
union all
select * from dbt_internal_unit_test_expected
    syntax error line 3 at position 115 unexpected 'Id'.
    syntax error line 3 at position 127 unexpected 'Value'.
    syntax error line 3 at position 134 unexpected ''actual''.
    syntax error line 3 at position 146 unexpected '"actual_or_expected"'.
    syntax error line 4 at position 2 unexpected 'from'.

Environment

- OS:
- Python: 3.9.10
- dbt: 1.8.5
- dbt-snowflake: 1.8.0

Which database adapter are you using with dbt?

snowflake

Additional Context

No response

dbeatty10 commented 1 month ago

Thanks for raising this issue @cboethigtrellance !

I think the root cause is that unit tests are not taking the quote configuration into account.

Reprex

Create these files:

models/my_model.sql

select 1 as "Branch Id"

models/_properties.yaml

models:
  - name: my_model
    columns:
      - name: Branch Id
        quote: true
        tests:
          - not_null

unit_tests:
  - name: dbt_core_10748
    model: my_model
    given: []
    expect:
      rows:
          - {"Branch Id": "1"}

Run these commands:

dbt test -s my_model
dbeatty10 commented 1 month ago

We already have an issue open for this, so I'm going to close this as a duplicate of https://github.com/dbt-labs/dbt-adapters/issues/205.