dbt-labs / dbt-snowflake

dbt-snowflake contains all of the code enabling dbt to work with Snowflake
https://getdbt.com
Apache License 2.0

[Bug] dbt unit tests raise an error when `QUOTED_IDENTIFIERS_IGNORE_CASE = true` in Snowflake #1160

Open ivanroquev opened 5 months ago

ivanroquev commented 5 months ago

Is this a new bug in dbt-core?

Current Behavior

No meaningful log output without `--log-level debug`, and the error is unexpected.

14:14:13 Finished running 4 data tests, 1 unit test, 1 project hook in 0 hours 0 minutes and 9.68 seconds (9.68s).
14:14:13 Command end result
14:14:13 Completed with 1 error and 0 warnings:
14:14:13 'actual_or_expected'
14:14:13 Done. PASS=4 WARN=0 ERROR=1 SKIP=0 TOTAL=5

Expected Behavior

The test should complete successfully, like:

06:57:00 Finished running 4 data tests, 1 unit test, 1 project hook in 0 hours 0 minutes and 11.69 seconds (11.69s).
06:57:00 Completed successfully
06:57:00 Done. PASS=5 WARN=0 ERROR=0 SKIP=0 TOTAL=5

Steps To Reproduce

With this config:

models:
  - name: fact_chargers
    description: This table has basic information about charges
    columns:
      - name: country
        data_tests:
          - accepted_values:
              values: ['LU', 'BE']
      - name: transaction_id
        data_tests:
          - unique
          - not_null
      - name: external_id
      - name: promotion_id
      - name: is_promotion_applied
        data_tests:
          - accepted_values:
              name: is_promotion_applied_accepted_values
              values: [ true, false ]

unit_tests:
  - name: test_fact_chargers_external_id
    description: "Unit test FACT_CHARGERS"
    model: fact_chargers
    overrides:
      macros:
        # unit test this model in incremental mode
        is_incremental: true
    given:
      - input: source("DW", "fact_chargers")
        rows:
           - {
               transaction_id: transaction1,
               business_date: 2024-06-11,
               external_id: null,
               promotion_id: null,
               is_promotion_applied: false,
             }
           - {
              transaction_id: transaction2,
              business_date: 2024-06-11,
              external_id: null,
              promotion_id: null,
              is_promotion_applied: false,
             }
      - input: source("DW", "fact_promotions")
        rows:
          - {subtype: redemption, channel: emobility, external_id: ext-abcd, business_date: 2024-06-11, coupon_id: abcd1234, ticket_id: transaction1}
          - {subtype: redemption, channel: emobility, external_id: ext-addd, business_date: 2024-06-11, coupon_id: addd1235, ticket_id: transaction2}
          - {subtype: assignation, channel: emobility, external_id: ext-asig, business_date: 2024-06-11, coupon_id: addd1235, ticket_id: transaction1}
          - {subtype: redemption, channel: store, external_id: ext-store-redemption, business_date: 2024-06-11, coupon_id: addd1235, ticket_id: transaction2}
    expect:
      rows:
        - {
            transaction_id: transaction1,
            business_date: 2024-06-11,
            external_id: ext-abcd,
            promotion_id: abcd1234,
            is_promotion_applied: true,
          }
        - {
            transaction_id: transaction2,
            business_date: 2024-06-11,
            external_id: ext-addd,
            promotion_id: addd1235,
            is_promotion_applied: true,
          }

Running the test like this:

dbt test --select fact_chargers -t stg --vars '{"from_date_key": "2024-06-10", "to_date_key": "2024-06-11"}'

With this output:

+--------------+-------------+-----------+------------+--------------------+------------------+
|TRANSACTION_ID|BUSINESS_DATE|EXTERNAL_ID|PROMOTION_ID|IS_PROMOTION_APPLIED|ACTUAL_OR_EXPECTED|
+--------------+-------------+-----------+------------+--------------------+------------------+
|transaction1  |2024-06-11   |ext-abcd   |abcd1234    |true                |actual            |
|transaction2  |2024-06-11   |ext-addd   |addd1235    |true                |actual            |
|transaction1  |2024-06-11   |ext-abcd   |abcd1234    |true                |expected          |
|transaction2  |2024-06-11   |ext-addd   |addd1235    |true                |expected          |
+--------------+-------------+-----------+------------+--------------------+------------------+

Relevant log output

14:14:12  SQL status: SUCCESS 4 in 1.0 seconds
14:14:12  Applying DROP to: DEVELOPMENT.star_model.test_fact_chargers_external_id__dbt_tmp
14:14:12  Using snowflake connection "unit_test.bs.fact_chargers.test_fact_chargers_external_id"
14:14:12  On unit_test.bs.fact_chargers.test_fact_chargers_external_id: /* {"app": "dbt", "dbt_version": "1.8.2", "profile_name": "BS", "target_name": "stg", "node_id": "unit_test.bs.fact_chargers.test_fact_chargers_external_id"} */
drop table if exists DEVELOPMENT.star_model.test_fact_chargers_external_id__dbt_tmp cascade
14:14:12  SQL status: SUCCESS 1 in 0.0 seconds
 -------- SCHEMA ------------- 
| column               | data_type |
| -------------------- | --------- |
| TRANSACTION_ID       | Text      |
| BUSINESS_DATE        | Date      |
| EXTERNAL_ID          | Text      |
| PROMOTION_ID         | Text      |
| IS_PROMOTION_APPLIED | Boolean   |
| ACTUAL_OR_EXPECTED   | Text      |

actual
14:14:12  On unit_test.bs.fact_chargers.test_fact_chargers_external_id: Close
14:14:13  Unhandled error while executing 
'actual_or_expected'
14:14:13  Traceback (most recent call last):
  File "/Users/ivan.roque/Library/Python/3.9/lib/python/site-packages/dbt/task/base.py", line 368, in safe_run
    result = self.compile_and_execute(manifest, ctx)
  File "/Users/ivan.roque/Library/Python/3.9/lib/python/site-packages/dbt/task/base.py", line 314, in compile_and_execute
    result = self.run(ctx.node, manifest)
  File "/Users/ivan.roque/Library/Python/3.9/lib/python/site-packages/dbt/task/base.py", line 415, in run
    return self.execute(compiled_node, manifest)
  File "/Users/ivan.roque/Library/Python/3.9/lib/python/site-packages/dbt/task/test.py", line 264, in execute
    unit_test_node, unit_test_result = self.execute_unit_test(test, manifest)
  File "/Users/ivan.roque/Library/Python/3.9/lib/python/site-packages/dbt/task/test.py", line 237, in execute_unit_test
    actual = self._get_unit_test_agate_table(table, "actual")
  File "/Users/ivan.roque/Library/Python/3.9/lib/python/site-packages/dbt/task/test.py", line 339, in _get_unit_test_agate_table
    unit_test_table = result_table.where(
  File "/Users/ivan.roque/Library/Python/3.9/lib/python/site-packages/agate/table/where.py", line 21, in where
    if test(row):
  File "/Users/ivan.roque/Library/Python/3.9/lib/python/site-packages/dbt/task/test.py", line 340, in <lambda>
    lambda row: row["actual_or_expected"] == actual_or_expected
  File "/Users/ivan.roque/Library/Python/3.9/lib/python/site-packages/agate/mapped_sequence.py", line 88, in __getitem__
    return self.dict()[key]
KeyError: 'actual_or_expected'

14:14:13  5 of 5 ERROR fact_chargers::test_fact_chargers_external_id ..................... [ERROR in 3.62s]

Environment

- OS: MacOS
- Python: 3.9
- dbt: 1.8.2
- dbt-snowflake: 1.8.2

Which database adapter are you using with dbt?

snowflake

Additional Context

I had to replace the `_get_unit_test_agate_table` method so it converts all dict keys to lowercase, like this:

def _get_unit_test_agate_table(self, result_table, actual_or_expected: str):
    import agate
    # Convert every key of each row to lowercase
    def lowercase_keys(row):
        return {k.lower(): v for k, v in row.items()}

    # Lowercase the keys of every row in result_table
    lowercased_rows = [lowercase_keys(row) for row in result_table.rows]

    filtered_rows = [
        row for row in lowercased_rows if row.get("actual_or_expected") == actual_or_expected.lower()
    ]

    # Convert the filtered rows back into an agate.Table
    unit_test_table = agate.Table.from_object(filtered_rows)

    columns = list(unit_test_table.columns.keys())
    columns.remove("actual_or_expected")

    return unit_test_table.select(columns)

The original method is:

def _get_unit_test_agate_table(self, result_table, actual_or_expected: str):
    unit_test_table = result_table.where(
        lambda row: row["actual_or_expected"].lower() == actual_or_expected
    )
    columns = list(unit_test_table.columns.keys())
    columns.remove("actual_or_expected")
    return unit_test_table.select(columns)
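For illustration, the same case-insensitive idea can be sketched without agate. This is a hypothetical, dependency-free helper (not dbt code, and `filter_unit_test_rows` is an invented name): it lowercases each row's keys before filtering, so it works whether Snowflake returns `actual_or_expected` or `ACTUAL_OR_EXPECTED`.

```python
# Hypothetical sketch of the case-insensitive fix; mimics what the
# patched _get_unit_test_agate_table does, using plain dicts.
def filter_unit_test_rows(rows, actual_or_expected):
    """Filter rows by the actual/expected marker, tolerating any key case."""
    filtered = []
    for row in rows:
        lowered = {k.lower(): v for k, v in row.items()}
        if lowered.get("actual_or_expected") == actual_or_expected:
            # Drop the marker column, as the original method does via select().
            lowered.pop("actual_or_expected")
            filtered.append(lowered)
    return filtered

# With QUOTED_IDENTIFIERS_IGNORE_CASE = true, Snowflake uppercases the column:
rows = [
    {"ID": 2, "ACTUAL_OR_EXPECTED": "actual"},
    {"ID": 2, "ACTUAL_OR_EXPECTED": "expected"},
]
print(filter_unit_test_rows(rows, "actual"))  # [{'id': 2}]
```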
jocelyncui22 commented 4 months ago

Hi, any updates on this error? Our team is encountering similar issues.

jeremyyeo commented 4 months ago

On Snowflake, if the param QUOTED_IDENTIFIERS_IGNORE_CASE is true, then unit tests fail (not expected).

# dbt_project.yml
name: my_dbt_project
profile: all
config-version: 2
version: "1.0.0"

models:
   my_dbt_project:
      +materialized: table

on-run-start: "{{ check_param() }}"

# models/tests.yml
unit_tests:
  - name: test_is_valid
    model: bar
    given:
      - input: ref('foo')
        rows:
          - {id: 1}
    expect:
      rows: 
          - {id: 2}
-- models/foo.sql
select 1 id

-- models/bar.sql
select id * 2 as id from {{ ref('foo') }}

-- macros/check_param.sql
{% macro check_param() %}
    {% set r = run_query("show parameters like 'QUOTED_IDENTIFIERS_IGNORE_CASE' in account;") %}
    {% if execute %}
    {% do r.print_table() %}
    {% endif %}
{% endmacro %}

(1) First make sure param is set to false in Snowflake:

[screenshot]

(2) Build:

$ dbt --debug build

03:35:14  Running 1 on-run-start hook
03:35:14  Using snowflake connection "master"
03:35:14  On master: /* {"app": "dbt", "dbt_version": "1.8.3", "profile_name": "all", "target_name": "sf-eu", "connection_name": "master"} */
show parameters like 'QUOTED_IDENTIFIERS_IGNORE_CASE' in account;
03:35:14  Opening a new connection, currently in state init
03:35:16  SQL status: SUCCESS 1 in 3.0 seconds
| key                  | value | default | level   | description          | type    |
| -------------------- | ----- | ------- | ------- | -------------------- | ------- |
| QUOTED_IDENTIFIER... | false | false   | ACCOUNT | If true, the case... | BOOLEAN |
03:35:16  Writing injected SQL for node "operation.my_dbt_project.my_dbt_project-on-run-start-0"
03:35:16  1 of 1 START hook: my_dbt_project.on-run-start.0 ............................... [RUN]
03:35:16  1 of 1 OK hook: my_dbt_project.on-run-start.0 .................................. [OK in 0.00s]
03:35:16  
03:35:16  On master: Close
03:35:17  Concurrency: 1 threads (target='sf-eu')
03:35:17  
03:35:17  Began running node model.my_dbt_project.foo
03:35:17  1 of 3 START sql table model dbt_jyeo.foo ...................................... [RUN]
03:35:17  Re-using an available connection from the pool (formerly list_analytics_dbt_jyeo, now model.my_dbt_project.foo)
03:35:17  Began compiling node model.my_dbt_project.foo
03:35:17  Writing injected SQL for node "model.my_dbt_project.foo"
03:35:17  Began executing node model.my_dbt_project.foo
03:35:17  Writing runtime sql for node "model.my_dbt_project.foo"
03:35:17  Using snowflake connection "model.my_dbt_project.foo"
03:35:17  On model.my_dbt_project.foo: /* {"app": "dbt", "dbt_version": "1.8.3", "profile_name": "all", "target_name": "sf-eu", "node_id": "model.my_dbt_project.foo"} */
create or replace transient table analytics.dbt_jyeo.foo
         as
        (select 1 id
        );
03:35:17  Opening a new connection, currently in state closed
03:35:20  SQL status: SUCCESS 1 in 3.0 seconds
03:35:20  On model.my_dbt_project.foo: Close
03:35:21  Sending event: {'category': 'dbt', 'action': 'run_model', 'label': '72b6ef92-cde4-4239-bb45-03aff682b5b5', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x155241c10>]}
03:35:21  1 of 3 OK created sql table model dbt_jyeo.foo ................................. [SUCCESS 1 in 4.31s]
03:35:21  Finished running node model.my_dbt_project.foo
03:35:21  Began running node unit_test.my_dbt_project.bar.test_is_valid
03:35:21  2 of 3 START unit_test bar::test_is_valid ...................................... [RUN]
03:35:21  Re-using an available connection from the pool (formerly model.my_dbt_project.foo, now unit_test.my_dbt_project.bar.test_is_valid)
03:35:21  Began compiling node unit_test.my_dbt_project.bar.test_is_valid
03:35:21  Began executing node unit_test.my_dbt_project.bar.test_is_valid
03:35:21  Using snowflake connection "unit_test.my_dbt_project.bar.test_is_valid"
03:35:21  On unit_test.my_dbt_project.bar.test_is_valid: /* {"app": "dbt", "dbt_version": "1.8.3", "profile_name": "all", "target_name": "sf-eu", "node_id": "unit_test.my_dbt_project.bar.test_is_valid"} */
describe table analytics.dbt_jyeo.foo
03:35:21  Opening a new connection, currently in state closed
03:35:24  SQL status: SUCCESS 1 in 2.0 seconds
03:35:24  Writing injected SQL for node "unit_test.my_dbt_project.bar.test_is_valid"
03:35:24  Writing injected SQL for node "unit_test.my_dbt_project.bar.test_is_valid"
03:35:24  Using snowflake connection "unit_test.my_dbt_project.bar.test_is_valid"
03:35:24  On unit_test.my_dbt_project.bar.test_is_valid: /* {"app": "dbt", "dbt_version": "1.8.3", "profile_name": "all", "target_name": "sf-eu", "node_id": "unit_test.my_dbt_project.bar.test_is_valid"} */
create or replace temporary table analytics.dbt_jyeo.test_is_valid__dbt_tmp
         as
        (select * from (
        with __dbt__cte__foo as (

-- Fixture for foo
select 

        try_cast('1' as NUMBER(1,0))
     as id
) select id * 2 as id from __dbt__cte__foo
    ) as __dbt_sbq
    where false
    limit 0

        );
03:35:25  SQL status: SUCCESS 1 in 1.0 seconds
03:35:25  Using snowflake connection "unit_test.my_dbt_project.bar.test_is_valid"
03:35:25  On unit_test.my_dbt_project.bar.test_is_valid: /* {"app": "dbt", "dbt_version": "1.8.3", "profile_name": "all", "target_name": "sf-eu", "node_id": "unit_test.my_dbt_project.bar.test_is_valid"} */
describe table analytics.dbt_jyeo.test_is_valid__dbt_tmp
03:35:25  SQL status: SUCCESS 1 in 0.0 seconds
03:35:25  Writing runtime sql for node "unit_test.my_dbt_project.bar.test_is_valid"
03:35:25  Using snowflake connection "unit_test.my_dbt_project.bar.test_is_valid"
03:35:25  On unit_test.my_dbt_project.bar.test_is_valid: /* {"app": "dbt", "dbt_version": "1.8.3", "profile_name": "all", "target_name": "sf-eu", "node_id": "unit_test.my_dbt_project.bar.test_is_valid"} */
-- Build actual result given inputs
with dbt_internal_unit_test_actual as (
  select
    id, 'actual' as "actual_or_expected"
  from (
    with __dbt__cte__foo as (

-- Fixture for foo
select 

        try_cast('1' as NUMBER(1,0))
     as id
) select id * 2 as id from __dbt__cte__foo
  ) _dbt_internal_unit_test_actual
),
-- Build expected result
dbt_internal_unit_test_expected as (
  select
    id, 'expected' as "actual_or_expected"
  from (
    select 

        try_cast('2' as NUMBER(2,0))
     as id
  ) _dbt_internal_unit_test_expected
)
-- Union actual and expected results
select * from dbt_internal_unit_test_actual
union all
select * from dbt_internal_unit_test_expected
03:35:26  SQL status: SUCCESS 2 in 1.0 seconds
03:35:26  Applying DROP to: analytics.dbt_jyeo.test_is_valid__dbt_tmp
03:35:26  Using snowflake connection "unit_test.my_dbt_project.bar.test_is_valid"
03:35:26  On unit_test.my_dbt_project.bar.test_is_valid: /* {"app": "dbt", "dbt_version": "1.8.3", "profile_name": "all", "target_name": "sf-eu", "node_id": "unit_test.my_dbt_project.bar.test_is_valid"} */
drop table if exists analytics.dbt_jyeo.test_is_valid__dbt_tmp cascade
03:35:26  SQL status: SUCCESS 1 in 0.0 seconds
03:35:26  On unit_test.my_dbt_project.bar.test_is_valid: Close
03:35:27  2 of 3 PASS bar::test_is_valid ................................................. [PASS in 5.82s]
03:35:27  Finished running node unit_test.my_dbt_project.bar.test_is_valid
03:35:27  Began running node model.my_dbt_project.bar
03:35:27  3 of 3 START sql table model dbt_jyeo.bar ...................................... [RUN]
03:35:27  Re-using an available connection from the pool (formerly unit_test.my_dbt_project.bar.test_is_valid, now model.my_dbt_project.bar)
03:35:27  Began compiling node model.my_dbt_project.bar
03:35:27  Writing injected SQL for node "model.my_dbt_project.bar"
03:35:27  Began executing node model.my_dbt_project.bar
03:35:27  Writing runtime sql for node "model.my_dbt_project.bar"
03:35:27  Using snowflake connection "model.my_dbt_project.bar"
03:35:27  On model.my_dbt_project.bar: /* {"app": "dbt", "dbt_version": "1.8.3", "profile_name": "all", "target_name": "sf-eu", "node_id": "model.my_dbt_project.bar"} */
create or replace transient table analytics.dbt_jyeo.bar
         as
        (select id * 2 as id from analytics.dbt_jyeo.foo
        );
03:35:27  Opening a new connection, currently in state closed
03:35:32  SQL status: SUCCESS 1 in 5.0 seconds
03:35:32  On model.my_dbt_project.bar: Close
03:35:33  Sending event: {'category': 'dbt', 'action': 'run_model', 'label': '72b6ef92-cde4-4239-bb45-03aff682b5b5', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1546cc8d0>]}
03:35:33  3 of 3 OK created sql table model dbt_jyeo.bar ................................. [SUCCESS 1 in 5.85s]
(3) Set the param to true:

[screenshot]

(4) Build again:
$ dbt --debug build
03:36:50  Running 1 on-run-start hook
03:36:50  Using snowflake connection "master"
03:36:50  On master: /* {"app": "dbt", "dbt_version": "1.8.3", "profile_name": "all", "target_name": "sf-eu", "connection_name": "master"} */
show parameters like 'QUOTED_IDENTIFIERS_IGNORE_CASE' in account;
03:36:50  Opening a new connection, currently in state init
03:36:53  SQL status: SUCCESS 1 in 2.0 seconds
| key                  | value | default | level   | description          | type    |
| -------------------- | ----- | ------- | ------- | -------------------- | ------- |
| QUOTED_IDENTIFIER... | true  | false   | ACCOUNT | If true, the case... | BOOLEAN |
03:36:53  Writing injected SQL for node "operation.my_dbt_project.my_dbt_project-on-run-start-0"
03:36:53  1 of 1 START hook: my_dbt_project.on-run-start.0 ............................... [RUN]
03:36:53  1 of 1 OK hook: my_dbt_project.on-run-start.0 .................................. [OK in 0.00s]
03:36:53  
03:36:53  On master: Close
03:36:54  Concurrency: 1 threads (target='sf-eu')
03:36:54  
03:36:54  Began running node model.my_dbt_project.foo
03:36:54  1 of 3 START sql table model dbt_jyeo.foo ...................................... [RUN]
03:36:54  Re-using an available connection from the pool (formerly list_analytics, now model.my_dbt_project.foo)
03:36:54  Began compiling node model.my_dbt_project.foo
03:36:54  Writing injected SQL for node "model.my_dbt_project.foo"
03:36:54  Began executing node model.my_dbt_project.foo
03:36:54  Writing runtime sql for node "model.my_dbt_project.foo"
03:36:54  Using snowflake connection "model.my_dbt_project.foo"
03:36:54  On model.my_dbt_project.foo: /* {"app": "dbt", "dbt_version": "1.8.3", "profile_name": "all", "target_name": "sf-eu", "node_id": "model.my_dbt_project.foo"} */
create or replace transient table analytics.dbt_jyeo.foo
         as
        (select 1 id
        );
03:36:54  Opening a new connection, currently in state closed
03:36:58  SQL status: SUCCESS 1 in 4.0 seconds
03:36:58  On model.my_dbt_project.foo: Close
03:36:58  Sending event: {'category': 'dbt', 'action': 'run_model', 'label': '36b5e6c9-f0d6-4016-a8e8-e1737f793857', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x13ff9b650>]}
03:36:58  1 of 3 OK created sql table model dbt_jyeo.foo ................................. [SUCCESS 1 in 4.97s]
03:36:58  Finished running node model.my_dbt_project.foo
03:36:58  Began running node unit_test.my_dbt_project.bar.test_is_valid
03:36:58  2 of 3 START unit_test bar::test_is_valid ...................................... [RUN]
03:36:58  Re-using an available connection from the pool (formerly model.my_dbt_project.foo, now unit_test.my_dbt_project.bar.test_is_valid)
03:36:58  Began compiling node unit_test.my_dbt_project.bar.test_is_valid
03:36:58  Began executing node unit_test.my_dbt_project.bar.test_is_valid
03:36:59  Using snowflake connection "unit_test.my_dbt_project.bar.test_is_valid"
03:36:59  On unit_test.my_dbt_project.bar.test_is_valid: /* {"app": "dbt", "dbt_version": "1.8.3", "profile_name": "all", "target_name": "sf-eu", "node_id": "unit_test.my_dbt_project.bar.test_is_valid"} */
describe table analytics.dbt_jyeo.foo
03:36:59  Opening a new connection, currently in state closed
03:37:03  SQL status: SUCCESS 1 in 4.0 seconds
03:37:03  Writing injected SQL for node "unit_test.my_dbt_project.bar.test_is_valid"
03:37:03  Writing injected SQL for node "unit_test.my_dbt_project.bar.test_is_valid"
03:37:03  Using snowflake connection "unit_test.my_dbt_project.bar.test_is_valid"
03:37:03  On unit_test.my_dbt_project.bar.test_is_valid: /* {"app": "dbt", "dbt_version": "1.8.3", "profile_name": "all", "target_name": "sf-eu", "node_id": "unit_test.my_dbt_project.bar.test_is_valid"} */
create or replace temporary table analytics.dbt_jyeo.test_is_valid__dbt_tmp
         as
        (select * from (
        with __dbt__cte__foo as (

-- Fixture for foo
select 

        try_cast('1' as NUMBER(1,0))
     as id
) select id * 2 as id from __dbt__cte__foo
    ) as __dbt_sbq
    where false
    limit 0

        );
03:37:04  SQL status: SUCCESS 1 in 1.0 seconds
03:37:04  Using snowflake connection "unit_test.my_dbt_project.bar.test_is_valid"
03:37:04  On unit_test.my_dbt_project.bar.test_is_valid: /* {"app": "dbt", "dbt_version": "1.8.3", "profile_name": "all", "target_name": "sf-eu", "node_id": "unit_test.my_dbt_project.bar.test_is_valid"} */
describe table analytics.dbt_jyeo.test_is_valid__dbt_tmp
03:37:04  SQL status: SUCCESS 1 in 0.0 seconds
03:37:04  Writing runtime sql for node "unit_test.my_dbt_project.bar.test_is_valid"
03:37:04  Using snowflake connection "unit_test.my_dbt_project.bar.test_is_valid"
03:37:04  On unit_test.my_dbt_project.bar.test_is_valid: /* {"app": "dbt", "dbt_version": "1.8.3", "profile_name": "all", "target_name": "sf-eu", "node_id": "unit_test.my_dbt_project.bar.test_is_valid"} */
-- Build actual result given inputs
with dbt_internal_unit_test_actual as (
  select
    id, 'actual' as "actual_or_expected"
  from (
    with __dbt__cte__foo as (

-- Fixture for foo
select 

        try_cast('1' as NUMBER(1,0))
     as id
) select id * 2 as id from __dbt__cte__foo
  ) _dbt_internal_unit_test_actual
),
-- Build expected result
dbt_internal_unit_test_expected as (
  select
    id, 'expected' as "actual_or_expected"
  from (
    select 

        try_cast('2' as NUMBER(2,0))
     as id
  ) _dbt_internal_unit_test_expected
)
-- Union actual and expected results
select * from dbt_internal_unit_test_actual
union all
select * from dbt_internal_unit_test_expected
03:37:05  SQL status: SUCCESS 2 in 0.0 seconds
03:37:05  Applying DROP to: analytics.dbt_jyeo.test_is_valid__dbt_tmp
03:37:05  Using snowflake connection "unit_test.my_dbt_project.bar.test_is_valid"
03:37:05  On unit_test.my_dbt_project.bar.test_is_valid: /* {"app": "dbt", "dbt_version": "1.8.3", "profile_name": "all", "target_name": "sf-eu", "node_id": "unit_test.my_dbt_project.bar.test_is_valid"} */
drop table if exists analytics.dbt_jyeo.test_is_valid__dbt_tmp cascade
03:37:05  SQL status: SUCCESS 1 in 1.0 seconds
03:37:05  On unit_test.my_dbt_project.bar.test_is_valid: Close
03:37:06  Unhandled error while executing 
'actual_or_expected'
03:37:06  Traceback (most recent call last):
  File "/Users/jeremy/git/dbt-basic/venv_dbt_1.8.latest/lib/python3.11/site-packages/dbt/task/base.py", line 368, in safe_run
    result = self.compile_and_execute(manifest, ctx)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jeremy/git/dbt-basic/venv_dbt_1.8.latest/lib/python3.11/site-packages/dbt/task/base.py", line 314, in compile_and_execute
    result = self.run(ctx.node, manifest)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jeremy/git/dbt-basic/venv_dbt_1.8.latest/lib/python3.11/site-packages/dbt/task/base.py", line 415, in run
    return self.execute(compiled_node, manifest)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jeremy/git/dbt-basic/venv_dbt_1.8.latest/lib/python3.11/site-packages/dbt/task/test.py", line 264, in execute
    unit_test_node, unit_test_result = self.execute_unit_test(test, manifest)
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jeremy/git/dbt-basic/venv_dbt_1.8.latest/lib/python3.11/site-packages/dbt/task/test.py", line 237, in execute_unit_test
    actual = self._get_unit_test_agate_table(table, "actual")
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jeremy/git/dbt-basic/venv_dbt_1.8.latest/lib/python3.11/site-packages/dbt/task/test.py", line 336, in _get_unit_test_agate_table
    unit_test_table = result_table.where(
                      ^^^^^^^^^^^^^^^^^^^
  File "/Users/jeremy/git/dbt-basic/venv_dbt_1.8.latest/lib/python3.11/site-packages/agate/table/where.py", line 21, in where
    if test(row):
       ^^^^^^^^^
  File "/Users/jeremy/git/dbt-basic/venv_dbt_1.8.latest/lib/python3.11/site-packages/dbt/task/test.py", line 337, in <lambda>
    lambda row: row["actual_or_expected"] == actual_or_expected
                ~~~^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jeremy/git/dbt-basic/venv_dbt_1.8.latest/lib/python3.11/site-packages/agate/mapped_sequence.py", line 88, in __getitem__
    return self.dict()[key]
           ~~~~~~~~~~~^^^^^
KeyError: 'actual_or_expected'

03:37:06  2 of 3 ERROR bar::test_is_valid ................................................ [ERROR in 7.74s]
03:37:06  Finished running node unit_test.my_dbt_project.bar.test_is_valid
03:37:06  Began running node model.my_dbt_project.bar
03:37:06  3 of 3 SKIP relation dbt_jyeo.bar .............................................. [SKIP]
03:37:06  Finished running node model.my_dbt_project.bar
03:37:06  Connection 'master' was properly closed.
03:37:06  Connection 'unit_test.my_dbt_project.bar.test_is_valid' was properly closed.
03:37:06  Connection 'list_analytics_dbt_jyeo' was properly closed.
03:37:06  
03:37:06  Finished running 2 table models, 1 unit test, 1 project hook in 0 hours 0 minutes and 23.85 seconds (23.85s).
03:37:06  Command end result
03:37:06  
03:37:06  Completed with 1 error and 0 warnings:
03:37:06  
03:37:06    'actual_or_expected'
03:37:06  
03:37:06  Done. PASS=1 WARN=0 ERROR=1 SKIP=1 TOTAL=3

The unit-test retrieval query is the same, but the resulting "actual_or_expected" column name comes back lower- or uppercased depending on the param.

[screenshot]

[screenshot]

When the param is on, the returned result set has an uppercased "ACTUAL_OR_EXPECTED" column (the explicit quoting that would preserve the lowercase is ignored), so dbt cannot retrieve it by its lowercase key "actual_or_expected":

https://github.com/dbt-labs/dbt-core/blob/88b8b10df170d229b442c520209681bb9348065f/core/dbt/task/test.py#L342
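To make the column-case behaviour concrete, here is a rough, simplified model of Snowflake's identifier resolution (an illustration only, not the real implementation; `resolve_identifier` is an invented name): unquoted identifiers always fold to uppercase, and quoted identifiers keep their case only while the parameter is false.

```python
def resolve_identifier(name: str, quoted: bool, ignore_case: bool) -> str:
    """Simplified model of how Snowflake stores an identifier.

    Unquoted identifiers fold to uppercase. Quoted identifiers keep their
    case, unless QUOTED_IDENTIFIERS_IGNORE_CASE is true, in which case the
    quoting is ignored and they fold to uppercase too.
    """
    if not quoted or ignore_case:
        return name.upper()
    return name

# dbt's generated SQL quotes the alias: 'actual' as "actual_or_expected"
print(resolve_identifier("actual_or_expected", quoted=True, ignore_case=False))
# -> actual_or_expected (dbt's lowercase lookup succeeds)
print(resolve_identifier("actual_or_expected", quoted=True, ignore_case=True))
# -> ACTUAL_OR_EXPECTED (the lookup by "actual_or_expected" raises KeyError)
```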

Workaround:

Turn off the param in Snowflake. This applies to users who don't have direct access to edit the dbt-core source code like ivan above (i.e. pretty much all dbt Cloud users).

Alternatively, see the workaround below, where we override the default built-in macro that generates the unit test SQL.

ivanroquev commented 4 months ago

Hello, in my case the param is TRUE. Thanks for the information; I am not an admin and I can't set it to false.

+------------------------------+-----+-------+-------+--------------------------------------------------+-------+
|key                           |value|default|level  |description                                       |type   |
+------------------------------+-----+-------+-------+--------------------------------------------------+-------+
|QUOTED_IDENTIFIERS_IGNORE_CASE|true |false  |ACCOUNT|If true, the case of quoted identifiers is ignored|BOOLEAN|
+------------------------------+-----+-------+-------+--------------------------------------------------+-------+

i-dalbo commented 4 months ago

I was just testing the unit test feature, and I stumbled on the same error:

10:07:38  1 of 1 START unit_test redacted_model_name  [RUN]
10:07:40  Unhandled error while executing
'actual_or_expected'
10:07:40  1 of 1 ERROR redacted_model_name::redacted_unit_test_name [ERROR in 2.85s]
10:07:41  Finished running 1 unit test, 3 project hooks in 0 hours 0 minutes and 8.33 seconds (8.33s).
10:07:42
10:07:42  Completed with 1 error and 0 warnings:
10:07:42
10:07:42    'actual_or_expected'

This is the system config

- OS: Linux
- Python: 3.9
- dbt: 1.8.3
- dbt-snowflake: 1.8.3

I don't want and don't intend to change any account parameters for the sake of unit testing without first evaluating the side effects, so I would expect this to be solved on the dbt side. Is there any timeline for fixing this bug?

sqnico commented 4 months ago

I tried a workaround by altering the QUOTED_IDENTIFIERS_IGNORE_CASE parameter at the session level. It did not work either.

#dbt_project.yml

on-run-start:
  [
    "ALTER SESSION SET QUOTED_IDENTIFIERS_IGNORE_CASE = FALSE",
    "{{ check_param() }}",
  ]
-- macros/check_params.sql

{% macro check_param() %}
    {% set r = run_query("show parameters like 'QUOTED_IDENTIFIERS_IGNORE_CASE';") %}
    {% if execute %} {% do r.print_table() %} {% endif %}
{% endmacro %}

Log output (same error: Unhandled error while executing 'actual_or_expected'):

18:04:43  Running 2 on-run-start hooks
18:04:43  1 of 2 START hook: hardware_dbt.on-run-start.0 ................................. [RUN]
18:04:43  1 of 2 OK hook: hardware_dbt.on-run-start.0 .................................... [SUCCESS 1 in 0.48s]
| key                  | value | default | level   | description          | type    |
| -------------------- | ----- | ------- | ------- | -------------------- | ------- |
| QUOTED_IDENTIFIER... | false | false   | SESSION | If true, the case... | BOOLEAN |
18:04:44  2 of 2 START hook: hardware_dbt.on-run-start.1 ................................. [RUN]
18:04:44  2 of 2 OK hook: hardware_dbt.on-run-start.1 .................................... [OK in 0.00s]
18:04:44  
18:04:44  Concurrency: 4 threads (target='dev')
18:04:44  
18:04:44  1 of 1 START unit_test fact_orders_hourly::test_is_scac_code_valid ............. [RUN]
18:04:47  Unhandled error while executing 
'actual_or_expected'
18:04:47  1 of 1 ERROR fact_orders_hourly::test_is_scac_code_valid ....................... [ERROR in 3.45s]
18:04:47  
18:04:47  Running 1 on-run-end hook
18:04:48  1 of 1 START hook: hardware_dbt.on-run-end.0 ................................... [RUN]
18:04:48  1 of 1 OK hook: hardware_dbt.on-run-end.0 ...................................... [OK in 0.00s]
18:04:48  
18:04:49  
18:04:49  Finished running 1 unit test, 3 project hooks in 0 hours 0 minutes and 21.64 seconds (21.64s).
18:04:49  
18:04:49  Completed with 1 error and 0 warnings:
18:04:49  
18:04:49    'actual_or_expected'
18:04:49  
18:04:49  Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1

Config:

- OS: MacOS
- Python: 3.10.10
- dbt-adapters: 1.3.2
- dbt-common: 1.5.0
- dbt-core: 1.8.3
- dbt-snowflake: 1.8.3
jeremyyeo commented 4 months ago

@sqnico I think that's because of dbt's multi-threading / connection pooling mechanism. When the unit test runs, it's in a different session from the one that ran the ALTER SESSION statement. You can probably see that in the session ID column of Snowflake's Query History.
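That session isolation can be illustrated with a toy model (purely hypothetical classes, not dbt or Snowflake code): a session parameter set on one pooled connection never reaches another.

```python
# Toy illustration of why the on-run-start ALTER SESSION did not help:
# each pooled connection carries its own session-level parameters.
class FakeSession:
    def __init__(self, account_defaults):
        # Sessions start from the account-level defaults.
        self.params = dict(account_defaults)

    def alter_session(self, key, value):
        # Affects only this session, never its siblings.
        self.params[key] = value

account = {"QUOTED_IDENTIFIERS_IGNORE_CASE": True}

hook_session = FakeSession(account)       # runs the on-run-start hook
unit_test_session = FakeSession(account)  # later runs the unit test SQL

hook_session.alter_session("QUOTED_IDENTIFIERS_IGNORE_CASE", False)

print(hook_session.params["QUOTED_IDENTIFIERS_IGNORE_CASE"])       # False
print(unit_test_session.params["QUOTED_IDENTIFIERS_IGNORE_CASE"])  # True
```

This is why the macro override below works: the ALTER SESSION is emitted inside the same SQL batch as the unit test query, so it runs on the same connection.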

sqnico commented 4 months ago

@jeremyyeo, thank you for your message. I checked, and you are correct (see below).

[screenshot]

jeremyyeo commented 4 months ago

@sqnico can you add this macro to your project and try again?

-- macros/get_unit_test.sql

{% macro get_unit_test_sql(main_sql, expected_fixture_sql, expected_column_names) -%}
-- For accounts that have this param set to true, we need to make it false for the query.
-- https://github.com/dbt-labs/dbt-snowflake/issues/1160
alter session set QUOTED_IDENTIFIERS_IGNORE_CASE = false;

-- Build actual result given inputs
with dbt_internal_unit_test_actual as (
  select
    {% for expected_column_name in expected_column_names %}{{expected_column_name}}{% if not loop.last -%},{% endif %}{%- endfor -%}, {{ dbt.string_literal("actual") }} as {{ adapter.quote("actual_or_expected") }}
  from (
    {{ main_sql }}
  ) _dbt_internal_unit_test_actual
),
-- Build expected result
dbt_internal_unit_test_expected as (
  select
    {% for expected_column_name in expected_column_names %}{{expected_column_name}}{% if not loop.last -%}, {% endif %}{%- endfor -%}, {{ dbt.string_literal("expected") }} as {{ adapter.quote("actual_or_expected") }}
  from (
    {{ expected_fixture_sql }}
  ) _dbt_internal_unit_test_expected
)
-- Union actual and expected results
select * from dbt_internal_unit_test_actual
union all
select * from dbt_internal_unit_test_expected
{%- endmacro %}

I'm overriding the default macro from https://github.com/dbt-labs/dbt-adapters/blob/4b3806818f3831bbd0206a6b22fdcb4fe2a505c5/dbt/include/global_project/macros/materializations/tests/helpers.sql#L23 and adding the session-parameter ALTER inside it, so it runs in the same session as the unit test query.

sqnico commented 4 months ago

@jeremyyeo Amazing, this is a great workaround! It works! Thank you so much for your help

21:50:45  Running with dbt=1.8.3
21:50:46  Registered adapter: snowflake=1.8.3
21:50:47  Found 109 models, 2 snapshots, 8 seeds, 1 operation, 470 data tests, 93 sources, 22 exposures, 1008 macros, 1 unit test
21:50:47  
21:50:48  Concurrency: 4 threads (target='dev')
21:50:48  
21:50:48  1 of 1 START unit_test fact_orders_hourly::test_is_scac_code_valid ............. [RUN]
21:50:51  1 of 1 PASS fact_orders_hourly::test_is_scac_code_valid ........................ [PASS in 3.08s]
21:50:51  
21:50:51  Running 1 on-run-end hook
21:50:52  1 of 1 START hook: hardware_dbt.on-run-end.0 ................................... [RUN]
21:50:52  1 of 1 OK hook: hardware_dbt.on-run-end.0 ...................................... [OK in 0.00s]
21:50:52  
21:50:52  
21:50:52  Finished running 1 unit test, 1 project hook in 0 hours 0 minutes and 5.30 seconds (5.30s).
21:50:52  
21:50:52  Completed successfully
21:50:52  
21:50:52  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
ChenyuLInx commented 3 months ago

@aranke mentioned that https://github.com/dbt-labs/dbt-snowflake/pull/1107 is related to fixing this.

aranke commented 3 months ago

I reached out to @dbt-labs/adapters internally to see if they have a preferred solution here.

aranke commented 3 months ago

From an offline sync with @mikealfare @colin-rogers-dbt, we determined that this is a Snowflake issue, so we transferred it over.

colin-rogers-dbt commented 2 months ago

Will address as part of #1181