dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
https://getdbt.com
Apache License 2.0
9.92k stars 1.63k forks source link

[Bug] Runtime Error in unit_test when using versioned models #10623

Open marcilj opened 2 months ago

marcilj commented 2 months ago

Is this a new bug in dbt-core?

Current Behavior

The problem I'm having

DBT Unit Test is not able to find the version 2 of my model.

Let's take an example of a model I'm trying to test.

with

source as (

    select
        *
    from {{ ref('model') }}

)

, model_with_version as (

    select
        col1
        , col2

    from {{ ref('model_with_version', v=2) }}

)

select * from source

(This is an example, I know model_with_version is not selected in any way in this query)

With this unit test

unit_tests:
  - name: filter_test
    model: my_model_name
    overrides:
      macros:
        is_incremental: false
    given:
      - input: ref('model_with_version')
        rows:
          - {col1: test, col2: test2}
          - {col1: test2, col2: test2}
    expect:
      rows:
          - {col1: test, col2: test2}

If I run the command dbt test --select "my_model_name,test_type:unit" dbt will return

  Unit_Test 'filter_test` depends on a node named `model_with_version` with version '2' which was not found.

I've make sure the version 2 exist. I can easily run dbt run -m model_with_version.v2 and it run successfully.

Note that the latest version is currently version 1, in my YAML file.

I've also tried to add the pin version to the ref in the unit test yml file. Like this - input: ref('model_with_version', v=2) But ended up with this error.

  An error occurred during execution of unit test 'filter_test'. There may be an error in the unit test definition: check the data types.
   Database Error
    002003 (42S02): SQL compilation error:
    Object '__DBT__CTE__MODEL_WITH_VERSION_V2' does not exist or not authorized.

The context of why I'm trying to do this

I'm trying to configure a unit test for a model that uses DBT version.

What I've already tried

  1. Configuring the models version in the input.
  2. Trying without version

Some example code or error messages

  Unit_Test 'filter_test` depends on a node named `model_with_version` with version '2' which was not found.
  An error occurred during execution of unit test 'filter_test'. There may be an error in the unit test definition: check the data types.
   Database Error
    002003 (42S02): SQL compilation error:
    Object '__DBT__CTE__MODEL_WITH_VERSION_V2' does not exist or not authorized.

More Informations

If I configure the input as - input: ref('model_with_version') then in the query I can see that the query generate the static inputs

__dbt__cte__model_with_version

but my code want to select from

    select
        address
        , validator_commission_rate

    from __dbt__cte__model_with_version_v2

Which fails.

Expected Behavior

I shouldn't receive any errors as the __DBT__CTE__STG_FIGMENT_VALIDATOR_REPOSITORY_V2 CTE should be created instead of __DBT__CTE__STG_FIGMENT_VALIDATOR_REPOSITORY

Steps To Reproduce

  1. Create 2 version of a single models.
  2. In my scenario the version 1 is the latest_version.
  3. In my scenario v1 has an alias with the same name, but WITHOUT the version ex alias:model_with_version. (not alias:model_with_version_v1)
  4. Version 2 is called model_with_version_v2.
  5. Create a yml for the unit_tests.
  6. add the required inputs. In my situation I have 2. (One without version and one referencing version 2 of my model)
  7. The input is configure like this - input: ref('model_with_version', v='2')
  8. This will generate the error
    An error occurred during execution of unit test 'my_test'. There may be an error in the unit test definition: check the data types.
    Database Error
    002003 (42S02): SQL compilation error:
    Object '__DBT__CTE__MODEL_WITH_VERSION_V2' does not exist or not authorized.
  9. The reason behind this error, when you look at the SQL query that was runned is that the CTE that DBT creates to inject the inputs values is name __DBT__CTE__MODEL_WITH_VERSION, but the one in the actual query that select * from has the version number after it __DBT__CTE__MODEL_WITH_VERSION_V2.
  10. Since select * from __DBT__CTE__MODEL_WITH_VERSION_V2 doesn't exist because the CTE that was created is called __DBT__CTE__MODEL_WITH_VERSION this break the test.
  11. I've tried with DBT 1.8.0 and 1.8.5 and got the same error.

Relevant log output

An error occurred during execution of unit test 'my_test'. There may be an error in the unit test definition: check the data types.
   Database Error
    002003 (42S02): SQL compilation error:
    Object '__DBT__CTE__MODEL_WITH_VERSION_V2' does not exist or not authorized.

Environment

- OS: MacOS (Intel) Sonoma 14.6.1. 
- Python:Python 3.9.19
- dbt:1.8.0 and 1.8.5

Which database adapter are you using with dbt?

snowflake

Additional Context

No response

dbeatty10 commented 2 months ago

Thanks for reaching out @marcilj !

Does this cover the scenario that you're describing? https://github.com/dbt-labs/dbt-core/issues/10528#issuecomment-2277939001

marcilj commented 2 months ago

Yes @dbeatty10 I think the example you provided covered the scenario I'm describing except that in my scenario the latest_version: 1. But I tried changing it and got the same error.

There seems to be a little difference from what I'm experiencing vs what's describe in the issue #10528.

The error I get is linked to the generated SQL fixture code from the input which doesn't include the version prefix _V2 when generating the inputs values, but does include the version prefix in the select statement.

with

 __dbt__cte__my_model as ( -- (Doesn't include V2)

-- Fixture for __dbt__cte__my_model

 )
 , my_model_cte as (

    select
      col1

    from __dbt__cte__my_model_v2 -- (Inlcudes v2)

)

Hope that helps.

Thank you very much for the quick response btw.

dbeatty10 commented 2 months ago

Thanks @marcilj, that does help 👍

See below for a slightly simplified reproducible example ("reprex"). Could you try it out and verify that it replicates the issue you are seeing?

And then can you try adding - input: ref('model_with_version', v=2) to models/_unit_tests.yml? Specifying the version worked for me in this particular example, so I'm curious if it works for you or not.

### Reprex Create all of these files: `models/_unit_tests.yml` ```yaml unit_tests: - name: filter_test model: my_model_name given: - input: ref('model_with_version', v=2) rows: - {id: 2} expect: rows: - {id: 2} ``` `models/_models.yml` ```yaml models: - name: model_with_version latest_version: 1 versions: - v: 1 columns: - include: all exclude: [added_column] - v: 2 columns: - include: all ``` `models/model_with_version_v1.sql` ```sql select 1 as id ``` `models/model_with_version_v2.sql` ```sql select 1 as id, 2 as added_column ``` `models/my_model_name.sql` ```sql select * from {{ ref('model_with_version', v=2) }} ``` Then run these commands: ```shell dbt run --empty dbt test -s filter_test ``` See this error output: ``` $ dbt test -s filter_test 17:03:52 Running with dbt=1.8.0 17:03:52 Registered adapter: duckdb=1.8.3 17:03:53 Found 3 models, 410 macros, 1 unit test 17:03:53 17:03:53 Concurrency: 1 threads (target='dev') 17:03:53 17:03:53 1 of 1 START unit_test my_model_name::filter_test .............................. [RUN] 17:03:53 While compiling 'filter_test': Found an unpinned reference to versioned model 'model_with_version' in project 'my_project'. Resolving to latest version: model_with_version.v1 A prerelease version 2 is available. It has not yet been marked 'latest' by its maintainer. When that happens, this reference will resolve to model_with_version.v2 instead. Try out v2: {{ ref('my_project', 'model_with_version', v='2') }} Pin to v1: {{ ref('my_project', 'model_with_version', v='1') }} 17:03:53 1 of 1 ERROR my_model_name::filter_test ........................................ [ERROR in 0.04s] 17:03:53 17:03:53 Finished running 1 unit test in 0 hours 0 minutes and 0.24 seconds (0.24s). 17:03:53 17:03:53 Completed with 1 error and 0 warnings: 17:03:53 17:03:53 Compilation Error in unit_test filter_test (models/_unit_tests.yml) Unit_Test 'unit_test.my_project.my_model_name.filter_test' (models/_unit_tests.yml) depends on a node named 'model_with_version' with version '2' which was not found 17:03:53 17:03:53 Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1 ```
marcilj commented 2 months ago

Hey @dbeatty10 I've reproduce this example locally, and here's what I got

Example with - input: ref('model_with_version', v=2)

2212420: my_dbt (main*) $ dbt test -s filter_test
13:42:09  Running with dbt=1.8.0
13:42:10  Registered adapter: snowflake=1.8.3
13:42:13  Found 723 models, 203 data tests, 269 sources, 923 macros, 2 groups, 1 unit test
13:42:13  
13:42:36  Concurrency: 1 threads (target='dev')
13:42:36  
13:42:36  1 of 1 START unit_test my_model_name::filter_test .............................. [RUN]
13:42:36  1 of 1 ERROR my_model_name::filter_test ........................................ [ERROR in 0.48s]
13:42:36  
13:42:36  Finished running 1 unit test in 0 hours 0 minutes and 23.83 seconds (23.83s).
13:42:37  
13:42:37  Completed with 1 error and 0 warnings:
13:42:37  
13:42:37    Runtime Error in unit_test filter_test (models/unit_test.yml)
  An error occurred during execution of unit test 'filter_test'. There may be an error in the unit test definition: check the data types.
   Database Error
    002003 (42S02): SQL compilation error:
    Object '__DBT__CTE__MODEL_WITH_VERSION_V2' does not exist or not authorized.
13:42:37  
13:42:37  Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1

Also here's the query on Snowflake that was generated and failed.

create or replace temporary table DBT_DB.DBT_JACOBMARCIL_None.filter_test__dbt_tmp
         as
        (select * from (
        with __dbt__cte__model_with_version as (

-- Fixture for model_with_version
select 

        try_cast('2' as NUMBER(1,0))
     as id, try_cast(null as NUMBER(1,0)) as added_column
) select *
from __dbt__cte__model_with_version_v2
    ) as __dbt_sbq
    where false
    limit 0

        );

Example with - input: ref('model_with_version')

my_dbt (main*) $ dbt test -s filter_test
13:46:53  Running with dbt=1.8.0
13:46:54  Registered adapter: snowflake=1.8.3
13:46:57  Found 723 models, 203 data tests, 269 sources, 923 macros, 2 groups, 1 unit test
13:46:57  
13:47:23  Concurrency: 1 threads (target='dev')
13:47:23  
13:47:23  1 of 1 START unit_test my_model_name::filter_test .............................. [RUN]
13:47:23  While compiling 'filter_test':
Found an unpinned reference to versioned model 'model_with_version' in project 'data_team_dbt'.
Resolving to latest version: model_with_version.v1
A prerelease version 2 is available. It has not yet been marked 'latest' by its maintainer.
When that happens, this reference will resolve to model_with_version.v2 instead.

  Try out v2: {{ ref('data_team_dbt', 'model_with_version', v='2') }}
  Pin to  v1: {{ ref('data_team_dbt', 'model_with_version', v='1') }}

13:47:23  1 of 1 ERROR my_model_name::filter_test ........................................ [ERROR in 0.05s]
13:47:23  
13:47:23  Finished running 1 unit test in 0 hours 0 minutes and 25.45 seconds (25.45s).
13:47:24  
13:47:24  Completed with 1 error and 0 warnings:
13:47:24  
13:47:24    Compilation Error in unit_test filter_test (models/unit_test.yml)
  Unit_Test 'unit_test.data_team_dbt.my_model_name.filter_test' (models/agoric/core/rewards/unit_test.yml) depends on a node named 'model_with_version' with version '2' which was not found
13:47:24  
13:47:24  Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1

Final test

I've also tied to change all reference from ref('model_with_version', v=2) to ref('model_with_version', v='2') but got the __DBT__CTE__MODEL_WITH_VERSION_V2 does not exist or not authorized.

More info

Tested with

dbt --version
Core:
  - installed: 1.8.0
  - latest:    1.8.6 - Update available!

  Your version of dbt-core is out of date!
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Plugins:
  - snowflake: 1.8.3 - Up to date!
marcilj commented 2 months ago

Update.

Did upgrade to

dbt --version
Core:
  - installed: 1.8.6
  - latest:    1.8.6 - Up to date!

Plugins:
  - snowflake: 1.8.3 - Up to date!

And got a successful runs for inputs

and - input: ref('model_with_version') gives me the following error, which I think can be expected from DBT.

dbt test -s filter_test
14:00:44  Running with dbt=1.8.6
14:00:45  Registered adapter: snowflake=1.8.3
14:00:48  Found 723 models, 203 data tests, 269 sources, 923 macros, 2 groups, 1 unit test
14:00:48  
14:01:23  Concurrency: 1 threads (target='dev')
14:01:23  
14:01:23  1 of 1 START unit_test my_model_name::filter_test .............................. [RUN]
14:01:23  While compiling 'filter_test':
Found an unpinned reference to versioned model 'model_with_version' in project 'data_team_dbt'.
Resolving to latest version: model_with_version.v1
A prerelease version 2 is available. It has not yet been marked 'latest' by its maintainer.
When that happens, this reference will resolve to model_with_version.v2 instead.

  Try out v2: {{ ref('data_team_dbt', 'model_with_version', v='2') }}
  Pin to  v1: {{ ref('data_team_dbt', 'model_with_version', v='1') }}

14:01:23  1 of 1 ERROR my_model_name::filter_test ........................................ [ERROR in 0.05s]
14:01:23  
14:01:23  Finished running 1 unit test in 0 hours 0 minutes and 35.05 seconds (35.05s).
14:01:24  
14:01:24  Completed with 1 error and 0 warnings:
14:01:24  
14:01:24    Compilation Error in unit_test filter_test (modelsunit_test.yml)
  Unit_Test 'unit_test.data_team_dbt.my_model_name.filter_test' (models/unit_test.yml) depends on a node named 'model_with_version' with version '2' which was not found
14:01:24  
14:01:24  Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1

So it seems like upgrading to 1.8.6 resolved the issue.

Last test. Updagred to 1.8.5 failed

Seems like the issue was resolved from 1.8.5 to 1.8.6.

Running 1.8.5 gives me the same error as before, with the CTE Fixture not having the correct suffixe V2

dbt test -s filter_test                      
14:03:41  Running with dbt=1.8.5
14:03:43  Registered adapter: snowflake=1.8.3
14:03:43  Unable to do partial parsing because of a version mismatch
14:03:58  Found 723 models, 203 data tests, 269 sources, 923 macros, 2 groups, 1 unit test
14:03:58  
14:04:31  Concurrency: 1 threads (target='dev')
14:04:31  
14:04:31  1 of 1 START unit_test my_model_name::filter_test .............................. [RUN]
14:04:31  1 of 1 ERROR my_model_name::filter_test ........................................ [ERROR in 0.53s]
14:04:31  
14:04:31  Finished running 1 unit test in 0 hours 0 minutes and 33.13 seconds (33.13s).
14:04:32  
14:04:32  Completed with 1 error and 0 warnings:
14:04:32  
14:04:32    Runtime Error in unit_test filter_test (models/unit_test.yml)
  An error occurred during execution of unit test 'filter_test'. There may be an error in the unit test definition: check the data types.
   Database Error
    002003 (42S02): SQL compilation error:
    Object '__DBT__CTE__MODEL_WITH_VERSION_V2' does not exist or not authorized.
14:04:32  
14:04:32  Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1

Well glad that this was resolved by only upgrading the version to 1.8.6.

Thank for you help @dbeatty10

graciegoheen commented 4 days ago

Sounds like this may be fixed and we can close this out?