dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
https://getdbt.com
Apache License 2.0

[Feature] unit testing: dbt should tell me why it couldn't get columns of this (the model doesn't yet exist) on incremental models #10844

Open pettersoderlund opened 1 month ago

pettersoderlund commented 1 month ago

Is this your first time submitting a feature request?

Describe the feature

This suggestion is very similar to https://github.com/dbt-labs/dbt-core/issues/10014, with the difference that it is specifically about incremental models where the model itself has not yet been built.

I tried to write a unit test for an incremental model that exercises the model's incremental logic, with is_incremental: true overridden and the current state of the model mocked with - input: this.
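For illustration, a unit test of this shape might look like the sketch below. The test and model names are taken from the log output in this issue. The column names (id, updated_at), the upstream model (upstream_events), and the assumption that the model only selects rows newer than what is already in this are hypothetical and not from my actual project.

```yaml
# Hypothetical sketch: column names and the upstream model are assumptions.
unit_tests:
  - name: my_model_incremental_mode
    model: my_model
    overrides:
      macros:
        # Force the incremental branch of the model to be exercised
        is_incremental: true
    given:
      - input: ref('upstream_events')   # hypothetical upstream model
        rows:
          - {id: 1, updated_at: "2024-01-01"}
          - {id: 2, updated_at: "2024-01-02"}
      - input: this                     # mocked current state of my_model
        rows:
          - {id: 1, updated_at: "2024-01-01"}
    expect:
      # Assumes the model only picks up rows newer than those already in `this`
      rows:
        - {id: 2, updated_at: "2024-01-02"}
```

The - input: this entry is the part that makes dbt look up the columns of the model's own relation.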

If the model has never been built in the current profile and I use dbt build, the test fails because unit tests run before the models run. I therefore get the error Not able to get columns for unit test... because the relation doesn't exist.

vscode ➜ /workspaces/projectx/integration_tests (main) $ dbt test -s my_model,test_type:unit
12:17:16  Running with dbt=1.8.1
12:17:18  Registered adapter: bigquery=1.8.1
12:17:19  Found 52 models, 1 operation, 14 data tests, 1 source, 653 macros, 1 unit test
12:17:19  
12:17:19  Concurrency: 10 threads (target='dev')
12:17:19  
12:17:19  1 of 1 START unit_test my_model::my_model_incremental_mode ............... [RUN]
12:17:19  1 of 1 ERROR my_model::my_model_incremental_mode ......................... [ERROR in 0.31s]
12:17:19  
12:17:19  Running 1 on-run-end hook
12:17:20  1 of 1 START hook: projectx_integration_tests.on-run-end.0 ............. [RUN]
12:17:20  1 of 1 OK hook: projectx_integration_tests.on-run-end.0 ................ [OK in 0.00s]
12:17:20  
12:17:20  
12:17:20  Finished running 1 unit test, 1 project hook in 0 hours 0 minutes and 0.89 seconds (0.89s).
12:17:20  
12:17:20  Completed with 1 error and 0 warnings:
12:17:20  
12:17:20    Compilation Error in model my_model (models/my_model/string_values/my_model.yml)
  Not able to get columns for unit test 'my_model' from relation `gcpproject`.`dataset`.`my_model` because the relation doesn't exist

  > in macro get_fixture_sql (macros/unit_test_sql/get_fixture_sql.sql)
  > called by model my_model (models/my_model/string_values/my_model.yml)
12:17:20  
12:17:20  Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1

I think this will be confusing for users who check out my repository and build it from scratch. It might also be a problem in CI pipelines where the model has never been built before.

Describe alternatives you've considered

I have considered three solutions to this:

  1. A better error message. It could explain that this is an incremental model that must itself be built before the test can run (to get the columns of this). The message could also suggest running dbt run --empty to create the schemas needed for the test.
  2. If this happens, automatically trigger run -s my_model --empty and then run the unit test.
  3. Specifying column data types in schema.yml might remove the need to build the model in order to get the columns and their data types.
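To make option 3 concrete, the kind of declaration it refers to might look like the sketch below. The column names and the BigQuery data types are hypothetical. Whether dbt would consult these declarations when building fixtures for this is what the option proposes, not something confirmed to work today.

```yaml
# Hypothetical sketch: column names and types are assumptions for illustration.
models:
  - name: my_model
    columns:
      - name: id
        data_type: int64
      - name: updated_at
        data_type: timestamp
```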

Who will this benefit?

Users of the unit testing functionality on incremental models

Are you interested in contributing this feature?

No response

Anything else?

No response

sarahjryan commented 4 days ago

This would be amazing. Obviously I want tests to run in advance of the build to check that the code works, and having it complain or require extra steps to get this working for incremental models is a real PIA.