dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
https://getdbt.com
Apache License 2.0
9.83k stars · 1.62k forks

[CT-3398] unit tests should support enabled #9109

Open graciegoheen opened 11 months ago

graciegoheen commented 11 months ago

Housekeeping

Short description

Folks may want to turn unit tests on/off.

In order to do configs we'd have to have model config objects for unit test definitions, create a project object unit_tests dictionary, and merge configs from dbt_project and each individual unit test.
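The merge described above can be sketched in a few lines. This is an illustrative sketch only — the function and variable names are hypothetical, not dbt-core internals — showing the intended precedence: an individual unit test's `config:` block overrides the resource-path config from `dbt_project.yml`, and `enabled` defaults to true when neither level sets it.

```python
# Hypothetical sketch of config precedence for unit tests (names are
# illustrative, not dbt-core internals).

def merge_unit_test_config(project_config: dict, unit_test_config: dict) -> dict:
    """Merge project-level defaults with a unit test's own config block.

    The unit test's own keys win; anything unset falls back to the
    dbt_project.yml resource-path config, and `enabled` defaults to True
    when neither level sets it.
    """
    merged = {"enabled": True}          # global default
    merged.update(project_config)       # dbt_project.yml resource-path config
    merged.update(unit_test_config)     # the unit test's own `config:` block
    return merged

project_level = {"enabled": True, "tags": ["nightly"]}
test_level = {"enabled": False, "meta": {"owner": "@doug"}}

print(merge_unit_test_config(project_level, test_level))
# {'enabled': False, 'tags': ['nightly'], 'meta': {'owner': '@doug'}}
```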

Acceptance criteria

https://docs.getdbt.com/reference/resource-configs/enabled

Unit tests should support the enabled, meta, and tags configs:

By resource path in dbt_project.yml:

```yaml
unit-tests:
  <resource-path>:
    +enabled: ...
    +meta: ...
    +tags: ...
```

Or in unit test definition yml:

```yaml
unit-tests:
  - name: test_my_model
    model: my_model
    config:
      enabled: false
      meta:
        owner: "@doug"
      tags: my_favorite_unit_tests
    given:
      - input: ref('my_model_a')
        format: csv
        rows:
          - {id: 1, name: gerda}
          - {id: 2, name: michelle}
    ...
```

We already support meta and tags, but not yet enabled.

Impact to Other Teams

none

Will backports be required?

no

Context

No response

graciegoheen commented 11 months ago

Notes from estimation:

graciegoheen commented 11 months ago

If you disable unit tests, they won't show up in the manifest.

dbeatty10 commented 5 months ago

enabled can be configured for all the other resource types that I tried out... except for unit tests:

Support was checked per resource type across `dbt_project.yml`, properties YAML, and the `config()` macro:

- models
- snapshots
- seeds
- sources
- exposures
- semantic models
- analyses (#8061)
- singular data tests (#9005)
- generic data tests
- unit tests (#9109)

Legend:

- ✅ Supported
- ❌ Not supported but could be
- ➖ Not supported and can't be

### Example files

`models/metricflow_time_spine.sql`

```sql
{{ config(materialized='table') }}

select cast({{ dbt.string_literal("2000-01-01") }} as date) as date_day
```

Note: singular data tests can't be disabled within a YAML file (like `models/_properties.yml`), but either `dbt_project.yml` or `config()` within the SQL file are options.

`tests/singular_data_test.sql`

```sql
-- { { config(enabled=false) }}
select * from {{ ref('metricflow_time_spine') }} where 0=1
```

`snapshots/my_snapshot.sql`

```sql
{% snapshot my_snapshot %}

{{
    config(
      target_database=target.database,
      target_schema=target.schema,
      unique_key='id',
      strategy='check',
      check_cols='all',
    )
}}

select 1 as id

{% endsnapshot %}
```

`seeds/my_seed.csv`

```csv
id
1
```

Note: analyses can't be selectively disabled within `dbt_project.yml` -- they inherit the config from `models` within `dbt_project.yml`.

`analyses/my_analysis.sql`

```sql
-- { { config(enabled=false) }}
select 1 as id
```

`models/_properties.yml`

```yaml
models:
  - name: metricflow_time_spine
    config:
      enabled: true
    columns:
      - name: date_day
        data_tests:
          - not_null:
              name: generic_data_test
              config:
                enabled: true

sources:
  - name: my_source
    tables:
      - name: my_source_table
        config:
          enabled: true

seeds:
  - name: my_seed
    config:
      enabled: true

snapshots:
  - name: my_snapshot
    config:
      enabled: true

analyses:
  - name: my_analysis
    config:
      enabled: true

exposures:
  - name: my_exposure
    type: dashboard
    owner:
      email: exposure_owner@my_company.com
    config:
      enabled: true

metrics:
  - name: my_metric
    label: my_metric
    type: derived
    type_params:
      expr: revenue - cost
    config:
      enabled: true

semantic_models:
  - name: my_semantic_model
    model: ref('metricflow_time_spine')
    config:
      enabled: true

saved_queries:
  - name: my_saved_query
    query_params:
      metrics:
        - my_metric
    config:
      enabled: true

unit_tests:
  - name: my_unit_test
    given: []
    model: metricflow_time_spine
    expect:
      rows:
        - {date_day: 2000-01-01}
    # Note: this does NOT have any effect
    # So need to comment out or delete this unit test if its associated model(s) are disabled
    config:
      enabled: true
```

`dbt_project.yml`

```yaml
name: "my_project"

models:
  my_project:
    +enabled: true

seeds:
  my_project:
    +enabled: true

snapshots:
  my_project:
    +enabled: true

sources:
  my_project:
    +enabled: true

exposures:
  my_project:
    +enabled: true

metrics:
  my_project:
    +enabled: true

# Note: inconsistent to have "semantic-models" with a dash, but "data_tests" and "unit_tests" with underscores
semantic-models:
  my_project:
    +enabled: true

# Note: this has no effect -- must configure within properties.yml or config() within the analysis file.
# Note: the `models` config above also disables analyses!
analyses:
  my_project:
    +enabled: true

data_tests:
  my_project:
    +enabled: true

# This has no effect -- there is no way to disable a unit test outside of
# commenting it out or excluding it from the selection.
# It will give the following warning:
# [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
# There are 1 unused configuration paths:
# - unit_tests.my_project
unit_tests:
  my_project:
    +enabled: true
```

Run this command to see which resources are enabled:

```shell
dbt list --resource-type all
```

Then change the `enabled` config for one or more resources and re-run the `dbt list` command above to see if it disappears or not.
dbeatty10 commented 5 months ago

Workarounds

Currently, there are a handful of ways to prevent a unit test from running:

  1. Comment it out within the YAML. i.e. add a leading # to the beginning of each line
  2. Delete the unit test definition entirely
  3. Move the unit test YAML. i.e. move it to a folder that is outside of the model-paths: [directorypath] configured in dbt_project.yml (e.g. disabled_unit_tests)
  4. Add a tag and then --exclude that tag (using the "tag" method)

Recommended workaround

We think option (1) is the easiest:

```yaml
# you can even leave a comment about why you are commenting it out

# unit_tests:
#   - name: test_that_we_want_to_disable
#     model: stg_orders
#     given: []
#     expect:
#       rows:
#         - {id: 10}
```
joellabes commented 5 months ago

I need to conditionally enable/disable unit tests in the audit helper repo, because (for example) some use SQL fixtures which have different syntax on different platforms.

I've gone with this, but I do not like it:

```yaml
unit_tests:
  [...]
    config:
      tags: "{{ 'works_on_my_machine' if (target.type in ['snowflake', 'bigquery']) else 'skip' }}"
```

And the CircleCI job will have this defined: `dbt build --exclude tag:skip`
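Reduced to plain Python, the conditional tag above behaves like this (an illustrative sketch — `unit_test_tag` is a hypothetical name, and dbt actually evaluates the Jinja expression against `target.type` at parse time):

```python
# Plain-Python rendition of the conditional Jinja tag (illustrative only).

def unit_test_tag(target_type: str) -> str:
    """Tag the unit test 'works_on_my_machine' on platforms whose SQL
    fixtures are valid, and 'skip' otherwise, so CI can run
    `dbt build --exclude tag:skip`."""
    if target_type in ("snowflake", "bigquery"):
        return "works_on_my_machine"
    return "skip"

print(unit_test_tag("snowflake"))  # works_on_my_machine
print(unit_test_tag("postgres"))   # skip
```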

graciegoheen commented 5 months ago

@joellabes sounds like what you're actually after is something like conditional fixtures?

(just an idea)

```yaml
unit_tests:
  - name: test_my_model
    model: my_model
    given:
      - input: ref('my_model_a')
        format: csv
        fixture: "{{ 'my_model_a_fixture_snow_bq' if (target.type in ['snowflake', 'bigquery']) else 'my_model_a_fixture_other' }}"
    ...
```
joellabes commented 5 months ago

Ooooh that's a fun idea! In practice I wouldn't want to have to make a dozen or more fixture files though, the fact that I can put it inline in the yaml is pretty compelling to me.

Secondarily, I also worry that the conditional Jinja logic would get horrendous if it wasn't just a binary option. Although I guess I could do {{ target.type }}_fixture_1.sql instead of an if block (were I willing to make files for each of them)

benw-at-birdie commented 4 months ago

Just adding a +1 for this issue. We use enabled: to dynamically turn models on and off depending on the deployment region (not all models are applicable in all geographies).

Applying unit tests to models which are disabled in a particular region causes `dbt compile` to fail. This means we're unable to adopt the new unit testing functionality :'(.

dbeatty10 commented 4 months ago

@benw-at-birdie it makes sense to me to add enabled for unit tests because every other resource type has it.

In the meantime, did you try out the workaround using tags, by any chance?

ChenyuLInx commented 4 months ago

We would want to move the disabled unit tests to the disabled dictionary.
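That "disabled dictionary" could be sketched roughly like this (illustrative only — not dbt-core's actual data structures or function names):

```python
# Hypothetical sketch: unit tests whose merged config sets enabled=False are
# kept out of the active manifest and parked in a separate "disabled" bucket,
# mirroring how dbt handles other disabled resource types.

def partition_unit_tests(unit_tests: list) -> tuple:
    """Split unit test definitions into enabled (manifest) and disabled buckets."""
    manifest, disabled = {}, {}
    for ut in unit_tests:
        enabled = ut.get("config", {}).get("enabled", True)  # default: enabled
        bucket = manifest if enabled else disabled
        bucket[ut["name"]] = ut
    return manifest, disabled

tests = [
    {"name": "test_my_model", "config": {"enabled": False}},
    {"name": "test_other_model"},  # no config block -> enabled by default
]
manifest, disabled = partition_unit_tests(tests)
print(sorted(manifest), sorted(disabled))
# ['test_other_model'] ['test_my_model']
```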

benw-at-birdie commented 4 months ago

In the meantime, did you try out the workaround using tags, by any chance?

Hey @dbeatty10 apologies for the slow reply. Unfortunately the tags approach doesn't work for our current setup...

```shell
production-frankfurt:data ❯ dbt test -s alerts --exclude tag:skip
15:31:20  Running with dbt=1.8.1
15:31:21  Registered adapter: snowflake=1.8.2
15:31:21  Unable to do partial parsing because profile has changed
15:31:29  Encountered an error:
Parsing Error
  Unable to find model 'snowflake_db_transformations.alerts' for unit test 'unresolved_qualified_and_resolved_alerts' in models/inbox/alerts/alerts.yml
```

In the example above the alerts model has a unit test which has been tagged with skip but because this model is disabled in the selected environment (production-frankfurt) the compilation step fails before the test command can be run. Does that make sense?
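A rough sketch of the failure mode described above (this is an assumption about the mechanism, inferred from the error message — not dbt-core's actual resolution code):

```python
# Assumed mechanism (illustrative, not dbt-core internals): the unit test
# resolves its target model against only the *enabled* models, so a model
# that is disabled for this region can't be found -- and parsing fails
# before any --exclude selection is ever applied.

def resolve_model_for_unit_test(model_name: str, enabled_models: set) -> str:
    if model_name not in enabled_models:
        raise ValueError(f"Unable to find model '{model_name}' for unit test")
    return model_name

# Hypothetical region where 'alerts' is disabled:
enabled_in_frankfurt = {"orders", "customers"}
try:
    resolve_model_for_unit_test("alerts", enabled_in_frankfurt)
except ValueError as err:
    print(err)  # Unable to find model 'alerts' for unit test
```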

I think the only way to make the tags approach work is to enable all models in all environments (and use tags to control the execution of models in different regions). We tried this approach but ran into complications with slim-ci because the manifest.json doesn't accurately reflect the models deployed in a given environment (it includes all of them regardless of whether or not the model is deployed in the given environment).

ChenyuLInx commented 1 month ago

Let's do #10540 together with this issue.

maxmarcon commented 1 month ago

I can confirm that the tags approach doesn't work

adionisio8 commented 3 weeks ago

Hey, any idea when this would be implemented? Our team would like to disable unit tests from running using this flag.

graciegoheen commented 2 weeks ago

@adionisio8 we're hoping to get this merged soon! PR has been opened