dbt-labs / dbt-project-evaluator

This package contains macros and models to find DAG issues automatically
https://dbt-labs.github.io/dbt-project-evaluator/latest/
Apache License 2.0
440 stars 64 forks source link

New Source Freshness test -- how to exclude static tables that are sourced from a database #490

Closed tuday2 closed 1 month ago

tuday2 commented 1 month ago

Describe the bug

I recently updated to 14.x and while I use source freshness for most of my tables, I have a table that is sourced from a database that has static data. How do I exclude this table from the source freshness test? I've tried to add it to my dbt_project_evaluator_exceptions seed however it doesn't seem to work.

Steps to reproduce

N/A

Expected results

N/A

Actual results

N/A

Screenshots and log output

N/A

System information

  - package: dbt-labs/dbt_utils
    version: 1.3.0
  - package: dbt-labs/dbt_project_evaluator
    version: 0.14.1

Which database are you using dbt with?

The output of dbt --version:

Versionless (Cloud)
b-per commented 1 month ago

Hi!

Using the dbt_project_evaluator_exceptions seed should work like for other tests.

tuday2 commented 1 month ago

I am using the seed for 3 other tests that are working correctly so I am certain its loading in the correct order:

Seed:

fct_name,column_name,id_to_exclude,comment
fct_root_models,child,dim_date,Date dimensions generated with dbt_utils date_spine
fct_root_models,child,dim_date_iso,Date dimensions generated with dbt_utils date_spine
fct_multiple_sources_joined,child,stg_%_unioned,Models called _unioned can union multiple sources
fct_sources_without_freshness,child,stg_XXX_historical_calls,Exclude static table that has legacy calls for XXX

(I also tried using the source name "legacy.XXX_historical_calls" in place of the staging table name and that gave the same error)

Output from FCT_SOURCES_WITHOUT_FRESHNESS table (1 row): legacy.XXX_HISTORICAL_CALLS

b-per commented 1 month ago

What about

fct_name,column_name,id_to_exclude,comment
...
fct_sources_without_freshness,resource_name,legacy.XXX_HISTORICAL_CALLS,Exclude static table that has legacy calls for XXX

The column name in the fct is not child for this table, it is resource_name

tuday2 commented 1 month ago

That did it! thank you.

I had to use the exact case -- which normally I don't since I'm using Snowflake, however this is a reference to a source so that makes sense.

Is "resource_name" in the documentation somewhere? All the examples show "child" -- it may help to add a few more examples in the docs.

fct_sources_without_freshness,resource_name,legacy.XXX_HISTORICAL_CALLS,Exclude static table that has legacy calls for XXX
b-per commented 1 month ago

resource_name itself is not documented as it would depend on each table, but we have this in the docs

column_name: the column name from fct_name we will be looking at to define exceptions

I might add another example in the docs where the column is not called child

tuday2 commented 1 month ago

What do you mean it would depend on each table? i literally put "resource_name" in there

b-per commented 1 month ago

I mean that to add exceptions for the table fct_sources_without_freshness, there is only one filter, resource_name.

But any column returned from any of the fct_ tables can be used to set exceptions.

Taking fct_exposures_dependent_on_private_models for example, we return the columns exposure_name, parent_resource_name, parent_access and parent_resource_type

Any of those columns can be used in the seed if we wanted to exclude specific rows.

Does it make sense?

tuday2 commented 1 month ago

Wow ok -- I never made the correlation between that and the name of the column in the evaluator table -- all makes sense now.

I honestly thought child was some default for a fact table --- just never clicked. I see now the name of the column in the evaluator tables is what you reference.

All good -- thanks again for the quick help!