dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
https://getdbt.com
Apache License 2.0

[CT-1505] [Feature] Freshness checks should repeat errored tests at the end #6248

Open FridayPush opened 2 years ago

FridayPush commented 2 years ago

Is this your first time submitting a feature request?

Describe the feature

The dbt source freshness command runs a freshness check against every source in a project. However, when the list of sources is long, say 70+, a single error or warning gets lost in the results printed at the end. Additionally, no text is printed to tell the user that any check failed.

[Screenshot: Screen Shot 2022-11-14 at 3 56 39 PM]

That run had 2 failures and 2 warnings. I would like to propose a final summary block, similar to the one dbt test prints, that lists all failed source checks and all warnings.

Describe alternatives you've considered

No response

Who will this benefit?

CI and manual runs of source freshness checks will be easier to parse mentally with a summary block at the end of the run. Printed text describing the result, rather than relying only on exit codes, would also be appreciated.
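Until such a summary exists, one possible workaround for CI is to read the artifact that the freshness command writes. The sketch below is only an illustration: it assumes the artifact lives at target/sources.json and has a top-level `results` list whose entries carry `unique_id` and `status` fields, which should be checked against the dbt version in use. The function names are made up for this example.

```python
from __future__ import annotations

import json
from pathlib import Path


def load_artifact(path: str = "target/sources.json") -> dict:
    """Read the freshness artifact from disk (path is an assumption)."""
    return json.loads(Path(path).read_text())


def non_passing_sources(artifact: dict) -> list[tuple[str, str]]:
    """Return (status, unique_id) pairs for every result that did not pass."""
    flagged = []
    for result in artifact.get("results", []):
        # Assumed field names; missing keys fall back to placeholders.
        status = result.get("status", "unknown")
        if status != "pass":
            flagged.append((status, result.get("unique_id", "<unknown>")))
    # Alphabetical sort happens to list "error" ahead of "warn".
    return sorted(flagged)
```

A CI step could call `non_passing_sources(load_artifact())` after the freshness run and print each pair, so the failures are visible without scrolling through the full log.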

Are you interested in contributing this feature?

No response

Anything else?

No response

jtcohen6 commented 2 years ago

@FridayPush You're so right! This one has bugged me in the past. Thanks for taking the initiative to open it as an issue.

Problem

This is the code already in the freshness task today:

https://github.com/dbt-labs/dbt-core/blob/66ac107409749ff1cf5dafeab371dd1baf916b9f/core/dbt/task/freshness.py#L181-L186

There are two issues here:

https://github.com/dbt-labs/dbt-core/blob/66ac107409749ff1cf5dafeab371dd1baf916b9f/core/dbt/task/run.py#L470-L472

Proposed resolution

Let's make it work like all the others!

I started poking into this, and it ended up requiring a more involved set of changes than I'd hoped. It's still relatively self-contained, though. Here's one potential set of changes that illustrates what's needed: https://github.com/dbt-labs/dbt-core/commit/e34e6fe3685e5cdeabbc5d1e91183dffef8875fb

The biggest change is that SourceFreshnessResult will start having a message attribute. If we wanted to start including this field in sources.json, it would also represent a change to our metadata contract.
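To make the idea of a message attribute concrete, here is a rough sketch. It is not the code from the linked commit and not dbt-core's actual SourceFreshnessResult: the `FreshnessResultSketch` class, its fields, and the `build_message` helper are all hypothetical, chosen only to show how a human-readable message could travel with each result.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


def build_message(age: timedelta, count: int, period: str) -> str:
    """Compose a message from the observed age and the threshold exceeded."""
    unit = period if count == 1 else f"{period}s"
    # str(timedelta) renders as e.g. "217 days, 1:43:24.449444"
    return f"Last updated {age} ago. Expected no more than {count} {unit}."


@dataclass
class FreshnessResultSketch:  # hypothetical stand-in, not a dbt-core class
    source_name: str
    status: str                    # "pass", "warn", or "error"
    age: timedelta
    message: Optional[str] = None  # the new attribute discussed above


result = FreshnessResultSketch(
    source_name="my_warn_tbl",
    status="error",
    age=timedelta(days=217, hours=1, minutes=43, seconds=24, microseconds=449444),
)
result.message = build_message(result.age, 4, "hour")
```

Keeping the message optional means passing results carry no extra text, and any consumer that ignores the field keeps working.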

Example

$ dbt source freshness
14:16:39  Running with dbt=1.4.0-a1
14:16:39  Found 1 model, 1 test, 0 snapshots, 0 analyses, 289 macros, 0 operations, 1 seed file, 3 sources, 0 exposures, 0 metrics
14:16:39
14:16:39  Concurrency: 5 threads (target='dev')
14:16:39
14:16:39  1 of 3 START freshness of my_src.my_fresh_tbl .................................. [RUN]
14:16:39  2 of 3 START freshness of my_src.my_stale_tbl .................................. [RUN]
14:16:39  3 of 3 START freshness of my_src.my_warn_tbl ................................... [RUN]
14:16:39  2 of 3 WARN freshness of my_src.my_stale_tbl ................................... [WARN in 0.03s]
14:16:39  1 of 3 PASS freshness of my_src.my_fresh_tbl ................................... [PASS in 0.03s]
14:16:39  3 of 3 ERROR STALE freshness of my_src.my_warn_tbl ............................. [ERROR STALE in 0.03s]
14:16:39
14:16:39  Completed with 1 error and 1 warning:
14:16:39
14:16:39  Failure in source my_warn_tbl (models/some_model.yml)
14:16:39    Last updated 217 days, 1:43:24.449444 ago. Expected no more than 4 hours.
14:16:39
14:16:39  Warning in source my_stale_tbl (models/some_model.yml)
14:16:39    Last updated 217 days, 1:43:24.441473 ago. Expected no more than 1 day.
14:16:39
14:16:39  Done. PASS=1 WARN=1 ERROR=1 SKIP=0 TOTAL=3
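The closing block in the output above could be produced by something along these lines. This is a standalone sketch, not how dbt-core prints results: the `Check` record and `render_summary` function are invented for illustration, and the timestamps that dbt prefixes to each line are left out.

```python
from typing import List, NamedTuple


class Check(NamedTuple):  # hypothetical record, not a dbt-core type
    status: str  # "pass", "warn", "error", or "skip"
    name: str
    path: str
    message: str = ""


def _count(n: int, noun: str) -> str:
    """Pluralize a noun by appending 's' unless n is exactly 1."""
    return f"{n} {noun}" if n == 1 else f"{n} {noun}s"


def render_summary(checks: List[Check]) -> List[str]:
    """Build the end-of-run lines: failures first, then warnings, then totals."""
    errors = [c for c in checks if c.status == "error"]
    warns = [c for c in checks if c.status == "warn"]
    lines: List[str] = []
    if errors or warns:
        lines.append(
            f"Completed with {_count(len(errors), 'error')} "
            f"and {_count(len(warns), 'warning')}:"
        )
        for label, group in (("Failure", errors), ("Warning", warns)):
            for c in group:
                lines += ["", f"{label} in source {c.name} ({c.path})", f"  {c.message}"]
        lines.append("")
    tally = {s: sum(c.status == s for c in checks) for s in ("pass", "warn", "error", "skip")}
    lines.append(
        f"Done. PASS={tally['pass']} WARN={tally['warn']} ERROR={tally['error']} "
        f"SKIP={tally['skip']} TOTAL={len(checks)}"
    )
    return lines
```

Listing every failure before any warning keeps the most urgent items at the top of the block, regardless of the order in which threads finished.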

Areas needing further improvement:


Okay! Is this something you (or another community contributor) might be interested in working on?

kxzk commented 1 year ago

@jtcohen6 I'd be open to working on this

jtcohen6 commented 1 year ago

@kadekillary hooray! give it a go, let us know how it goes :)