dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
https://getdbt.com
Apache License 2.0

Raise exceptions for test failures #3120

Closed ruisun-pep closed 3 years ago

ruisun-pep commented 3 years ago

Describe the feature

The ultimate goal is to trigger automated responses based on test results.

The proposal is to raise an exception when a test fails, e.g. UniqueTestFailureException, GenericTestFailureException, etc. Automated tools (e.g. Sentry) can capture exceptions, along with other information passed with the exception such as the table name, in a structured way and apply rules to them (e.g. alerting different people).

Besides being convenient, raising an exception is the conventional way to signal that something went wrong. How the exception is handled can vary (e.g. logging to some location).

Describe alternatives you've considered

Additional context

This feature is not database dependent.

Who will this benefit?

Anyone who uses dbt's test feature and wants to build automated pipelines based on test results.

Are you interested in contributing this feature?

Yes

jtcohen6 commented 3 years ago

Hey @ruisun-pep, I'd encourage you to take a look at https://github.com/fishtown-analytics/dbt/issues/2915, as I think it covers a lot of the ground you're interested in here.

In particular:

  1. If you want to parse structured logs, you can specify --log-format json (docs). Those logs could be improved in terms of structure, organization, and completeness (hence the other issue), but that's definitely a place to start. Also, while we do our best to keep log formats consistent, there are no guarantees or commitments around breaking changes.
  2. If you want to know about the results of a dbt invocation, the best place to get that information is from dbt artifacts, in particular run_results.json (docs). Those JSON artifacts have a versioned, documented schema with contracts around breaking changes.
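As a rough illustration of option 1, the sketch below filters error-level records out of JSON-formatted log lines. The field name `levelname`, the level strings, and the function name `error_records` are assumptions made for the example, not a documented contract. Since the log format carries no guarantees, check them against real output from your dbt version first.

```python
import json

# Assumed level names; adjust to match what your dbt version emits.
ERROR_LEVELS = {"error", "critical"}


def error_records(lines, level_key="levelname"):
    """Yield parsed log records whose level is error or worse.

    `level_key` is an assumed field name. Lines that are not JSON
    objects are skipped instead of raising.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # plain-text line mixed into the stream
        if not isinstance(record, dict):
            continue
        if str(record.get(level_key, "")).lower() in ERROR_LEVELS:
            yield record
```

The function accepts any iterable of strings, so it can read from `sys.stdin` or from an open log file. Each yielded record could then be forwarded to whatever alerting tool is in use.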

What's not covered above? A tight integration with a specific automated exception-catching solution, such as Sentry. To my mind, it should be possible for end users to build an integration in that vein by wrapping (1) error-level structured loglines or (2) error-status entries in the results object produced by dbt. To your mind, are there other pieces missing?
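A minimal sketch of approach (2), assuming `run_results.json` has a top-level `results` list whose entries carry `status` and `unique_id` fields, and that the statuses `fail` and `error` mark failures. The exception class `DbtTestFailure` and both helper functions are hypothetical names for this example and are not part of dbt. Verify the field names against the documented artifact schema for your dbt version.

```python
import json
from pathlib import Path

# Assumed status values that count as a failure.
FAILING_STATUSES = {"fail", "error"}


class DbtTestFailure(Exception):
    """Hypothetical exception type; dbt itself does not define it."""

    def __init__(self, failures):
        self.failures = failures  # raw result entries, for structured reporting
        ids = ", ".join(str(f.get("unique_id")) for f in failures)
        super().__init__(f"{len(failures)} dbt result(s) failed: {ids}")


def failed_results(run_results):
    """Return the result entries whose status marks a failure."""
    return [
        r for r in run_results.get("results", [])
        if r.get("status") in FAILING_STATUSES
    ]


def check_run_results(path="target/run_results.json"):
    """Raise DbtTestFailure if the artifact at `path` contains failures."""
    run_results = json.loads(Path(path).read_text())
    failures = failed_results(run_results)
    if failures:
        raise DbtTestFailure(failures)
```

Calling `check_run_results()` after a dbt invocation turns failed results into an ordinary Python exception, which an exception-capturing tool can then pick up. The default path assumes dbt's usual `target/` directory.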

(I'm inclined to mark this issue a duplicate of #2915, but I recognize this is a tricky topic, so I want to make sure I first understand the field of play.)

ruisun-pep commented 3 years ago

Thanks for pointing me in the right direction! I believe that is exactly what I was looking for.