Open gshank opened 2 years ago
Copying from Nate's comment in https://github.com/dbt-labs/dbt-core/issues/4836#issuecomment-1104038063, because the underlying thrust of it continues to feel important:
whenever possible, we want to be able to programmatically distinguish when the root cause of an error is a warehouse issue, an adapter issue, or a core issue. This will be a big improvement in the process for debugging failed runs.
Basically, when a dbt run fails, how can we tell whether it was user error, database error, or internal dbt error?
In a Pythonic world, we'd expect to do this via class inheritance and type introspection. Given that our log events are now proto-composed, we probably need a field on each exception-log. Something like exception_cause
or exception_category
. Categories I can imagine:
dbt-core
that's disallowed by require-dbt-version
This can probably be accomplished with an enum that defines all the categories. Without verifying the syntax I think something like this would work:
enum ExceptionCause {
{
PARSING = 0;
DISALLOWED = 1;
DATABASE_CONNECTION = 2;
DATABASE_QUERY = 3;
UNHANDLED = 4;
...
}
And then within the message add
message SomeException {
EventInfo info = 1;
repeated ExceptionCause exception_cause = 2;
...
}
This also allows the flexibility to add more types to the end of the enum which I'm sure we'll end up needing.
Blocking question we need to answer: Should we have separate LogCategory
versus ExceptionCategory
, or combine the two into a single field?
(Idea with LogCategory
is to formally serve the role that the event codes' starting letters have been informally playing: this is parsing-related, deps-related, adapter-related, ...)
One of the unknowns here is exactly what would be useful to consumers to get logged. We likely don't want to log all exceptions, just specific ones. Is it best to fire an event as part of those exceptions or where the exception is raised?
Another option would be to add more structured information to MainEncounteredError
. This may cover a lot of use cases and then just give a single event that would need to be subscribed to for exception details. The exception could contain the error category and then the MainEncounteredError
event could pass the category (connection error, etc) in the proto message.
Minimum: Easily differentiate between errors in the users code or errors related to dbt itself
Better: Categorize Errors
Another option would be to add more structured information to
MainEncounteredError
. This may cover a lot of use cases and then just give a single event that would need to be subscribed to for exception details. The exception could contain the error category and then theMainEncounteredError
event could pass the category (connection error, etc) in the proto message.
After chatting about this with @isaac-taylor, this might get us a lot of what we need, at least in the medium-term. It would be a quality-of-life improvement over parsing the string exc
from MainEncounteredError
.
Even better if that catch all error had some sort of error code that we can use to categorize the error to either our system, or the user's code (like connection problems)
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.
Right now we have structured logging events for the majority of our logging, except for exceptions. In some but not all cases we have a separate logging event. Ideally all exceptions should be logged automatically. This ticket is labeled a "spike" because in order to scope this work we need to identify a mechanism/policies for doing this.