Open BugenZhao opened 1 year ago
strong +1
some points
Error's `source` chain gets broken when reporting error, so the actual cause for the error is lost.
We might need to revisit this to see whether other places have the same problem. I didn't care much about `source` before.
Besides, developers also find it annoying to deal with errors...
We might need a developer guide (and/or crate rustdocs) for developers to follow. Even though I've read a lot about error handling before, I still feel confused about how to do it. A short and actionable HOWTO (with some simple explanations) should reduce a lot of confusion. It can be opinionated, but we need some guide.
Find it hard to categorize errors into different types or enum variants.
This is especially confusing. I'm now wondering whether categorization is needed at all.
Directly wrap anyhow::Error into a new error type
I recently noticed `wasmtime::Error` is like that. I had thought it should be an enum or something, since it's a library. It turns out they use `error.downcast_ref::<Trap>` to get the "kind" (`Trap`). Might be worth checking how they do it.
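The downcast approach can be sketched with the standard library alone. This is only an illustration, not how wasmtime implements it: `OpaqueError` is a stand-in for an `anyhow::Error`-style type, and the `Trap` enum, `run`, and `trap_kind` are hypothetical names made up for this sketch.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical "kind" that callers may want to act on (plays the role of `Trap`).
#[derive(Debug, PartialEq)]
enum Trap {
    Unreachable,
    StackOverflow,
}

impl fmt::Display for Trap {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Trap::Unreachable => write!(f, "unreachable code executed"),
            Trap::StackOverflow => write!(f, "call stack exhausted"),
        }
    }
}

impl Error for Trap {}

/// Assumption: a boxed trait object stands in for an opaque `anyhow::Error`-like type.
type OpaqueError = Box<dyn Error + Send + Sync + 'static>;

/// Always fails, either with a typed `Trap` or with a plain message.
fn run(trap: Option<Trap>) -> Result<(), OpaqueError> {
    match trap {
        Some(t) => Err(t.into()),
        None => Err("some other failure".into()),
    }
}

/// Recover the "kind" from the opaque error, if it carries one.
fn trap_kind(err: &OpaqueError) -> Option<&Trap> {
    err.downcast_ref::<Trap>()
}

fn main() {
    for input in [Some(Trap::Unreachable), Some(Trap::StackOverflow), None] {
        let err = run(input).unwrap_err();
        match trap_kind(&err) {
            Some(kind) => println!("actionable trap: {kind:?}"),
            None => println!("informational only: {err}"),
        }
    }
}
```

The error type stays a single opaque struct, and only the callers that care about a specific kind pay the cost of downcasting.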
Some arbitrary refs (for myself to read later)
BTW, I like this sentence: "Be either actionable or informational" (from https://github.com/awslabs/smithy-rs/issues/1950)
At a high level, errors were refactored to:
- Be either actionable or informational. Actionable errors can be matched upon, leading to different program flow. Informational errors indicate why a failure occurred, but are not intended to be matched upon.
- No longer print error sources in Display impls. A DisplayErrorContext utility has been re-exported in the types module to easily log or print the entire error source chain.
- Reliably return their cause/source in their Error::cause/Error::source impl
P.S., I don't know whether the result of their refactor is good, since it has some learning burden for me. I haven't checked every detail yet. The AWS SDK is a complex beast though. 🤣
- No longer print error sources in Display impls.
I also found this a good practice we can follow. This is also how `anyhow` handles `Display`.
https://github.com/dtolnay/anyhow/blob/7fc0c073c4c31ef8664f699e9e4883b1c92b2fb6/src/fmt.rs#L7
- No longer print error sources in Display impls.
I also found this a good practice we can follow. This is also how `anyhow` handles `Display`.
However, not all libraries follow the convention. So eventually we get https://docs.rs/snafu/latest/snafu/struct.CleanedErrorText.html 🤣.
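To make the convention concrete, here is a minimal std-only sketch: `Display` describes only the current layer, `Error::source` reliably returns the cause, and a separate helper walks the chain when the full context is wanted. `ParseConfigError`, `load_port`, and `report` are hypothetical names for illustration; `report` only plays a role similar to `DisplayErrorContext` and is not its implementation.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical error type with one underlying cause.
#[derive(Debug)]
struct ParseConfigError {
    path: String,
    source: ParseIntError,
}

impl fmt::Display for ParseConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Describe this layer only; the cause is deliberately NOT appended here.
        write!(f, "failed to parse config at {}", self.path)
    }
}

impl Error for ParseConfigError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        // Keep the chain intact so reporters can reach the actual cause.
        Some(&self.source)
    }
}

/// Parse a port number, attaching the file path as context on failure.
fn load_port(path: &str, raw: &str) -> Result<u16, ParseConfigError> {
    raw.trim().parse::<u16>().map_err(|source| ParseConfigError {
        path: path.to_owned(),
        source,
    })
}

/// Hypothetical reporter: joins the message of every error in the source chain.
fn report(err: &dyn Error) -> String {
    let mut out = err.to_string();
    let mut cause = err.source();
    while let Some(e) = cause {
        out.push_str(": ");
        out.push_str(&e.to_string());
        cause = e.source();
    }
    out
}

fn main() {
    let err = load_port("app.toml", "eighty").unwrap_err();
    println!("{err}"); // this layer only
    println!("{}", report(&err)); // this layer followed by its causes
}
```

If a `Display` impl also printed its source, a reporter like this would show the cause twice, which is the problem `CleanedErrorText` works around.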
Tracking the task:
- `thiserror` way, used for most crates
- `Construct` and `ContextInto` from `thiserror-ext`, introduction covered in #13200
- `Macro`, introduction: #13627
- `anyhow`-wrapper way, used for `connector` crate now
- `source` chain, including RPC `source` chain.
- `DisplayErrorContext` anymore: #15225
There are some good discussions. Note here for future reference.
- `thiserror`: what its attributes do; when to add `#[source]` and `#[backtrace]` (almost always)
This issue has been open for 60 days with no activity. Could you please update the status? Feel free to continue discussion or close as not planned.
As more and more features are introduced in our system, error handling and reporting have become increasingly messy at the same time. We're struggling with several issues, including...
- Error's `source` chain gets broken when reporting error, so the actual cause for the error is lost. (#10993)
Besides, developers also find it annoying to deal with errors...
- Find it hard to categorize errors into different types or `enum` variants.
- `RwError` still lives in the frontend and batch executors even though we proposed to clean it up one year ago. (https://github.com/risingwavelabs/risingwave/issues/4077)
- `anyhow` or `bail` extensively, resulting in a messy categorization of the majority of errors as `internal error`.
Based on this, I propose several guidelines on how we should deal with errors in different modules in RisingWave:
- Use `snafu` (or `thiserror`) to define errors, if a considerable usage of this error is to construct new errors, and the rest is to propagate errors attaching contexts optionally.
  - `library`, like `ExprError`, `StreamError`, and `OptimizerError`.
  - `String` everywhere.
  - `anyhow` as the fall-back variant of the error for the errors that cannot be easily categorized or can rarely happen.
- Directly wrap `anyhow::Error` into a new error type, if a considerable usage of this error is to propagate errors from lower-level crates without strong necessity of attaching contexts, and the rest is to construct new errors.
  - `application`, like `StreamExecutorError`.
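The two proposed styles can be sketched side by side. This is a std-only illustration under stated assumptions, not RisingWave's actual types: `ExprErrorSketch` and `ExecutorErrorSketch` are hypothetical, the trait impls are written by hand where `snafu` or `thiserror` would normally derive them, and `Opaque` stands in for `anyhow::Error`.

```rust
use std::error::Error;
use std::fmt;

/// Assumption: a boxed trait object stands in for `anyhow::Error`.
type Opaque = Box<dyn Error + Send + Sync + 'static>;

/// Style 1 (library-like): an enum with matchable variants plus a fall-back.
#[derive(Debug)]
enum ExprErrorSketch {
    DivisionByZero,
    /// Fall-back for errors that are rare or hard to categorize.
    Uncategorized(Opaque),
}

impl fmt::Display for ExprErrorSketch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::DivisionByZero => write!(f, "division by zero"),
            Self::Uncategorized(_) => write!(f, "uncategorized expression error"),
        }
    }
}

impl Error for ExprErrorSketch {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            Self::DivisionByZero => None,
            Self::Uncategorized(e) => {
                let inner: &(dyn Error + 'static) = &**e;
                Some(inner)
            }
        }
    }
}

fn eval_div(a: i64, b: i64) -> Result<i64, ExprErrorSketch> {
    if b == 0 {
        return Err(ExprErrorSketch::DivisionByZero);
    }
    Ok(a / b)
}

fn parse_literal(s: &str) -> Result<i64, ExprErrorSketch> {
    s.parse::<i64>()
        .map_err(|e| ExprErrorSketch::Uncategorized(e.into()))
}

/// Style 2 (application-like): a thin wrapper that mostly propagates.
#[derive(Debug)]
struct ExecutorErrorSketch(Opaque);

impl fmt::Display for ExecutorErrorSketch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::Display::fmt(&self.0, f)
    }
}

impl Error for ExecutorErrorSketch {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.0.source()
    }
}

impl From<ExprErrorSketch> for ExecutorErrorSketch {
    fn from(e: ExprErrorSketch) -> Self {
        Self(Box::new(e))
    }
}

/// Propagates lower-level errors with `?` and adds no context of its own.
fn execute(a: i64, b: i64) -> Result<i64, ExecutorErrorSketch> {
    Ok(eval_div(a, b)?)
}

fn main() {
    println!("{:?}", eval_div(1, 0));
    println!("{:?}", parse_literal("x"));
    println!("{:?}", execute(6, 3));
    println!("{:?}", execute(1, 0));
}
```

Callers of the enum can match on `DivisionByZero` directly, while callers of the wrapper would need to downcast the inner error if they ever want the kind back.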