Open mrinalwadhwa opened 3 years ago
Hey, I would like to help implement this, but I'm new to ockam. I see the error definition has an error code and domain string.
pub struct Error {
code: u32,
#[cfg(feature = "std")]
domain: String,
}
What should those be when there is a std::io::Error
?
@mcastorina thank you for picking this up π @jared-s @SanjoDeundiak @antoinevg what are your thoughts on good values here.
I think we need to take a deeper look at the design of our Errors so that they carry source, backtrace, cause etc. as they propagate within Ockam or surface to a user. We chose the code and domain approach because we'll need to expose them over FFI. But that shouldn't hurt the experience of Rust users. Rust Users should get source, backtrace, cause etc.
I'm excited to contribute! :)
Perhaps https://github.com/dtolnay/thiserror will be useful here.
@mrinalwadhwa I'm not sure we should implement From
As for extending Error information with stacktrace and additional data, that would be great, but I'm not very familiar with how that works and how that can or can't be compatible with our Actor framework. There's some research work to be done by someone
@mrinalwadhwa I your example, you are writing an application so I don't think you should return ockam::Result
but anyhow::Result
instead in order to be more idiomatic (see anyhow
vs thiserror
) then you can use Context
which is an easy way to produce useful error messages for the user. Plus, the error in your example has nothing to do with ockam
, it comes from reading the standard input which depends on the OS.
Anyway, IMHO @mcastorina is right about the usefulness of thiserror
: it can be used on struct
s and does not bring breaking change, it supports backtrace
, source
, From<Error>
automatic implementation. As for ffi
, the implementation can be adapted to the possible changes.
Nice to see love for dtolnay/thiserror and dtolnay/anyhow! IMO they're excellent defaults when it comes to Rust error handling.
That said, Ockam has some special needs.
@mrinalwadhwa already mentioned the need to support FFI but our existing error handling strategy also has a number of important advantages for targeting embedded (no_std
) platforms:
code: u32
field only consumes 4 bytes and the domain: String
field is optional. This matters because we want to be able to target resource constrained platforms. 1ockam_core::Error
type which allows us to have per-crate error definitions while avoiding the Box<dyn std::error::Error>
pattern. This matters because no_std
:
std::error::Error
trait and Box
type. 2So the big question is, without losing these advantages, how can we get features like:
It would be nice to also support these on no_std
but, just as with FFI targets, it's not the end of the world if we can't have all of them.
i.e. Neither thiserror or anyhow 3 provide full no_std
support but this is not necessarily a problem if it's possible to integrate them with our existing error implementation in a way that won't require a bunch of #[cfg(not(feature="std"))]
boilerplate to avoid breaking no_std
builds.
In general, the situation is not great when it comes to idiomatic unified error-handling and no_std
and is unlikely to improve in the future:
So maybe we should also be looking at using a collection of simple crates & code to implement our wish list individually?
A couple that spring to mind are:
no_std
support)no_std
Can anyone suggest some more?
Finally, Rust's error-handling story is starting to solidify but there are still a few big changes coming (e.g. RFC-2504) so it would be nice to not lock the library or dependent apps into anything that's going to be deprecated.
[1] One change I would like to see here is to define domain
as a &'static str
rather than String
as, again, String
needs alloc
when we're targeting no_std
.
[2] We can get Box
via alloc
but requiring alloc
for something as fundamental as Error handling is probably not a great idea.
[3] The anyhow crate does provide partial no_std
support but this comes at the cost of relying on alloc
.
https://github.com/ockam-network/ockam/issues/1655#issuecomment-894653125 @mrinalwadhwa In your example, you are writing an application so I don't think you should return ockam::Result but anyhow::Result
+1 to use anyhow
when writing client applications.
https://github.com/ockam-network/ockam/issues/1655#issuecomment-896053001 So the big question is, without losing these advantages, how can we get features like:
- backtraces / error-chaining
- converting a numeric error code back to a string representation of its originating enum type
- human-readable error messages
I agree with @antoinevg that the best thing for now is to wait. Given my recent experience (which you can read below), points 1 (backtraces) and 2 (u32 to strrepr) should be easier down the road if errors don't contain parameters. Finally, point 3 (human-readable error messages) should be feasible if the core::fmt::Display
trait is implemented, right?
Here's my two cents on whether enum's variants should contain parameters to build custom messages or not and how this impact the above three points.
In the project I've been working on for the past two years (close to 100k lines) we use thiserror
. We don't use any feature other than the Error
and Display
trait implementations to save us from writing some boilerplate code. That is, we don't make use of the backtrace
or source
features.
We follow a similar approach to what's done in the ockam
crates. We have a struct
like the following:
pub struct Error {
code: ErrorCode,
friendly_message: String,
internal_message: String,
}
where:
ErrorCode
: is a string code, where multiple errors can return the same code (project requirements)internal_message
: prints the Display
implementation, which contains a detailed description of the errorfriendly_message
: prints the message we want to display to the final user (API), usually a simplified version of the internal_message
Then we have a bunch of enums
with the errors of a given layer (api, commands, infrastructure, core, ...) implementing the From<AbcdError> for Error
, where each variant can have arbitrary parameters. For example:
#[derive(thiserror::Error, Debug)]
pub enum CustomError {
#[error("score below threshold [expected={}][actual={}]", _0, _1)]
ScoreBelowThreshold(u32, u32),
#[error("unknown error")]
Unknown,
}
We really struggled for a while deciding how to deal with error handling. We were also new to Rust. And, honestly, we are not 100% happy with this approach, but it has proved to be pretty solid so far. Having a good error handling strategy can make a huge difference in the developer experience, that's for sure.
The biggest disadvantage we've found: variants with parameters make debugging easier but are limiting if you want to use thiserror
's source
and backtrace
features, for example. Having parameters also prevents us from converting an ErrorCode
back to the original error. So unless it's strictly necessary, I would use the parameters to build log messages and keep the errors variants without parameters.
I've been thinking about this a bit, and feel like I should write down my thoughts in case I'm getting something massively wrong about the ockam
codebase β totally possible, I'm very new. A lot of this is informed by my work on the stdlib and mozilla/application-services (which had some similar constraints, but, well, fewer).
I think a design like the std::io::Error
type is probably most appropriate for this. This should allow us to avoid giving up the ability to expose things on the other side of the FFI cleanly (at least in some cases). In particular, the fact that in io::Error, you always have an ErrorKind β for our errors we could enforce always having an error code and a domain (which I would change to a &'static str
, or even a &'static CStr
, or something that wraps it anyway, for FFI compat).
It hopefully also would allow us to avoid allocation in many cases, see the internal io::Error::const_new
function (or if https://github.com/rust-lang/rust/pull/87869 has landed, io::const_io_error!
... unless it has been renamed, which seems possible), and even for the custom error case, eventually (perhaps not immediately), something based on unsize
or stackbox
could be used to have a Box<dyn Error+Send+Sync>
that doesn't use the heap (it's also possible that heapless has options for non-heap-allocating boxed trait objects, I haven't looked). It would also allow natural support for backtraces, of course, on platforms and configurations where they're available.
Unfortunately, the approach you get with something like thiserror on an enum tends to be better if you don't have to ferry user errors through it. It also grows fairly large over time, and is hard to expose to the FFI, except manually. We could probably design our own proc macro for this (mostly for assigning error codes, which is otherwise tedious) and I do still consider that an option (I haven't made up my mind here), but... My experience is it's a bit of a headache, and in https://github.com/mozilla/application-services became a bit unwieldy. This approach feels sloppy and bad to use from user code unless you curtail that, which can be very difficult (although it's probably the cleanest and simplest to write, ironically).
Also, direct use of anyhow
(or similar, such as eyre
) from library crates isn't great, since those types don't implement Error
(they can't for coherence reasons), which means users can't wrap them. While we could invent something similar (I'm pretty familiar with the internals, and have patches in anyhow's low level guts), it wouldn't get the ergonomics we need, and I'm unsure that we could do it in a way that maintians both the no_std support we want, as well as the ability to expose to the FFI.
All that said, one constraint I haven't fully worked through is serialization. In #967 the errors were reworked due to the need to be deserializable, and that... complicates the design somewhat, especially the desire to have static data. Hrm.
Sorry that this is a bit disjointed, I felt like getting it down when I had the thoughts rather than waiting until morning (which may have been advisable).
(This is slightly related to #1562, which is partially about FFI error ergonomics)
Great article about std::io::Error
for everyone (like me!) who didn't realize just how neat the implementation is:
https://matklad.github.io/2020/10/15/study-of-std-io-error.html
Yep, it's a great and easily-overlooked example of Rust API design (and matklad's whole blog is top-notch, I highly recommend it).
All that said, one constraint I haven't fully worked through is serialization. In #967 the errors were reworked due to the need to be deserializable, and that... complicates the design somewhat, especially the desire to have static data. Hrm.
Having slept on it, I think there are two approaches here I hadn't really considered yesterday (and it's possible that the right solution will involve both in some capacity).
Use heapless::String
. Even if we have the stdlib, it doesn't seem unreasonable to put a max length on the "error domain" string, although maybe I'm mistaken here.
Have deserialization convert a &'static str
into a String
.
For example:
// This is a private inner type that isn't exposed outside of ockam_core.
enum Repr {
Static { code: u32, domain: &'static str, ... },
NonStatic { code: u32, domain: String, ... },
// ...
}
When serializing Repr::Static
, it would contain data independent of which variant it used, but it would deserialize as Repr::NonStatic
. This means we'd need to write manual Serialize/Deserialize implementations, but I don't think this is the end of the world.
A variant of this approach is to pack the &'static str
and String
into one bundle, similar to a Cow<'static, str>
. (Other crates to see along these lines are https://github.com/thomcc/arcstr, and https://github.com/maciejhirsz/beef, although I don't think of these we'd actually use).
I still have to work through my thoughts here on whether or not this actually buys us anything (aside from some perf), but it's an interesting consideration I hadn't thought of when making the writeup last night.
(Sorry for all the messages, this is both one of my first issues here, and also something I think is critically important to having a good-feeling API, so I'd rather over-share my thought process than under-share it. It also serves to document some of the logic for future users)
Multiple messages are great. Don't hold back .. your train of thought is insightful and I'm learning a lot. Same for thoughts above from @antoinevg & @adrianbenavides.
Some further thoughts on this came to me over the weekend.
For one, I don't really know what the use of the error domain is, besides helping out the other side of the FFI, perhaps.
The second takes some explanation:
One hesitation I have with the io::Error
-like scheme is that it won't work with ?
. I alluded to this kind of issue before by mentioning that anyhow/eyre can't implement Error for coherence reasons, but essentially the issue is:
In order for ?
to work, some error type (lets call it MyError
but obviously in this case it would be ockam_core::Error
β I'm being vague just because anyhow::Error
and eyre::Error
hit the same issue) needs to be From<UserErrorType>
. If you want to handle this for all possible user error types, you want to do this as
// This impl is referenced to later as (*1)
impl<E: std::error::Error> From<E> for MyError { ... snip ... }
Then, to allow users to interact with our errors nicely, we need to implement std::error::Error
. This will let them put it in a type-erased error container like Box<dyn Error>
, io::Error
, or even {anyhow, eyre}::Error
(or failure
, from back in the day):
// This impl is referenced to later as (*2)
impl std::error::Error for MyError { ... snip ... }
Unfortunately, we can't. Consider the case above where we invoke (*1)
on MyError
, that is, we're calling From<MyError> for MyError
. Well, there are actually two of these. There's the (*1)
we wrote above, and then there's one in the stdlib. Every type T
already impls From<T>
. This is partially so that x.into()
is a noop if you're already the required type, and generally is useful for generic programming. The coherence issue is that you can't have both of these.
(There are additional complexities too β From<E> for Box<dyn Error>
existing tends not to help matters when trying to work around this issue...)
There are two solutions to this:
From<E: Error>
(e.g. (*1)
), the way io::Error
does, and have users go through an Error::new
-constructor which hurts ergonomics.impl std::error::Error
(e.g. (*2)
), the way anyhow
/eyre
(or failure
in the past...) do, and make it harder for your users to "get away" from your error type. I think there might be things we can do here, but we don't really want to be in a space competing with anyhow/eyre on our error type βΒ it would be needless.I've been leaning towards 1, part of this is that I don't know how we can ever have a From
impl that assigns the correct Error
domain, and assigns a reasonable Error code that the user chooses1...
But the downside here is that it's friction for anybody writing an ockam::Worker
implementation β kind of a lot of friction. Users would have to quite frequently do .map_err(|e| ockam::Error::new(e, ...))?
everywhere you need to return an error inside an implementation of your ockam::Worker
, which kind of sucks2. However, the solution-space here could just be to tweak the Worker
trait, or the API on Context
that accepts a worker... Which makes me think that this is related to the problems in https://github.com/ockam-network/ockam/issues/1564 or at least could share part of a solution.
That is, one way you could imagine this working is:
type Error: std::error::Error + Send + Sync
to ockam::Worker
.ockam::Worker<Error = UserErrorType, ...>
.ockam::Error
by converting the return values, which effectively type-erases the Error
parameter.This still has a few issues. The main one being that now we might end up with ockam::Error
s that wrap ockam::Error
s, which is kinda janky. In practice it's not that big of a deal... but an approach based fully on adjusting the #[ockam::worker]
(and possibly #[ockam::node]
) proc macros might be able to do some magic under the hood to avoid it (faking specialization via autoref, for example).
I haven't gotten that far in experimenting here (it basically came to me over the weekend that this approach might be solve more problems), but so far it seems promising.
1: I'm also not sure how to ensure users pick a good error code, though β even as it is we probably have cases where the error code contains a u32
payload, which depending on the value, can theoretically cause it to overlap with another error code. It's not clear this can happen in practice, though, probably not.
I also have been imagining that maybe implementing a proc-macro that auto-assigns error domains / codes like #[ockam::error(code = ...)]
, similar to a thiserror
-style thing. I'm not sure this is worth the trouble, though.
2: It's not as bad for io::Error
of course, because you don't implement the io traits yourself very often, you just use them. Whereas, I get the impression that implementing ockam::Workers is a significant part of the well, "whole point".
@thomcc I wonder how backtrace looks like if error happened inside a Worker/populated to the Worker. When you debug Workers, backtrace that debugger shows you is useless.
I need to do some experimentation to answer that, but it's a good point.
Over the weekend I've been revisiting the error design, which has been fairly fruitful β I didn't really understand the issues or design constraints last time... and I think the proposals I made would not have helped.
Or, they'd help a little, but would have left many of the same problems that exist in the current design[^1]. (In my defense, it was my first task).
[^1]: For example, the error codes would be just as bad. A proc-macro to help generating them would solve essentially none of their problems, even if it would be nifty. Backtrace/SpanTrace would be nice to have (they are are better than nothing), but they are not going be magic β our notion of a "call" for us is not the same as a stack frame, and its probably not a tracing span either... Etc
Anyway, after a lot more understanding of ockam, how it's used, and its constraints, and a bunch of research... I think I have come to a good understanding about the use cases ockam_core::Error
needs to support, and how to get there.
Note that... this is mostly ignoring things that I think are bigger issues that I intend to address in another write up (that I have in progress), like swallowing errors, serialization, and several more.
Anyway, here's my current thinking about what ockam_core::Error
should be good at:
ockam_core::Error
requirementsockam_core::Error
has (at least) three important use cases:
Communicating errors in a way that other code could interpret and interpret and handle. This should work across API and protocol boundaries, even to remote machines.
The intent behind the current error design seems to address this case, as you need to know both of those things. The implementation is very hard to use in practice though, and having only this seems to cause problems.
It is worth noting, that the current node implementation ignores errors from message handling after logging them. In order to support doing better things than this, the system probably needs to be able to distinguish between different severities of errors.
That said, I have another writeup on this that I'm working on, so I won't get into it here too much.
Communicating an error in a way that someone who is not an expert in ockam
[^2] can understand the error quickly, and if the error indicates a problem they need to fix, resolve the problem easily -- even if they are .
This also should work across remote machines, but probably doesn't need to be that much more than decent error message.
Needless to say, we ockam_core::Error
's current implementation does not help here.
Communicating an error with sufficient context and detail so that a developer working with or debugging Ockam (or ockam-using code) can identify and fix the problem.
Largely speaking, this needs to work very well on the same machine, but (at least in debug builds or when an environment variable is enabled), it should have some support for transmitting it remotely. (That said, I think most of the unserializable stuff here can be handled by converting to a string or similar).
This is also only relevant for certain kinds of errors β routine I/O failures probably don't need this kind of handling, at least by default
As you probably already know, ockam_core::Error
currently has no support for this.
[^2]: Perhaps they're a developer getting started with it, someone who is operating a ockam-using tool, or invoking the Ockam CLIΒ tool.
(Caveat: I have a bit more of the design worked out than this, enough to have cleared up most holes, but still not 100% and probably am not summarizing it in enough details anyway)
I think all three of these have different solutions. My current thinking for how to handle the use cases above is (in reverse order):
Location::caller()
and #[track_caller]
to get the file/line/etc where the error is generated, and even collecting a Backtrace and/or SpanTrace at the site of capture if an env var is flipped. (Anyway, I'm aware that I'm handwaving this one, and my plan here is actually pretty cool, but I haven't finished it and want to move on)[^3]: Assuming fixing the log-and-ignore error handling we have in some places.
For the 2nd β error messages a technical non-expert can understand β probably requires either requesting an explicit error message. It might be enough to accept a cause
error too. This seems an easy-ish one, and will help a lot especially moving forward, but if you can't tell
Finally, the first despite "sort of working for that use case" has several very significant areas where it can be improved β I don't think our crates should be in charge of assigning their own error codes for the most part (and I think we want to store real enums, meaning we get a human readable name for the code, etc).
Fixing this without causing issues for FFI and/or targets that can't rely on the message/metadata being present is... the hardest β see Appendix A for the current sketch of the design.
Anyway, thats all I have now.
ockam_core::Error::code
.Basic idea: two new enums (at least) are added to ockam_core
that describe "canonical error codes" and "canonical error origins". Something like:
enum ockam_core::ErrorOrigin
, which describes the component at the level of detail like Transport
, Vault
, Node
, etc (probably things like Application
and Unknown
too).
We can probably automatically assign this in most cases based on the module_path of the code that produced the error, which we should be able to get.
Note that this explicitly is not at the level of granularity of crates, modules, or even conceptual components β (e.g, ockam_transport_tcp
and ockam_transport_websocket
are both ErrorOrigin::Transport
)
enum ockam_core::ErrorKind
, which looks a lot like std::io::ErrorKind
, but with a set of errors likely to happen inside ockam, rather than a general set for all I/O.
The main idea behind these is that most code that checks errors will probably prefer writing checks like
use ockam_core::{ErrorKind, ErrorOrigin};
let is_transpor_timeout = (
e.origin() == ockam_core::ErrorOrigin::Transport &&
e.kind() == ockam_core::ErrorKind::Timeout
);
if is_transport_timeout { ... }
which then handles all transports, rather than having to figure out which error code number comes from TCP, which from WS, etc.
In #2566 @spacekookie and I overhauled the error API.
While I believe that this is largely an improvement, It revealed it has a number of areas needing further refinement, and several aspects did not pan out as well as I had hoped:
no_std
insufficiently considered during the design:This part of the design completely failed to pan out, IMO. The root issue is that the trait bounds used to accept "any error" under feature = "std"
are non-viable under no_std -- we have a compat trait, but nobody implements it.
The largest problems this causes are two-fold:
This means that with this change, code that previously could ignore no_std now must special case it.
We essentially lose all ability to have any error information beyond the code in the no_std case. This will make it even harder to debug what went wrong than it was before.
While there's a lot of documentation for the error kinds/codes there's insufficient guidance on usage of the error container. For example, I invisioned we'd use Error::new(kind, origin, "message ...")
to construct things if no cause is present, for example. (This is basically how something like std::io::Error::new
gets used)
In practice I think we should not need the per-crate error enums much. They encourage us to discard error causes, and require a significant amount of boilerplate.
Instead, I'd rather us propagate errors directly
One development since this design is that I think we no longer actually need errors to be serializable (I believe it was only used by VaultWorker, which has been removed).
This would simplify the design, and allow us to support downcasting to the cause (or even allow anyhow
-style downcasting to any member of the set of causes). This is desirable because it allows users of ockam to return an error from a worker and losslessly get it back from the other side.
no_std
The biggest area of uncertaintly I have is how to improve things so that the no_std situation is less of a problem.
I think one possible fix may be to add blanket impls to ockam_core::compat::error::Error
under no_std on anything which is Display+Debug. That said, this is a meaningful divergence from the trait it attempts compatibility with, so it probably requires some rethinking the trait entirely.
Without the serialization constraint, it would be easy for us to store:
&'static Location<'static>
to us, which is only a single pointer in size, and requires no allocation. I suspect this is acceptable on embedded (that said, we would probably allow disabling it, since it the location info adds to the static data in the binary).dyn*
/ThinBox
intend to use and store it inline in the case where it's smaller than a pointer. This would hopefully allow supporting some information about the error cause.Much of this can be done in a more incremental way than was attempted this time.
I agree with most of your observations, except:
Too much boilerplate in per-crate error enums
In practice I think we should not need the per-crate error enums much. They encourage us to discard error causes, and require a significant amount of boilerplate.
Instead, I'd rather us propagate errors directly
The problem I see with this approach is that we will return errors that don't mean anything to users. If you look at the breakdown from the ockam_node miro board, tons of errors are because of an internal sender problem. Throwing a tokio error at the user is extremely unergonomic, so we'll want to wrap the error with some amount of context at least.
Similar to decoding issues. serde_bare errors are very obtuse so instead we should wrap them in some context. Down the line, at least in ockam_node
and ockam
I would like to keep some kind of failure journal. Because lots of our errors are red herrings (i.e. tokio failed or bare failed) we need to be able to keep track of other events that happened in the system, for example when a receive fails, we can check the error journal or some other state, see that the node was shut down, and then return a Rejected
instead, etc.
Overall crate-local errors (or even module-local errors) should be considered part of the API, not just an afterthought. And especially not something where we should just propagate errors from layers that our users neither know or care about
so we'll want to wrap the error with some amount of context at least
This is what the Error::context
is for, mostly. The use I imagined was failure/anyhow context chaining like:
some_tokio_func().await.map_err(|e| {
Error::new(Origin::Node, Kind::Io, e)
.context("reason", "some_tokio_func() is upset!")
})
(That said, I don't blame you for not realizing this -- the fact that it's key/value so that stuff like errors from workers could contain which worker type/instance caused the issue probably made this confusing)
Or is there a reason that's not viable?
I think I need to play around with the attached context API a bit more. It's certainly doable. I still think that having a per-API error facade to abstract the most common problems is a good design decision. That being said, the current one is bloated and kinda useless so I'll look into how and what can be improved here
especially not something where we should just propagate errors from layers that our users neither know or care about
I partially agree. I think you should have the underlying error available in some way -- dropping it on the floor because it seems like an implementation detail is not useful behavior for a library.
And I think a lot of them are potentially relevant to to users of the API. For example, serde errors:
Map<K, V>
for if K
is not a string/integer, etc)(The use of BARE makes everything more annoying here since it has basically no ability to provide even remotely useful info...)
I also think tokio errors are a weird case. Tokio errors have a few flavors:
ockam_node
.A lot of these I feel like we are in the habit of just sending to without thought. The very vague hope was that the fact that there's easy anyhow
/failure
/Box<dyn Error>
autoconversion of any error type to ockam_core::Error
would eventually force us to think about this kind of stuff more. (That said I didn't expect this to happen during the initial landing)
I think I need to play around with the attached context API a bit more.
Ah, yeah, I think this is maybe what I mean by needing improved usage examples. It's probably pretty unclear that that's what that part is for. I had planned on integrating it myself, which would help there, but... well, yeah (thank you so much for taking it over though seriously).
That said I think this part of the API does have some real wonkiness:
The context keys are arbitrary strings. This is flexible to the point of being unclear. There should be an enum or something (maybe allowing arbitrary strings for cases not otherwise handled).
The context key should be optional. the most common case for the context in failure/anyhow in my experience is noting what you were doing when the error happened, so it should support that
Definitely need a extension trait adding .context
and .with_context
methods to Result<T, Error>
(which would do nothing for Ok(_)
, but add it to the error).
Assuming we don't need deserialization of these, the context entries should possibly store a &'static Location<'static>
.
Down the line, at least in ockam_node and ockam I would like to keep some kind of failure journal
Hmm... I don't really know what you mean by a failure journal.
Overall crate-local errors (or even module-local errors) should be considered part of the API
Yeah I mean, the need to go through traits really limits flexibility here. Same reason std::io::Error
can be used to round-trip user errors, as well as being a giant catch-all error type which can represent any problem that you can have when doing any sort of interaction with the OS.
I still think that having a per-API error facade to abstract the most common problems is a good design decision
So I'm not against this, for sure. In the direct user-facing API-surface there are very likely pieces where it doesn't really make sense to use something type-erased like ockam_core::Error
.
In the internals I'm unsure. I think we need to be considered about error handling there, and the errors aren't API-facing. If there's specific cases that need to be handled separately it makes sense though... (Hm, I guess I see how something like the Result<Result<_, _>, _>
stuff in that sled blog post might be a better way for some pieces here).
I'd... really like to try and avoid making it too easy/automatic to ignore errors (at least in some of the code).
Ah, okay. I'm pretty sure I have a solution to how to make the design not require different trait bounds on no_std
vs std
(in no_std
mode nothing was convertible into the error trait, because no_std-compatible code tends to not use a fallback for errors and just guards on #[cfg(feature = "std")]
).
no_std
.valuable
-style typeid checks) for a clean fallback for types that do not implement the error trait, solving the rest of the problem. This allows the function signatures to work without differences between no_std
and std
-- presumably embedded is pretty restricted with what it can store. It will also hopefully improve ergonomics on all cases, especially of the context API.
Also, it does turn out that nothing needs error roundtrip serialization (when the initial work was done, it was still needed, but it got removed since then). I had a quick meeting with @spacekookie this morning about this and it also seems that supervisors will not need it either (they need serialization of some data, but no requirement for deserialization)
This enables... major cleanups to the internals and the model. Actually, almost all the complexity in the code (and much of the complexity in the model) came from how painful serialization of some of this data is.
[^1]: We're already using core2::io
as the std::io
fallback, so switching to core2::error
seems fine.
I wrote
Got
Update:
I think we need to take a deeper look at the design of our Errors so that they carry source, backtrace, cause etc. as they propagate within Ockam or surface to a user. We chose the code and domain approach because we'll need to expose them over FFI. But that shouldn't hurt the experience of Rust users. Rust Users should get source, backtrace, cause etc.