Open annagrin opened 3 years ago
I think I buy that. Thoughts @natebosch @jakemac53 @lrhn ?
@natebosch looked at pulling out the error handling to do something different during the null safety push and we never got anywhere.
Might be worth revisiting, though...
Do we know what the rationale for the existing behavior was? It does seem problematic to me, but it also seems like somebody went to a fair bit of effort to make it behave that way?
@nex3 had good reasons, I'm sure – perhaps she can shed some light?
Yeah the current behavior is magic and confusing - the hidden interactions with error zones does cause problems.
https://github.com/dart-lang/shelf/issues/2#issuecomment-742081585
My preference is to remove anything related to error zones in this package, but we'd need to be careful rolling it out.
The OTHER idea.
Chatting with a community member about this.
Deprecate shelf_io.dart
and create another, more opinionated package to handle the "wire up the server" bits.
I am good with anything that does not change behavior if a top level handler is added, we would need to update a bunch of existing libraries in that case though. My current case - ddr has a top level handler that has been added to collect crash metrics, but it runs a bunch of http servers via libraries such as dds, dwds, devtools, build daemon etc.
If I recall correctly, the idea behind the current error handling was to be a backstop in the case where the user hadn't thought about error handling at all. Crashing the server on an error is about the worst possible behavior, so the current behavior says "if the server would otherwise crash, log an error instead, otherwise leave it up to the user". The expectation was that most servers would do more thorough error-handling, either by adding middleware that captured errors or by having some app-wide error handler (that is to say, running the server in a pre-existing error zone).
I think the root issue here may be that it wasn't clear enough that users should set up their own explicit error-handling. You could solve this in a non-breaking way by adding a message to the default error log saying "set up better error handling", and possibly also providing a built-in error-handling middleware.
Unfortunately it introduced the opposite behavior from the desired one in this case - if all serveRequests calls in all the libraries are used unwrapped and then someone adds a top level wrapper, server starts effectively crashing, because the top level zone was just added as a way to catch anything, print in nicely to the screen and exit. So it looks like the only safe way to use this API in a library is by wrapping it in runZoneGuarded with error handling, because there is always a chance for that top wrapper addition in the calling code later.
So it looks to me that the current API is incomplete, because it does not require error handling.
Maybe we can introduce another one that does not change behavior depending on calling code?
We can make it very similar to serveRequests
but add an error handler that is passed to the zone that serveRequests
unconditionally runs in. We can set the default error handler to do what it currently does - print 'Async error' and exit.
Can even eventually mark the old one as deprecated if needed.
@nex3
You could solve this in a non-breaking way by adding a message to the default error log saying "set up better error handling", and possibly also providing a built-in error-handling middleware.
This would be great, but in this case we need to inform the user where the error came from. The errors are asynchronous and look like "SocketException: OSError: broken pipe" with an unrelated stack at best.
Maybe we can remember the stack at the call to serveRequests
and add it to the exception to propagate to the lower runZoneGuarded
?
There is also a question how to handle those errors coming from other libraries, where running an API that starts a server in runZoneGuarded
is too late to properly recover from broken socket errors.
This would be great, but in this case we need to inform the user where the error came from. The errors are asynchronous and look like "SocketException: OSError: broken pipe" with an unrelated stack at best.
Are you using stack chains in your global error handler? That can go a long way towards helping to clarify the real source of an error.
The trick here is that you want to balance three things:
Are you using stack chains in your global error handler?
Thanks, I wasn't, added them to wrap serveRequests
call.
Agree with all 3 items!
Btw, what would be the best default error handling in the case of serveRequests
? Does always ignoring them work for both the server and the client? We have hard to reproduce socket exceptions happening intermittently, and inducing them manually in the socket code and always ignoring them in the runZonedGuarded wrapper sometimes leaves the server unresponsive.
I'd say ignoring errors is almost always the wrong behavior. I'd say either always logging, or always throwing a nicely-formatted error is probably the best option.
Makes sense, I meant logging and ignoring in this case. I wonder if there is any way to filter errors that are likely to render the server unusable in which case we probably should throw.
Linked all cases that can potentially contribute to ddr crashes.
Fix I used in dwds and webdev:
/// Handles [requests] using [handler].
///
/// Captures all sync and async stack error traces and passes
/// them to the [onError] handler.
void serveHttpRequests(Stream<HttpRequest> requests, Handler handler,
void Function(Object, StackTrace) onError) {
return Chain.capture(() {
serveRequests(requests, handler);
}, onError: onError);
}
Should we just add a boolean parameter here that forces it it handle and log all async errors instead of only doing so if there is no existing custom error handling zone? forceHandleAndLogUncaughtErrors
or maybe something a bit more concise?
@jakemac53 how about the boolean flag forceHandleAndLogUncaughtErrors
and a custom error handler, onError(e,s)
?
Sketch out a PR and we can review?
On Mon, Oct 4, 2021 at 4:30 PM Anna Gringauze @.***> wrote:
@jakemac53 https://github.com/jakemac53 how about the boolean flag forceHandleAndLogUncaughtErrors and a custom error handler, onError(e,s)?
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/dart-lang/shelf/issues/202#issuecomment-933933123, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAEFCVAKWPHQ7E2XD4D373UFI2IVANCNFSM5E6HYHSA .
@jakemac53 how about the boolean flag
forceHandleAndLogUncaughtErrors
and a custom error handler,onError(e,s)
?
Or possibly we could just add the onError
handler parameter, and if that is passed then we assume you want to handle all errors?
even better! I will submit a PR
Sent a sketch. Let me know what you think!
serveRequests
does not always wrap listening to http requests in arunZoneGuarded
wrap:https://github.com/dart-lang/shelf/blob/6e38575985a8094f4ac172228a6711196d9ae91e/lib/shelf_io.dart#L73 https://github.com/dart-lang/shelf/blob/6e38575985a8094f4ac172228a6711196d9ae91e/lib/src/util.dart#L16
This leads to hard to debug async exceptions later if something up its call chain adds its own
runZoneGuarded
wrapper - in that case exceptions stop being caught and ignored insideserveRequests
and fall to the next root zone exception handler, which unexpectedly changes to logic of the calling code. Note that adding such top level handler will change behavior of allserveRequests
calls, including the ones coming from imported libraries, which makes fixing those new issues particularly complicated.Can we always run code inside
serveRequests
in arunZoneGuarded
, and provide it with an error handler instead for the cases where error behavior needs to be controlled by the user?