Closed. bryanp closed this issue 3 years ago.
Maybe... can you show me a failing test case so I understand exactly what you are thinking of?
So far I haven't been able to write an isolated case that reproduces this. I see it intermittently when doing a streaming view render in my web framework. The most predictable way to reproduce it is to hold down Cmd-R in Safari (I've seen it a couple of times in Chrome as well).
What I'm doing is writing to an Async::HTTP::Body::Writable instance. I'd expect this code to raise an error, but wrapping my call to write in a begin...rescue isn't catching anything. It seems like the error is coming from deeper in the library and not surfacing in this case.
https://github.com/socketry/protocol-http/blob/master/lib/protocol/http/body/readable.rb#L60-L68
Readable#each (which is in your backtrace) propagates the error by calling #close(error). If the body is a writable body, the next time you try to write to it, it will blow up, as per the code you mentioned.
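A minimal sketch of that deferred failure mode. This is a toy model, not the actual library code: the point is only that an error handed to #close surfaces on the *next* write, not at the moment the connection actually broke.

```ruby
# Toy model (not the real Async::HTTP::Body::Writable): close(error)
# stashes the error, and a subsequent write re-raises it.
class ToyWritable
	def initialize
		@chunks = []
		@error = nil
		@closed = false
	end

	# Like Writable#close(error): remember the failure for later.
	def close(error = nil)
		@error ||= error
		@closed = true
	end

	# Writing after a failed close blows up with the stored error.
	def write(chunk)
		raise(@error) if @error
		raise("stream closed") if @closed
		@chunks << chunk
	end
end

body = ToyWritable.new
body.write("hello")
body.close(IOError.new("EPIPE: broken pipe"))

begin
	body.write("world")
rescue IOError => error
	puts "raised on next write: #{error.message}"
end
```

This is why a begin...rescue around an individual write can appear to catch nothing: if no further write happens after the disconnect, the stored error has no call site to surface from.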
https://github.com/socketry/async-http/blob/f967328c81063ecaa570999cbf9a84d51394d54b/lib/async/http/body/writable.rb#L53-L63 shows that in some situations the error is ignored, i.e. if the writable body is already closed/finished. So maybe you are hitting this case: you've already completed writing the response, so there is no way we can propagate the error up.
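To make the "already finished" case concrete, here is a toy model (again, a sketch, not the library's code) showing why the error is effectively dropped once the writer is done:

```ruby
# Toy model: once the body has finished, a later close(error) has no
# pending operation on the writer's side to raise from, so it is dropped.
class FinishedBody
	def initialize
		@finished = false
		@error = nil
	end

	def finish
		@finished = true
	end

	def close(error = nil)
		# After finish, no further write will happen, so keeping the
		# error would accomplish nothing; it gets ignored.
		@error = error unless @finished
	end

	def pending_error
		@error
	end
end

body = FinishedBody.new
body.finish
body.close(IOError.new("EPIPE"))
puts body.pending_error.inspect # nil: the client disconnected after we finished
```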
In that case, is there value in logging the error at all if it isn't actionable to the end user?
It gets logged no matter what because the exception isn't guaranteed to be handled by the body.
I think there is value in logging it because I think it represents abnormal behaviour for an HTTP/1 connection. If the RFC says this is normal acceptable behaviour of a client I'd be willing to ignore it I guess, but I'd need to investigate it.
Yet you are right: there is nothing you can do about it if you are done writing the body but the client disconnects between that point and reading all the data. But if you were writing an HTTP/1 client and you encountered this on the server, it means your client isn't being very graceful, so logging it makes sense.
People have complained about Falcon being noisy (e.g. timeout errors), but I think anything which deviates from the spec (RFCs, for example) should be logged as an error, even more so if nothing can be done at the time the event occurs.
N.B. All Async tasks log unhandled errors by default, unless someone called #wait, in which case the error is not logged by that task but propagates up the task tree (i.e. #wait re-raises the exception).
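The same pattern can be demonstrated with plain Ruby threads, without the async gem: a thread's exception is reported by default, but joining the thread (analogous to #wait) re-raises the exception in the caller instead.

```ruby
# Analogy using core Ruby threads (not the async gem itself):
# Thread#join re-raises a failed thread's exception in the caller,
# much like Task#wait re-raises a task's exception up the task tree.
Thread.report_on_exception = false # silence the default report for this demo

thread = Thread.new do
	raise IOError, "broken pipe"
end

begin
	thread.join # like Task#wait: propagates the failure to whoever waits
rescue IOError => error
	puts "caller saw: #{error.message}"
end
```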
One way to ensure all data is written is to refrain from invoking close yourself. When #each is completed, it calls body.close. So, you could catch this yourself.
```ruby
body.write "blah"

# We don't care any more:
# body.close

# Wait until the underlying stream has consumed everything:
body.flush # Doesn't exist yet.
```
body.flush would wait until the buffer was fully drained before continuing, but would require an additional condition to implement. But you would definitely catch any error if the underlying stream failed for some reason (at least in theory).
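Since #flush doesn't exist in async-http, here is a sketch of the "additional condition" mentioned above, using a plain Mutex and ConditionVariable. All names here are hypothetical illustrations, not library API:

```ruby
# Hypothetical sketch: a buffered writer whose flush blocks until a
# consumer (standing in for the connection) has drained the buffer.
class BufferedWriter
	def initialize
		@buffer = []
		@mutex = Mutex.new
		@drained = ConditionVariable.new
	end

	def write(chunk)
		@mutex.synchronize {@buffer << chunk}
	end

	# Called by the consumer as it drains the buffer:
	def pop
		@mutex.synchronize do
			chunk = @buffer.shift
			@drained.signal if @buffer.empty?
			chunk
		end
	end

	# The proposed behaviour: block until everything written has been consumed.
	def flush
		@mutex.synchronize do
			@drained.wait(@mutex) until @buffer.empty?
		end
	end
end

writer = BufferedWriter.new
writer.write("a")
writer.write("b")

consumer = Thread.new do
	2.times {writer.pop}
end

writer.flush # returns only once the consumer has drained both chunks
consumer.join
```

A real implementation would also need to wake the flush waiter with the stored error if the stream failed mid-drain, which is exactly where the caller would finally see the EPIPE.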
I think I agree it should be logged, but is error the best log level for this? Again, from what I can tell, in my case there's nothing actionable, so including it in the log output adds no value (actually, I would argue it removes value). IMO we could get away with using warn or perhaps an even lower level.
All unhandled exceptions go out as error; it's actually in async, not here. The only way to do what you suggest is to catch the error and log it differently. We'd have to figure out which errors constitute warnings or not. But it's doable, and it's not a stupid idea. The only problem is that when you catch errors like this, you need to be very careful where you catch them. If you are too far up the call chain, you might catch unrelated errors of the same type. What I mean by this is that some other unrelated operation might cause EPIPE.
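A small sketch of that "catch it narrowly" point. The helper name is hypothetical; the idea is to rescue Errno::EPIPE immediately around the write, so an unrelated broken pipe elsewhere in the request isn't swallowed by the same handler:

```ruby
# Hypothetical helper: rescue Errno::EPIPE right at the write site, not
# higher up the call chain where unrelated EPIPEs could be caught too.
def stream_chunk(io, chunk)
	io.write(chunk)
	true
rescue Errno::EPIPE
	# The client went away mid-stream: not actionable here, so the caller
	# could log this as a warning rather than an error.
	false
end

# Ruby ignores SIGPIPE by default, so writing to a pipe whose read end is
# closed raises Errno::EPIPE, simulating a client disconnect:
reader, writer = IO.pipe
reader.close

puts stream_chunk(writer, "hello") # false: broken pipe handled narrowly
writer.close
```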
I agree with everything you say here. How can I best help with this particular case?
I guess we need to investigate where to catch these errors.
Is this still a problem? If so, let's take a closer look at the error propagation.
I haven't seen it recently, so we can probably close this.
When a streaming response is interrupted I get this in the log:
Would it be possible to expose this to the end user to handle themselves?