socketry / async-http

MIT License
298 stars 45 forks source link

Let end users handle failed streams #22

Closed bryanp closed 3 years ago

bryanp commented 5 years ago

When a streaming response is interrupted I get this in the log:

8.46s: <Async::Task:0x3ff946c60f00 incoming connection #<Addrinfo: [::1]:55671 TCP> failed>
      |   Errno::EPIPE: Broken pipe
      |   → <internal:prelude> 132
      |     <internal:prelude> 132
      |     /Users/bryanp/.rbenv/versions/2.5.3/lib/ruby/gems/2.5.0/gems/async-io-1.23.1/lib/async/io/generic.rb:208 in `async_send'
      |     /Users/bryanp/.rbenv/versions/2.5.3/lib/ruby/gems/2.5.0/gems/async-io-1.23.1/lib/async/io/generic.rb:54 in `block in wrap_blocking_method'
      |     /Users/bryanp/.rbenv/versions/2.5.3/lib/ruby/gems/2.5.0/gems/async-io-1.23.1/lib/async/io/generic.rb:148 in `write'
      |     /Users/bryanp/.rbenv/versions/2.5.3/lib/ruby/gems/2.5.0/gems/async-io-1.23.1/lib/async/io/stream.rb:137 in `write'
      |     /Users/bryanp/.rbenv/versions/2.5.3/lib/ruby/gems/2.5.0/gems/protocol-http1-0.7.0/lib/protocol/http1/connection.rb:232 in `block in write_fixed_length_body'
      |     /Users/bryanp/.rbenv/versions/2.5.3/lib/ruby/gems/2.5.0/gems/protocol-http-0.7.0/lib/protocol/http/body/readable.rb:63 in `each'
      |     /Users/bryanp/.rbenv/versions/2.5.3/lib/ruby/gems/2.5.0/gems/protocol-http1-0.7.0/lib/protocol/http1/connection.rb:225 in `write_fixed_length_body'
      |     /Users/bryanp/.rbenv/versions/2.5.3/lib/ruby/gems/2.5.0/gems/protocol-http1-0.7.0/lib/protocol/http1/connection.rb:290 in `write_body'
      |     /Users/bryanp/.rbenv/versions/2.5.3/lib/ruby/gems/2.5.0/gems/async-http-0.42.0/lib/async/http/protocol/http1/server.rb:79 in `each'
      |     /Users/bryanp/.rbenv/versions/2.5.3/lib/ruby/gems/2.5.0/gems/async-http-0.42.0/lib/async/http/server.rb:52 in `accept'
      |     /Users/bryanp/.rbenv/versions/2.5.3/lib/ruby/gems/2.5.0/gems/async-io-1.23.1/lib/async/io/socket.rb:99 in `block in accept_each'
      |     /Users/bryanp/.rbenv/versions/2.5.3/lib/ruby/gems/2.5.0/gems/async-io-1.23.1/lib/async/io/socket.rb:137 in `block in accept'
      |     /Users/bryanp/.rbenv/versions/2.5.3/lib/ruby/gems/2.5.0/gems/async-1.17.1/lib/async/task.rb:204 in `block in make_fiber'

Would it be possible to expose this to the end user to handle themselves?

ioquatix commented 5 years ago

Maybe... can you show me a failing test case so I understand exactly what you are thinking of?

bryanp commented 5 years ago

So far I haven't been able to write an isolated case that reproduces this. I see it intermittently when doing a streaming view render in my web framework. The most predictable way to reproduce is to hold down Cmd-R in Safari (seen it a couple times in Chrome as well).

What I'm doing is writing to a Async::HTTP::Body::Writable instance. I'd expect this code to raise an error, but wrapping my call to write in a begin...rescue isn't catching anything. Seems like the error is coming from deeper in the library and not surfacing in this case.

ioquatix commented 5 years ago

https://github.com/socketry/protocol-http/blob/master/lib/protocol/http/body/readable.rb#L60-L68

Readable#each (which is in your backtrace) propagates the error by calling #close(error). If the body is a writable body, next time you try to write to it, it will blow up as per the code you mentioned.

https://github.com/socketry/async-http/blob/f967328c81063ecaa570999cbf9a84d51394d54b/lib/async/http/body/writable.rb#L53-L63 shows that in some situations, the error is ignored. i.e. if the writable body is already closed/finished. So, maybe you are hitting this case, i.e. you've already completed writing the response, so there is no way we can propagate the error up.

bryanp commented 5 years ago

In that case, is there value in logging the error at all if it isn't actionable to the end user?

ioquatix commented 5 years ago

It gets logged no matter what because the exception isn't guaranteed to be handled by the body.

I think there is value in logging it because I think it represents abnormal behaviour for an HTTP/1 connection. If the RFC says this is normal acceptable behaviour of a client I'd be willing to ignore it I guess, but I'd need to investigate it.

Yet, you are right there is nothing you can do about it if you are done writing the body but the client disconnects between that point and reading all the data. But if you were writing an HTTP/1 client, and you encountered this on the server, it means you aren't being very graceful, so logging it makes sense.

People have complained about Falcon being noisy (e.g. timeout errors) - but I think anything which deviates from the spec (RFCs for example) should be logged as an error, even more so if nothing can be done at the time the event occurs.

ioquatix commented 5 years ago

N.B. All Async tasks log unhandled errors by default, unless someone called #wait in which case the error is not logged by that task but propagates up the task tree (i.e. #wait re-raises the exception).

ioquatix commented 5 years ago

One way to ensure all data is to refrain from invoking close. When #each is completed, it calls body.close. So, you could catch this yourself.

body.write "blah"
# We don't care any more
# body.close

# Wait until underlying stream has consumed everything
body.flush # Doesn't exist yet.

body.flush would wait until buffer was fully drained before continuing, but would require an additional condition to implement. But you would definitely catch any error if the underlying stream failed for some reason (at least in theory).

bryanp commented 5 years ago

I think I agree it should be logged but is error the best log level for this? Again, from what I can tell for my case there's nothing actionable so including it in the log output adds no value (actually I would argue it removes value). IMO we could get away with using warn or perhaps even a lower lower level.

ioquatix commented 5 years ago

All unhandled exceptions go out as error, it's actually in async not here. The only way to do what you suggest is to catch the error and log it differently. We'd have to figure out what errors constitute warnings or not. But, it's doable and it's not a stupid idea. The only problem with this is... when you catch errors like this, you need to be very careful where you catch it. If you are too far up the call chain, you might catch unrelated errors of the same type. What I mean by this, is that some other unrelated operation might cause EPIPE.

bryanp commented 5 years ago

I agree with everything you say here. How can I best help with this particular case?

ioquatix commented 5 years ago

I guess we need to investigate where to catch errors

ioquatix commented 3 years ago

Is this still a problem and if so, let's take a closer look at the error propagation.

bryanp commented 3 years ago

I haven't seen it recently, so we can probably close this.