swift-server / http

:warning: Historical HTTP API - please use https://github.com/swift-server/async-http-client instead
Apache License 2.0

Write Callback Block on Queue vs on Write #36

Open carlbrown opened 7 years ago

carlbrown commented 7 years ago

I'm pulling a comment from @helje5 from #28 over into this issue, because that issue is getting cluttered.

The worst aspect is that this calls the done-CB before the queued write has actually finished! This needs to be passed along the stream; that's the whole point.

My reason for not doing this was two-fold:

  1. I was concerned about side-effects from passing the completion blocks into the Socket code, either in the form of memory leaks or in the form of deadlocks.
  2. I don't see a case where it matters. The best this code can do is hold onto the CB until after the socket write() completes. But, as the documentation for write(2) explicitly says, "A successful return from write() does not make any guarantee that data has been committed," holding onto the CB that long doesn't provide any more reliability or information than if it were called where it is now.

There's no guarantee the connection won't be closed before the bytes that were queued will be written to the socket, but there's no guarantee they're going to make it to the far end of the TCP connection either.

So why does it matter whether the callback happens after the bytes are queued for write() or once write() has completed? The guarantee (or lack thereof) is the same either way. By Occam's Razor, then, I think the simpler design is better.

I might well be missing something here. If so, I'd love to know what it is.

Thanks

helje5 commented 7 years ago

You don't need a done-CB at all if it doesn't provide 'done' functionality. What purpose does it serve in the current setup? None at all - just drop it then ;-)

The reason we added the CB to Johannes' original proposal was to add support for proper back pressure handling, which is somewhat necessary for async-based servers ...

P.S.: When talking about sockets, refer to man 2 send, not write. The "does not make any guarantee that data has been committed" refers to filesystem buffers that may not have been flushed to disk. The semantics of a write on a socket file handle are different.

carlbrown commented 7 years ago

The reason we added the CB to Johannes' original proposal was to add support for proper back pressure handling, which is somewhat necessary for async-based servers ...

I need to understand better how this would work, because even if I waited to call the CB until after send(), I don't understand how that would help. Is there documentation or a book or something you can point me to?

The first indication of congestion we would get (I think) is that the send() call would return fewer bytes sent than requested (in which case we spin up a DispatchSourceWrite to handle the retry). In that case, would the CB just be held until the DispatchSourceWrite was done? Or should there be a "partially written" value for the callback? And then what would the application do with that information?
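For what it's worth, the "hold the CB until the retry loop drains" idea can be sketched in isolation. This is a hedged, simplified sketch with invented names (`writeFully` and `sendOnce` are not this project's API); it shows the completion firing exactly once, only after every byte has been accepted, even across partial sends:

```swift
// Hedged sketch, not this project's actual API. `sendOnce` stands in for a
// non-blocking socket send; it returns how many bytes were accepted this call.
func writeFully(_ bytes: [UInt8],
                sendOnce: (ArraySlice<UInt8>) -> Int,
                completion: (Int) -> Void) {
    var offset = 0
    // In the real server this loop body would run from a DispatchSourceWrite
    // handler each time the socket becomes writable again; a plain loop
    // keeps the control flow visible.
    while offset < bytes.count {
        let sent = sendOnce(bytes[offset...])
        guard sent > 0 else { break }   // real code would wait for writability
        offset += sent
    }
    // The callback fires exactly once, after the whole buffer is queued.
    completion(offset)
}

// Usage: a fake socket that accepts at most 4 bytes per call, so the
// loop needs three rounds to drain a 10-byte payload.
let payload = [UInt8](repeating: 7, count: 10)
var wire: [UInt8] = []
var reported = 0
writeFully(payload, sendOnce: { slice in
    let chunk = slice.prefix(4)
    wire.append(contentsOf: chunk)
    return chunk.count
}, completion: { reported = $0 })
print("wrote \(reported) of \(payload.count) bytes")  // wrote 10 of 10 bytes
```

Under this shape, a "partially written" value for the callback becomes unnecessary: the partial-send handling stays internal, and the application only ever sees "all queued" or an error.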

I'm happy to fix this (or review Pull Requests), but I'm not clear on what would need to happen.

P.S.: When talking about sockets, refer to man 2 send, not write. ...

You are, of course, correct - sorry. That's what I get for writing up a bug while I'm watching a hurricane make landfall live on my TV. (I'm in a safe part of Texas, but I have family along the coast that I'm worried about, so I wasn't fully engaged with what I was writing. My apologies.)

helje5 commented 7 years ago

You want information on back pressure? I suppose it is kinda self-explanatory via its water analogy :-) You can only put more stuff in when you take stuff out. Well, the basic idea is that you suspend a stream of input data until the output stream is ready to take on more data (usually with some additional buffering). Otherwise you can quickly spool up much more data than the backend can actually process (in a timely manner, or before running out of buffer memory). Resources, hm. I like the Stream tutorial; it may explain it, and is a good resource on async I/O in any case.

Doing a quick G search, this also seems to explain it: Node Backpressuring in Streams. This also kinda looks good: Some thoughts on asynchronous API design.

As a very basic example, say you are piping data from a DispatchIO source to the HTTP server output stream. It may look something like this (not working code):

source.onData { data in
  source.suspend()             // stop reading while this chunk is in flight
  httpResponse.write(data) {   // completion runs once the write finishes
    source.resume()            // ready to pull the next chunk
  }
}

(As mentioned, you would usually add buffering. If you have a lot of time, feel free to browse the Noze.io implementation of GCD-based streams ;-) )
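To make the suspend/resume contract concrete, here is a hedged, single-threaded toy with invented names (`ToySource` is neither Noze.io nor this project's API). The point it demonstrates: the source never emits the next chunk until the current write's completion has run, so at most one chunk is ever in flight.

```swift
// Hedged toy model of a suspendable source; all names are invented.
final class ToySource {
    private var pending: [String]
    private var suspended = false
    private var handler: ((String) -> Void)?

    init(chunks: [String]) { self.pending = chunks }

    func onData(_ handler: @escaping (String) -> Void) {
        self.handler = handler
        pump()
    }
    func suspend() { suspended = true }
    func resume() { suspended = false; pump() }

    // Deliver chunks only while not suspended.
    private func pump() {
        while !suspended, !pending.isEmpty, let h = handler {
            h(pending.removeFirst())
        }
    }
}

var written: [String] = []
var inFlight = 0        // chunks handed to the fake socket, not yet flushed
var maxInFlight = 0

let source = ToySource(chunks: ["a", "b", "c"])
source.onData { chunk in
    source.suspend()            // stop pulling until this write completes
    inFlight += 1
    maxInFlight = max(maxInFlight, inFlight)
    // Pretend the write completes synchronously and runs its callback:
    written.append(chunk)
    inFlight -= 1
    source.resume()             // back pressure released; pull the next chunk
}
print(written, maxInFlight)     // ["a", "b", "c"] 1
```

The same shape carries over when the write completion is asynchronous; the only requirement is that resume() is called from the completion callback, never before it.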

carlbrown commented 7 years ago

So the idea is just that the application has the opportunity to pause until after the socket send() has completed?

Ok, so let me think out loud here:

So the pathological case is that we have a server with a really large receive bandwidth and a small send bandwidth.

If I've got that right, it makes sense to me, and it's worth risking potential memory leaks to support.

I'll take a pass at this and see what I can do.

Thanks for the feedback. I hadn't thought the scenario all the way through.

helje5 commented 7 years ago

So the pathological case is that we have a server with a really large receive bandwidth and a small send bandwidth

Yeah, this is a very common scenario; the most obvious example is just serving a plain file (though that has other optimisations). The outbound side is not always, but quite often, slower than the producer (the application, a back-office db/cache server, etc.).

carlbrown commented 7 years ago

Status update: I did a naïve implementation of this at https://github.com/carlbrown/http/commit/5d503a3e9166956fdd4cd7774b8aa3fadb236432 and it didn't work - I got duplicated bytes in the output, so the completionBlock was getting called more than once.

I started trying to troubleshoot it, but I kept needing to make temporary debugging (mostly logging) changes to BlueSocket, and its complicated design was slowing me way down. So I switched gears and ripped it out, leading to #39. (To be clear: I don't think BlueSocket is the problem; it was just making the actual problem harder to find than I was willing to tolerate.)

Once #39 is merged, I'll revisit this.