CrowdHailer / Ace

HTTP web server and client, supports http1 and http2
https://hex.pm/packages/ace
MIT License

There's no request backpressure #114

Closed nietaki closed 6 years ago

nietaki commented 6 years ago

I assumed that Ace, when you use handle_data, applies backpressure - that it doesn't read off the socket until each data chunk has been handled (using :gen_tcp's :passive or {:active, :once} modes), give or take some buffer.

To my surprise, taking any time to handle the data chunks causes a message queue to build up in the handler process.

A sample module to reproduce the issue:

defmodule FileBeam.WWW.Upload do
  use Raxx.Server

  @impl Raxx.Server
  def handle_head(request = %{path: _path}, _state) do
    IO.inspect(request)
    {[], :state}
  end

  @impl Raxx.Server
  def handle_data(chunk, _state) do
    IO.puts("received chunk")
    IO.puts("chunk size: #{byte_size(chunk) / 1024} KB")
    Process.sleep(100)
    {[], :state}
  end

  @impl Raxx.Server
  def handle_tail(_trailers, _state) do
    response(:no_content)
  end
end

The end result is the aforementioned message buildup, as shown in this screenshot from wobserver:

[screenshot from wobserver, 2018-08-23: the handler process with a large message queue]

When I was testing it with some local files, the whole file seemed to get uploaded as quickly as the connection allowed, but the 204 wasn't received until all the data chunks had been processed.

I'd be happy to work on a fix, but I haven't had the chance to look into the implementation much, so I'd appreciate being pointed in the right direction.

CrowdHailer commented 6 years ago

Interesting. I think the server does make use of active: :once. However, it looks like backpressure is then still needed when sending the message on to the worker.

https://github.com/CrowdHailer/Ace/blob/master/lib/ace/http1/endpoint.ex#L53-L57

Essentially the worker needs to ack the server process when it is ready to accept more message parts. However, because there is already backpressure when the worker sends messages to the server, you can't use a GenServer.call: calls in both directions between two processes can deadlock.
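
A minimal sketch of what a deadlock-free ack could look like (module and field names here are illustrative, not Ace's actual API): the worker signals readiness with a plain fire-and-forget message rather than a call, so neither process ever blocks waiting on the other.

```elixir
# Illustrative sketch only - not Ace's implementation.
defmodule Sketch.Worker do
  # After handling a request part, tell the endpoint we are ready for
  # the next one. send/2 never blocks, so no deadlock is possible even
  # though the endpoint also sends messages to this worker.
  def handle_data(chunk, state = %{endpoint: endpoint}) do
    do_some_slow_processing(chunk)
    send(endpoint, {:ack, self()})
    {[], state}
  end

  defp do_some_slow_processing(_chunk), do: Process.sleep(100)
end
```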

Also did you want to send the 204 before all the data was received?

Finally, was this an HTTP/1 or HTTP/2 endpoint? I can't tell, because the worker is the same in both.

nietaki commented 6 years ago

Forgot to mention, this was HTTP/1.

Also did you want to send the 204 before all the data was received?

Sorry, I didn't mean to leave that impression; I was just trying to paint a picture of what was happening. What I'm trying to do is some processing as the file is being uploaded, and to receive only as fast as the file is being processed. I want to send the 204 after the whole file is uploaded and processed.

Essentially the worker needs to ack the server process when it is ready to accept more message parts. However, because there is already backpressure when the worker sends messages to the server, you can't use a GenServer.call: calls in both directions between two processes can deadlock.

I see. I don't think doing it the naive way (sticking a receive in a loop before https://github.com/CrowdHailer/Ace/blob/master/lib/ace/http1/endpoint.ex#L59 ) would be robust enough - it would probably need to be another handle_info that sets the socket back to active. I'll play around with it when I have the chance, see if I can come up with a reasonably clean approach, and see how it would fit in with the HTTP/2 side of things - I haven't looked at that yet.
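
A sketch of what I mean by the handle_info approach (module and message names are assumed, not Ace's real layout): the endpoint leaves the socket passive after forwarding a chunk, and only re-arms {:active, :once} when the worker's ack arrives, so the endpoint itself never blocks in a receive.

```elixir
# Hypothetical sketch - names and message shapes are assumptions.
defmodule Sketch.Endpoint do
  use GenServer

  def init({socket, worker}) do
    :ok = :inet.setopts(socket, active: :once)
    {:ok, %{socket: socket, worker: worker}}
  end

  # A TCP packet arrives; forward it to the worker and stay passive.
  # With {:active, :once} no further :tcp messages arrive until we
  # explicitly re-arm the socket, which is the backpressure.
  def handle_info({:tcp, socket, packet}, state = %{socket: socket, worker: worker}) do
    send(worker, {:data, packet})
    {:noreply, state}
  end

  # The worker is ready for more; re-arm the socket for one more packet.
  def handle_info({:ack, worker}, state = %{socket: socket, worker: worker}) do
    :ok = :inet.setopts(socket, active: :once)
    {:noreply, state}
  end
end
```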

BTW, I'm not entirely clear on what the abstraction is - what the exact responsibilities of Channel, Endpoint, Server and Worker are supposed to be.

CrowdHailer commented 6 years ago

Endpoint and Server are the same process. Endpoint is just code that is useful to both a server and a client.

Worker is the abstraction that deals with translating one request to one response. It is the same process regardless of whether the server is for an HTTP/1 or HTTP/2 connection.

A channel is just a reference the worker has back to the connection. It contains the pid of the server process, plus the stream identifier in the HTTP/2 case. The idea is that the worker can send data to the channel without knowing whether it is backed by an HTTP/1 or HTTP/2 endpoint/server.
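
So, roughly, something like this (a toy illustration of the idea; struct and field names are mine, not Ace's):

```elixir
# Toy illustration of the channel idea - not Ace's actual struct.
defmodule Sketch.Channel do
  # For HTTP/1 the stream_id can be ignored; for HTTP/2 it picks out
  # the stream on the shared connection.
  defstruct [:endpoint_pid, :stream_id]

  # The worker sends parts to the channel without knowing whether the
  # endpoint on the other side speaks HTTP/1 or HTTP/2.
  def send_part(%__MODULE__{endpoint_pid: pid, stream_id: id}, part) do
    send(pid, {:part, id, part})
    :ok
  end
end
```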

nietaki commented 6 years ago

Thanks for the clarification, it's very helpful.

After looking into it some more, I don't think this kind of backpressure would make much sense in the context of HTTP/2 - the client should be able to initiate a new stream even if all the current ones haven't caught up with processing, so we can't block on TCP in the same way.

I'm thinking of having the Worker send the acks to the Endpoint process conditionally, only if the negotiated protocol isn't HTTP/2. The Endpoint will stop reading from TCP if the number of request chunks not yet handled by the worker is >= a certain buffer/message-queue size (1 or more).
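
The bookkeeping on the Endpoint side could look something like this (a sketch of the proposal, with assumed names; not a patch against Ace):

```elixir
# Sketch of the proposed buffering scheme - names are illustrative.
defmodule Sketch.Buffer do
  # How many forwarded-but-unacked chunks we tolerate before we stop
  # reading from the socket. 1 gives the tightest backpressure.
  @buffer_size 1

  def forwarded(state), do: update_in(state.unacked, &(&1 + 1))
  def acked(state), do: update_in(state.unacked, &(&1 - 1))

  # Only re-arm {:active, :once} while the worker has capacity.
  def maybe_activate(state = %{unacked: n, socket: socket}) when n < @buffer_size do
    :ok = :inet.setopts(socket, active: :once)
    state
  end

  def maybe_activate(state), do: state
end
```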

Let me know what you think or brace for a PR ;)

CrowdHailer commented 6 years ago

HTTP/2 has a concept of backpressure (flow control) in the spec: https://tools.ietf.org/html/rfc7540#section-5.2. In the future this should probably be exposed in the API somehow, though I have no idea how yet.
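
For reference, that flow control is window accounting: each side starts with a 65,535-octet window per stream (and per connection), every DATA frame shrinks it, and WINDOW_UPDATE frames grow it again. A toy sketch of the receiver-side bookkeeping (not tied to Ace's code):

```elixir
# Toy model of HTTP/2 receive-side flow control (RFC 7540, sections
# 5.2 and 6.9) - illustrative only.
defmodule Sketch.FlowControl do
  @initial_window 65_535

  def new, do: %{window: @initial_window}

  # A DATA frame consumes window; a peer that overshoots the window
  # has committed a flow control error.
  def data_received(%{window: w}, size) when size <= w, do: {:ok, %{window: w - size}}
  def data_received(_state, _size), do: {:error, :flow_control_error}

  # Sending WINDOW_UPDATE(increment) re-opens the window. Doing this
  # only after the application has consumed the data is exactly the
  # backpressure hook an API could expose.
  def consumed(%{window: w}, increment), do: %{window: w + increment}
end
```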

Probably best to start with the simplest thing, which is just an ack, and add more as needed. I will stay braced for the PR.