Open TimWolla opened 3 years ago
You already have "http-request reject" to forcefully close the H2 connection. However I'd advise you against doing that for what you're describing: forcing a client to perform another TLS handshake is going to slow your server down, because TLS handshakes are way more expensive for servers than clients. Maybe you have a specific use case in mind?
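As an illustration of that existing action (the frontend name, stick-table sizing and threshold here are all made up), reject can be gated on a request-rate condition:

```
frontend fe_main
    bind :443 ssl crt /etc/haproxy/site.pem
    # Hypothetical abuse tracking: count requests per source address.
    stick-table type ip size 100k expire 60s store http_req_rate(10s)
    http-request track-sc0 src
    # "http-request reject" closes the connection immediately,
    # before any response is sent.
    http-request reject if { sc_http_req_rate(0) gt 100 }
```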
You already have "http-request reject" to forcefully close the H2 connection.
Oh, that's good to know, but it applies too early. I want to close the connection after successfully sending the response (i.e. effectively disable keep-alive).
However I'd advise you against doing that for what you're describing: forcing a client to perform another TLS handshake is going to slow your server down, because TLS handshakes are way more expensive for servers than clients.
Thanks for the warning. For this specific use case the odds are in my favor, though :smiley:
Maybe you have a specific use case in mind?
Without going into too much detail: I want to protect the backend against a single client (or a small number of clients) tying up all the resources simply by sending a large number of requests to a dynamically generated endpoint over keep-alive connections. For technical reasons the backend serves both dynamically generated responses and static files, and legitimate browser clients sometimes generate a large request rate themselves (a single dynamically generated HTML page plus all the images and JavaScript referenced there). For that reason I can't just send a 429 once the request rate exceeds some threshold; it would impact legitimate traffic (images or JavaScript might fail to load). I also can't determine in HAProxy whether a request will be dynamically generated or not; only the backend knows that.
By forcing the client out of keep-alive when the request rate becomes too high, a regular browser should notice nothing (especially once most of the files are cached), while these rogue clients will need to spend extra resources on TCP and TLS handshakes. Additionally this allows other layers to detect this type of traffic, because new TCP connections are more visible than requests over a single keep-alive connection.
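For the HTTP/1.1 side, the rate-gated close described above could be sketched like this (frontend name, table sizing and threshold are hypothetical):

```
frontend fe_main
    stick-table type ip size 100k expire 60s store http_req_rate(10s)
    http-request track-sc0 src
    # Once a source exceeds the threshold, ask it to reconnect after
    # each response instead of rejecting its requests outright.
    # HTTP/1.1 only; H2 clients would need a graceful GOAWAY instead.
    http-response set-header Connection close if { sc_http_req_rate(0) gt 100 }
```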
I see. Thanks for the context. During the development of the return action we discussed the possibility of implementing the notion of a "class" of response, which would or would not produce a log, be counted as an error, or cause the connection to be closed. We later figured that most of these were already addressable using one more rule or by completing with a deny rule, even though for the long term it would be nice to have that. But I'm seeing that it wouldn't have been sufficient, because in your case it's not a return but a rule that plugs on top of traffic.
In your case, since you want to deliver the response correctly, it's more complicated, because no such code path exists to decide to send a GOAWAY frame after a valid response has been sent. And the only way we currently have to emit a GOAWAY frame is an error (which is what is done when using the reject rule). Maybe we should figure out a way to send a GOAWAY frame with no error, and make sure we don't keep track of it, so that we're not tempted to abort the remaining processing too early. This would correspond to a "graceful connection closure" and could even be used to kill all idle H2 client connections on soft-stop or reload. But this requires some changes to the error tracking code so that we at least know the latest stream we've agreed to handle and ignore the other ones (because GOAWAY is a promise not to handle further streams, so that a client retry is safe).
Maybe we should figure out a way to send a GOAWAY frame with no error
This would be great, yes.
even be used to kill all idle H2 client connections on soft-stop or reload
I thought that a reload would already kill the H2 connections, because otherwise the old worker might never leave. Is that not the case?
On Fri, Nov 20, 2020 at 03:42:53AM -0800, Tim Düsterhus wrote:
even be used to kill all idle H2 client connections on soft-stop or reload
I thought that a reload would already kill the H2 connections, because otherwise the old worker might never leave. Is that not the case?
I thought it was the case but failed to find it in the code. So I'm assuming it's closing on idle timeout.
There are so many things to do at the same time :-(
Willy
The documentation claims that for timeout client and timeout client-fin on H2 we send GOAWAY. Maybe that's a good starting point (assuming that this is true). But yeah, it would be nice to also send it:
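For reference, the two timeouts mentioned above are configured like this (the values are arbitrary examples, not recommendations):

```
defaults
    mode http
    # Idle timeout on the client side; per the docs, on H2 its
    # expiry is signalled to the client with a GOAWAY frame.
    timeout client 30s
    # Shorter timeout applied once a shutdown has already been
    # initiated towards the client.
    timeout client-fin 5s
```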
Sure, the moment we know we want to immediately close the connection, we attempt to send a GOAWAY (and a timeout qualifies as this). Technically speaking, the GOAWAY is currently sent as an error and sets the error condition on the connection, preventing it from processing further streams. This is why, in the current state, it's not compatible with the graceful close that Tim needs.
I agree with the points you noted, except hard-stop-after, since that one is a violent termination to make sure we really quit: there will not be any scheduling anymore to try to send anything.
This feature would also benefit horizontally scaled instances where layer 4 load balancing (ECMP+L4) really only happens once (possibly because the Proxy Team does not control the client code). In the event of a rolling update, or dynamic updates to a large cluster, without a "forced" L4 re-balance the heavy connections would stick to the first set of servers that were "up first", and the instances that got their updates last would be underutilized. To solve this in HTTP/1.1 land we randomly append a Connection: close header to clients, which causes them to open another connection. In our case the cost of leaving the connections pinned to the "first up" server is significantly more expensive than randomly closing connections. Adding this functionality would greatly improve HAProxy's ability to force an L4 re-balance. Today we are able to ask clients to randomize connections, but this doesn't necessarily scale when the Proxy Team doesn't have direct access to the client team or teams. It might seem the easiest solution would be to move the proxy onto the client, however that is not always possible.
Here are the results from our Connection: close header addition:
Yes, I agree. I've discussed my suggestion about how to proceed to kill them on reload, and maybe we'll have this. I'm not saying it will solve your needs, but it could make it easier to deliver some signals to all of them, asking them to perform some cleanup, something that is not possible for now without request traffic.
@MillsyBot How did you implement that random close connection? Would you mind sharing the config?
```
frontend your-awesome-frontend
    http-response set-header Connection close if { rand(10000) lt 1 }
```
Just keep in mind RFC 7230: each proxy hop will strip all Connection headers before forwarding, so you will want to set the header on the proxy hop just before the client.
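In a chained setup that could mean, for example (hypothetical two-tier layout; names, addresses and certificate path are made up):

```
# Edge tier: the hop that talks directly to browsers sets the header.
frontend fe_edge
    bind :443 ssl crt /etc/haproxy/site.pem
    http-response set-header Connection close if { rand(10000) lt 1 }
    default_backend be_inner

# Inner tier: setting the header here would be pointless, since the
# edge hop strips Connection headers per RFC 7230 before forwarding
# the response to the client.
backend be_inner
    server app1 10.0.0.10:8080
```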
Maybe this topic is related to https://github.com/haproxy/haproxy/issues/5 ?
No, that's a different topic. #5 was about hacking the protocol to try to figure out when the peer received the GOAWAY and the connection can be safely closed. Here, instead, we'd just send an advisory GOAWAY (we'd announce 2^31-1 as the last accepted stream) so that the client is encouraged to stop using the connection ASAP and close it. It's not ideal from a TCP perspective either, because it will leave the connection in TIME_WAIT on the client; but on the other hand, if the client wants, it can send a GOAWAY in turn and let us close the connection.
What should haproxy do differently? Which functionality do you think we should add?
I'd like to be able to dynamically and cleanly tear down a client's TCP connection as part of an HTTP response, e.g.
What are you trying to do?
I'd like to force abusive clients to re-establish a new TCP connection and perform a new TLS handshake to slow them down.
Sending a Connection: close header or a GOAWAY frame after sending the response to the current request should not affect legitimate traffic, apart from possibly introducing additional latency due to the reconnections. Specifically, a legitimate user must not see requests failing like they would when I start sending a 429 Too Many Requests. Injecting additional redirects as suggested in the mailing list thread does not play super nicely with POST requests, would require me to track additional state, and is cheaply handled on the client side. See: https://www.mail-archive.com/haproxy@formilux.org/msg38828.html
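For contrast, the 429 approach ruled out above would look something like this (frontend name, table sizing and threshold are hypothetical):

```
frontend fe_main
    stick-table type ip size 100k expire 60s store http_req_rate(10s)
    http-request track-sc0 src
    # This fails individual requests outright, so a legitimate browser
    # that crosses the threshold would see images or scripts fail to load.
    http-request deny deny_status 429 if { sc_http_req_rate(0) gt 100 }
```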
Output of haproxy -vv and uname -a