envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0
24.97k stars 4.81k forks source link

http: want max-connection-duration variant that will only send Connection:close responses and never drain from proxy #34356

Closed jmarantz closed 2 months ago

jmarantz commented 5 months ago

Title: http: want max-connection-duration variant that will only send Connection:close responses and never drain from proxy

Description: Currently, when max-connection-duration is reached for H1, the proxy will send a Connection:close response on the next response. However, if the client does not close the connection within the drain timeout, the conection will be forcibly closed.

This creates a scenario where a client that was idle may try to initiate a new request on the connection while Envoy is closing it. If this request is a POST it is likely not to be retried.

I would like a variant of max-connection-duration that never closes the connection from the server side, and only relies on the client to close the connection on the next Connection:close. This suites the use-case of where we have scaled up the number of Envoys able to handle a request, but the clients are keeping live connections to the original, smaller set of Envoys, and their connections never get redistributed to the new Envoys. In this scenario, if there are no requests, then we don't really have a connetion distribution problem, so it's fine to just leave the connection open until the next request.

[optional Relevant Links:]

Any extra documentation required to understand the issue.

kyessenov commented 5 months ago

If we don't drain the connections, wouldn't this require to also keep the associated config for the listener indefinitely? Drain happens regularly with LDS updates, so if we don't discard old LDS resources, we'll accumulate them in Envoy and leak memory.

jmarantz commented 5 months ago

This would be no worse than the configuration not including max-connection-duration, right?

I'm not suggesting never to drain for any reason; just not to drain based on this new variant of max-connection-duration.

One way to implement this I guess is to have a separate drain-timeout that we use for max-connection-duration, defaulting to the main drain-timeout.

kyessenov commented 5 months ago

I see, so you want to separate max-connection-duration handling from the drain sequence. LDS-driven drain will work as before. That makes sense to me.

krajshiva commented 5 months ago

+1 - Since draining phase can be triggered by multiple triggers (for example: idle timeout) it will make sense to separate the concerns and have MCD specific drain period as explained. This will allow client using http1 connection to terminate it by acting on connection: close header in response, sort of similar to GoAways signalling by server to client in case of http2 drainout phase.

github-actions[bot] commented 4 months ago

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

github-actions[bot] commented 4 months ago

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.