proposal: net/http: configurable shutdown idle timeout for Server

dany74q commented 3 days ago

Proposal Details

Currently, when gracefully terminating an http1 server using http.Server#Shutdown(), it will:

Close all listeners
Call user-supplied shutdown hooks in goroutines (w/o waiting for them to finish - https://github.com/golang/go/issues/32116)
Treat active connections that didn't receive request headers for 5 seconds - as idle (https://github.com/golang/go/issues/22682, https://github.com/golang/go/issues/59037)
Close idle connections
Loop from 2 until all connections are idle and closed

I'd like to hear the community's thoughts about minor changes to the points above, which I believe could help implement a safer graceful termination flow:

1. User-supplied hooks

Would it make more sense to wait for the hooks before exiting Shutdown() ? otherwise we don't guarantee the hooks will execute before Shutdown() is finished, and potentially the process is terminated.

Proposal

Manage hooks in waitgroup in http.Server#Shutdown()
Wait for waitgroup to finish before returning

2. Hard-coded 5 seconds for treating active connections as idle

The code mentions:

// Issue 22682: treat StateNew connections as if
// they're idle if we haven't read the first request's
// header in over 5 seconds.

Proposal

Same logic as today, but utilize ReadHeaderTimeout and fall-back to the default of 5 seconds if it isn't set

3. Closing idle connections

For http1, when connections are reused, they are marked as idle as soon as a response is served; this means that reused connections can toggle rapidly between active / idle when serving requests, and get shutdown if caught idle in the loop above.

I think we should take a small nuance into account for graceful termination - a client might just be before re-using an idle connection, while the server closes it, leading to an EOF on the client side; for non idempotent requests (say, POST), this wouldn't be retried automatically by clients - the request is usually dropped, even if there's a new server already accepting requests (say, a version roll of a server, gracefully terminating the former).

This is the nature of http1, of course. I would like to propose a minor change that would help servers better control this scenario, as I've found, could appear in critical applications (e.g. security-enforcing k8s admission controllers)

Proposal

Allow controlling the window of - when an idle connection is treated as idle - by introducing a time.Duration, where only connections that had been idle for at least that long will be closed (similar to IdleTimeout, but for closing)

Introduce a new field to http.Server - CloseIdleMinDuration
In closeIdleConns, get the connection state change time from conn#getState
Close the connection only if CloseIdleMinDuration had passed since the state's unixSec
The loop continues until it's done or the ctx is cancelled

With this, the server can configure that, say, it would only close connections that were idle for 30 seconds - increasing the odds that the connection is truly idle, and is not about to be active in a brief moment.

This has similar, but different semantics to IdleTimeout, which would close connections at runtime - as it would only scope it to the termination period; this could make more sense for http1, as we can never fully guarantee a connection is not just about to be re-used.

Appreciate your Time !

golang-nuts discussion: https://groups.google.com/g/golang-nuts/c/G6Ct8kzZiRU

gabyhelp commented 3 days ago

Related Issues and Documentation

_{(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)}

seankhliao commented 3 days ago

RegisterOnShutdown is documented as: "This can be used to gracefully shutdown connections that have undergone ALPN protocol upgrade or that have been hijacked.". If there's no more connections and the shutdown proceeds, then there's nothing for the hook to do anyway.

seankhliao commented 3 days ago

Also, please include proposed API changes in your proposal.

dany74q commented 3 days ago

Thanks Sean !

Valid point about the hooks - I thought it might make sense for Shutdown() to wait on all hooks to finish before returning, just in-case the caller set-up hooks which outlive the server itself; the doc is formulated well though, so this is probably a moot point.

I've edited the original message to be more concrete on the proposals.

golang / go