Closed xstefanox closed 1 year ago
A good practice is to always set a timeout for everything, so changing this for all the uses is not correct. If you provide a non-default way to set for your case no timeout, it can be accepted, but not unconditionally.
Hi @pedrobaeza,
no, we are not going to remove the timeout for every route, but only for the /events
route.
The reason why I didn't provide an environment variable to override this timeout is because this API call is meant to never terminate by design.
The /events
route is a sort of custom implementation of Server Sent Events made by Docker: the client asks for the streaming of events emitted by Docker and listens to them forever, because an event stream is not meant to terminate unless the server is shutdown or restarted.
If you manually GET /events
from your terminal connecting directly to the Docker daemon, you will find out that your client never terminates; instead, if you connect through the Docker Socket Proxy, the connection is terminated after 10 minutes in a non-clean way, which is an incorrect behaviour for an event stream.
Example output using HTTPie:
❯ http GET http://localhost:2375/events
HTTP/1.1 200 OK
api-version: 1.42
content-type: application/json
date: Wed, 20 Sep 2023 07:00:27 GMT
docker-experimental: false
ostype: linux
server: Docker/23.0.6 (linux)
transfer-encoding: chunked
http: error: ChunkedEncodingError: ("Connection broken: InvalidChunkLength(got length b'', 0 bytes read)", InvalidChunkLength(got length b'', 0 bytes read))
This behaviour causes the crash of some services that expect to consume the Docker events stream, for example the one that I mentioned in my previous post, which terminates with a fatal error and is then automatically restarted exactly every 10 minutes.
I agree with you that setting reasonable network timeouts is a good practice, but for this particular use case, applying a timeout of any value is a nonsense, that's the reason why I'm simply proposing to remove the timeout at all.
OK, thanks for the explanations. Seems reasonable. @yajo as you were the creator of this, do you see it correct?
The current implementation defines a
10m
server timeout for every route exposed by the Docker API.Detected issue
There are some containers that need to listen for Docker events and trigger particular action when an event is received, for example swarm-cronjob.
The
/events
route is meant to stream forever but the timeout imposed by HAProxy is abruptly closing the connection when the timeout is reached.Proposed solution
With this change, HAProxy will be configured with no timeout only for the
/events
route, while preserving the existing configuration for other routes.Implementation details
A dedicated backend with timeout set to
0
is configured in HAProxy; this backend is used to handle only the/events
route, which is matched using the same regular expression already used to choose whether the route is accessible or not depending on the related environment variable.