We recently configured our marathon-lb to put HAProxy in front of our Postgres databases. We started seeing timeouts on the connections between our microservices and our database. After some investigation, we decided to follow the long-lived connection documentation and set timeout server to a large value such as 999999m. Despite that, we kept seeing connections time out after around 1h. I investigated further by sniffing the TCP connection and saw that after exactly 1h HAProxy initiated a FIN. This led me to look deeper into the HAProxy configuration, and I realized that this matched perfectly the default timeout tunnel setting, which is 3600s.
Looking at the HAProxy documentation, I noticed that timeout tunnel is the timeout that actually applies once a non-HTTP TCP connection is established:
The tunnel timeout applies when a bidirectional connection is established
between a client and a server, and the connection remains inactive in both
directions. This timeout supersedes both the client and server timeouts once
the connection becomes a tunnel. In TCP, this timeout is used as soon as no
analyser remains attached to either connection (eg: tcp content rules are
accepted).
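To illustrate the excerpt above, here is a minimal sketch of a plain HAProxy TCP section, not our actual marathon-lb template. The section name (postgres), server name (pg1), address, and the 24h value are all hypothetical and chosen only for illustration. The point it shows is that, once the connection is a tunnel, timeout tunnel takes over from timeout client and timeout server, so raising timeout server alone has no effect on idle Postgres connections.

```
# Hypothetical example: names, address and timeout values are assumptions.
defaults
    mode tcp
    timeout connect 5s
    timeout client  1h
    timeout server  1h

listen postgres
    bind *:5432
    mode tcp
    # Supersedes timeout client / timeout server once the TCP
    # connection becomes a tunnel (idle in both directions).
    timeout tunnel 24h
    server pg1 10.0.0.10:5432 check
```

With only the defaults section in place, an idle connection would still be closed according to whatever timeout tunnel value is in effect (3600s in our case), regardless of how large timeout server is.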
We then changed our config to use timeout tunnel instead of timeout server, and this solved our timeout issues. Therefore I think the wiki should be updated as follows: