mesosphere / marathon-lb

Marathon-lb is a service discovery & load balancing tool for DC/OS
Apache License 2.0

Possibly wrong "long lived socket" configuration documentation #649

Closed groyoh closed 4 years ago

groyoh commented 4 years ago

We recently configured marathon-lb to put HAProxy in front of our Postgres databases. We started seeing timeouts on connections between our microservices and the database. After some investigation, we followed the long-lived connection documentation and set timeout server to a large value such as 999999m. Despite that, we kept seeing connections time out after around 1h. I investigated further by sniffing the TCP connection and saw that after exactly 1h HAProxy initiated a FIN. This led me to look deeper into HAProxy, and I realized that this perfectly matched the default timeout tunnel configuration, which is 3600s.

Looking at the HAProxy documentation, I noticed that timeout tunnel is the timeout that actually applies once a non-HTTP TCP connection is established:

The tunnel timeout applies when a bidirectional connection is established between a client and a server, and the connection remains inactive in both directions. This timeout supersedes both the client and server timeouts once the connection becomes a tunnel. In TCP, this timeout is used as soon as no analyser remains attached to either connection (eg: tcp content rules are accepted).
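
To make the quoted rule concrete, here is a minimal sketch of how the timeouts interact. It is a simplified model of the documentation text above, not HAProxy's actual implementation; the function name, the dictionary layout, and the client value are hypothetical and chosen only for illustration.

```python
# Simplified model of the quoted rule; NOT HAProxy's real implementation.
# The function name and the dict layout are hypothetical.
def effective_idle_timeout(timeouts, is_tunnel):
    """Return the idle timeout (in seconds) for a connection that is
    inactive in both directions.

    `timeouts` maps "client" and "server" (and optionally "tunnel")
    to a number of seconds.
    """
    # Once the connection is a tunnel, the tunnel timeout supersedes
    # both the client and the server timeouts.
    if is_tunnel and "tunnel" in timeouts:
        return timeouts["tunnel"]
    # Otherwise, assume whichever of client/server expires first wins.
    return min(timeouts["client"], timeouts["server"])


timeouts = {
    "client": 999999 * 60,  # hypothetical value, mirrors the server setting
    "server": 999999 * 60,  # "timeout server 999999m" from this report
    "tunnel": 3600,         # the 3600s tunnel timeout mentioned above
}

print(effective_idle_timeout(timeouts, is_tunnel=True))
```

Under this model, a very large timeout server has no effect on an established TCP tunnel, which is consistent with the FIN observed after exactly 1h.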

We then changed our config to use timeout tunnel instead of timeout server, and this solved our timeout issues. Therefore, I think the wiki should be updated as follows:

{
  "id":"app",
  "labels":{
    "HAPROXY_GROUP":"external",
-    "HAPROXY_0_BACKEND_HEAD":"backend {backend}\n  balance {balance}\n  mode {mode}\n  timeout server 30m\n"
+    "HAPROXY_0_BACKEND_HEAD":"backend {backend}\n  balance {balance}\n  mode {mode}\n  timeout tunnel 30m\n"
  }
}
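
For reference, a small sketch of what the corrected label expands to. It assumes the placeholders are filled with Python-style `str.format` substitution; the backend name, balance algorithm, and mode used here are hypothetical example values, not taken from a real deployment.

```python
# The corrected HAPROXY_0_BACKEND_HEAD template from the label above.
template = (
    "backend {backend}\n"
    "  balance {balance}\n"
    "  mode {mode}\n"
    "  timeout tunnel 30m\n"
)

# Hypothetical values, for illustration only.
rendered = template.format(
    backend="app_10000",
    balance="roundrobin",
    mode="tcp",
)

print(rendered)
```

The rendered backend section then carries timeout tunnel 30m, which is the setting that governs idle, established TCP connections such as long-lived Postgres sessions.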
jkoelker commented 4 years ago

You are absolutely correct, it should be timeout tunnel. I have updated the wiki as suggested. Thanks for taking the time to research and report it!