DataDog / datadog-process-agent

Datadog Process Agent
https://datadoghq.com

Reduce TCP expiry time for connection tracking #304

Closed sunhay closed 5 years ago

sunhay commented 5 years ago

10 minutes is a long time to hold onto a dead connection, especially for nodes with high connection churn (e.g. load balancers).

We're still investigating why we occasionally miss tcp_close events on some nodes, but in the meantime this will relieve much of the memory pressure that the resulting stale connections add.

On one node with high memory usage that we've been tracking, this will bring the number of stored connections down from ~45k to ~10k.

The downside is that if we want to track live connections that are discarded by an application (e.g. leaked connections), we will only have two minutes' worth of data on them.