bottlerocket-os / bottlerocket

Conntrack Limit Not Applied Due to Kube-Proxy Config Precedence in Bottlerocket on EKS #4221

Open bryanhsu00 opened 1 day ago

Platform I'm building on: v1.23.0

What I expected to happen: I attempted to increase the conntrack limit by specifying settings.kernel.sysctl in userdata, as described in the Bottlerocket documentation.

I also followed the instructions in the Bottlerocket EKS Quickstart Guide to stop kube-proxy from modifying conntrack settings. I expected the conntrack limit to be set to the value specified in userdata.

What actually happened:

This approach did not work because, in EKS, kube-proxy is started with a default kube-proxy-config file, and the values in that file take precedence over kube-proxy's command line parameters.

containers:
  - command:
    - kube-proxy
    - --v=2
    - --config=/var/lib/kube-proxy-config/config  # (This takes precedence)
    - --conntrack-max-per-core=0
    - --conntrack-min=0

How to reproduce the problem:

  1. Set the following userdata:
[settings.kernel.sysctl]
    "net.netfilter.nf_conntrack_max" = "1000000"
  2. Modify the kube-proxy deployment YAML to include the following parameters:
containers:
  - command:
    - kube-proxy
    - --v=2
    - --config=/var/lib/kube-proxy-config/config
    - --conntrack-max-per-core=0
    - --conntrack-min=0
  3. Create a new node, log in to it, and run "cat /proc/sys/net/netfilter/nf_conntrack_max" to check the actual nf_conntrack_max value. In my case the node is a c6g.xlarge and the value was 131072:
[ssm-user@control]$ cat /proc/sys/net/netfilter/nf_conntrack_max
131072
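
For reference, the observed 131072 is consistent with the config file winning over the flags. The sketch below is a rough Python model of that behavior, not kube-proxy's actual code. It assumes the EKS default kube-proxy-config carries the upstream kube-proxy defaults (maxPerCore 32768, min 131072), that a c6g.xlarge exposes 4 vCPUs, and that kube-proxy sets the limit to the larger of maxPerCore × cores and min, leaving the kernel value alone when maxPerCore is 0. The function and dictionary names are hypothetical.

```python
from typing import Optional

# Upstream kube-proxy defaults; assumed here to match the EKS default kube-proxy-config.
DEFAULT_MAX_PER_CORE = 32768
DEFAULT_MIN = 131072


def resolve_conntrack_settings(flags: dict, config_file: Optional[dict]) -> dict:
    """Model the precedence described in this issue: when --config is given,
    its conntrack values are used and the command line flags are ignored."""
    source = config_file if config_file is not None else flags
    return {
        "max_per_core": source.get("max_per_core", DEFAULT_MAX_PER_CORE),
        "min": source.get("min", DEFAULT_MIN),
    }


def effective_conntrack_max(max_per_core: int, minimum: int, cpus: int) -> Optional[int]:
    """Return the value kube-proxy would write to nf_conntrack_max, or None
    when max_per_core is 0 and the kernel setting is left untouched."""
    if max_per_core <= 0:
        return None
    return max(max_per_core * cpus, minimum)


# Flags from the reproduction steps, and the assumed default config file values.
flags = {"max_per_core": 0, "min": 0}
eks_default_config = {"max_per_core": DEFAULT_MAX_PER_CORE, "min": DEFAULT_MIN}

settings = resolve_conntrack_settings(flags, eks_default_config)
print(effective_conntrack_max(settings["max_per_core"], settings["min"], cpus=4))  # 131072
```

With the config file out of the picture (config_file=None), the same flags resolve to maxPerCore 0 and the function returns None, which is the case where the sysctl value from userdata would survive.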

Solution:

Updating the kube-proxy-config ConfigMap resolves the issue. Here's an example of how to set the correct conntrack values:

conntrack:
  maxPerCore: 0
  min: 0
  tcpCloseWaitTimeout: 1h0m0s
  tcpEstablishedTimeout: 24h0m0s
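
Once the ConfigMap change is in effect on a node, the result can be checked the same way as in the reproduction steps. The following is a minimal Python sketch of that check, assuming it runs on the node itself and that the expected value is the 1000000 from the userdata above. The helper names are hypothetical.

```python
from pathlib import Path

# Same file that was read with cat in the reproduction steps.
CONNTRACK_MAX_PATH = "/proc/sys/net/netfilter/nf_conntrack_max"


def read_conntrack_max(path: str = CONNTRACK_MAX_PATH) -> int:
    """Read the current nf_conntrack_max value as an integer."""
    return int(Path(path).read_text().strip())


def conntrack_max_matches(expected: int, path: str = CONNTRACK_MAX_PATH) -> bool:
    """Return True when the kernel value equals the value set in userdata."""
    return read_conntrack_max(path) == expected
```

On a node where the userdata value was applied, conntrack_max_matches(1000000) should return True. On the node from the reproduction steps it would return False, since the file contained 131072.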

I suggest updating the documentation to instruct users to modify the kube-proxy-config ConfigMap rather than relying on command line parameters for kube-proxy.