aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

[EKS] [request]: allow to configure ipvs kube-proxy mode #142

Closed dawidmalina closed 1 year ago

dawidmalina commented 5 years ago

Tell us about your request It would be nice to allow switching kube-proxy from the default iptables mode to ipvs mode.

Which service(s) is this request for? EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard? Better load balancing across all pods backing a particular service.
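
For reference, upstream kube-proxy already exposes this choice via flags; a minimal sketch (the rr scheduler shown is just an example):

kube-proxy --proxy-mode=ipvs --ipvs-scheduler=rr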

voxxit commented 5 years ago

Any update on this one?

tabern commented 5 years ago

Hi - we haven't adopted this in EKS, as IPVS is GA but not yet the default for Kubernetes. We are researching how to enable this with EKS.

whereisaaron commented 5 years ago

I'd like to have this, sure, but I'd prefer to wait until it is the k8s default. I guess it could be an experimental option, but I am not sure AWS wants to support both modes 😄

Even in k8s 1.14 IPVS is not the default (I think?) and people are still smoothing over the rough edges of IPVS for the things that it breaks or doesn't support, e.g.

lbernail commented 5 years ago

For what it's worth, we have been running IPVS on large clusters (managed by ourselves) for about a year without problems, except for graceful termination: we kept using a version without it because it was a bit unstable. We have fixed a few issues with graceful termination in the last few months and it seems almost ready now.

A few important PRs haven't been merged yet but should be soon:

If you test it and find any issue, let us know

arzarif commented 4 years ago

Not sure if this should be mentioned here or whether it warrants a separate request - would it make sense to optionally allow disabling the deployment of kube-proxy altogether?

The justification would be that kube-proxy itself is not a strict requirement with certain configurations. It would mirror the steps kubeadm took to make kube-proxy's deployment optional (for similar reasons).

tabern commented 4 years ago

@arzarif that's a topic I think that warrants a separate thread. Can you open a new issue?

SankarGopal77 commented 4 years ago

Hi, could you please let me know if EKS allows switching from iptables to ipvs as described in this blog, or if work is still in progress?

https://medium.com/@jeremy.i.cowan/the-problem-with-kube-proxy-enabling-ipvs-on-eks-169ac22e237e

anoop2811 commented 4 years ago

Hi, is this coming anytime soon?

chris530 commented 4 years ago

Nice blog @SankarGopal77 !

yuhusn commented 4 years ago

Hi, is there any progress?

ntkjer commented 4 years ago

+1 on this issue

ntkjer commented 4 years ago

Hi, could you please let me know if EKS allows switching from iptables to ipvs as described in this blog, or if work is still in progress?

https://medium.com/@jeremy.i.cowan/the-problem-with-kube-proxy-enabling-ipvs-on-eks-169ac22e237e

As mentioned in the article, this is already supported by replacing the following properties in each worker node's user-data script:

#cloud-config
packages:
 - ipvsadm
runcmd:
 # Load the IPVS kernel modules and schedulers before kube-proxy starts
 - sudo modprobe ip_vs
 - sudo modprobe ip_vs_rr
 - sudo modprobe ip_vs_wrr
 - sudo modprobe ip_vs_sh
 - sudo modprobe nf_conntrack_ipv4
 - /var/lib/cloud/scripts/per-instance/bootstrap.al2.sh
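
To confirm the modules actually loaded on a node, a quick check with standard tools:

# Both should succeed once the modules are in place
lsmod | grep ip_vs
sudo ipvsadm -L -n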

And in the kube-proxy DaemonSet:

containers:
  - command:
    - /bin/sh
    - -c
    - kube-proxy --v=2 --kubeconfig=/var/lib/kube-proxy/kubeconfig --proxy-mode=ipvs --ipvs-scheduler=sed
    image: 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/kube-proxy:v1.13.8

I believe this issue is suggesting IPVS as the default, as opposed to iptables, which can cause contention on large-scale production systems.

selfieblue commented 4 years ago

We are waiting for IPVS. It's quite clear that this is important, but we have not gotten any reply or support from the EKS team. That's sad for your customers.

rubroboletus commented 4 years ago

Is there any progress? We also need to use IPVS.

canhnt commented 4 years ago

FYI: In our case, we enable the IPVS modules directly via a UserData setting on top of the latest AMI.

tschirmer commented 4 years ago

I've submitted this to the EKS AMI team. If it can be included in the base EKS AMI it'd be a matter of enabling it.

https://github.com/awslabs/amazon-eks-ami/issues/546

bitmexgmarkey commented 3 years ago

As mentioned in the article, this is already supported by replacing the following properties in each worker node's user-data script:

In my case this was not sufficient, as the kube-proxy-config ConfigMap in kube-system took priority over the command-line arguments.

apiVersion: v1
data:
  config: |-
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    bindAddress: 0.0.0.0
    clientConnection:
      acceptContentTypes: ""
      burst: 10
      contentType: application/vnd.kubernetes.protobuf
      kubeconfig: /var/lib/kube-proxy/kubeconfig
      qps: 5
    clusterCIDR: ""
    configSyncPeriod: 15m0s
    conntrack:
      # max: 0 <-- I comment out this line as it causes a syntax error when the kube-proxy is loading; maybe caused by an upgrade to 1.18?
      maxPerCore: 32768
      min: 131072
      tcpCloseWaitTimeout: 1h0m0s
      tcpEstablishedTimeout: 24h0m0s
    enableProfiling: false
    healthzBindAddress: 0.0.0.0:10256
    hostnameOverride: ""
    iptables:
      masqueradeAll: false
      masqueradeBit: 14
      minSyncPeriod: 0s
      syncPeriod: 30s
    ipvs:
      excludeCIDRs: null
      minSyncPeriod: 0s
      scheduler: "lc" # <-- set to desired scheduler
      syncPeriod: 30s
    kind: KubeProxyConfiguration
    metricsBindAddress: 127.0.0.1:10249
    mode: "ipvs" # <-- modified from "iptables"
    nodePortAddresses: null
    oomScoreAdj: -998
    portRange: ""
    udpIdleTimeout: 250ms
kind: ConfigMap
metadata:
  labels:
    eks.amazonaws.com/component: kube-proxy
    k8s-app: kube-proxy
  name: kube-proxy-config
  namespace: kube-system
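
Note that kube-proxy only reads this config at startup, so after editing the ConfigMap the DaemonSet pods need a restart; a minimal sketch with standard kubectl:

# Restart kube-proxy so it picks up the updated ConfigMap
kubectl -n kube-system rollout restart ds kube-proxy
kubectl -n kube-system rollout status ds kube-proxy
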
mlycore commented 3 years ago

Do we need to run kube-proxy --cleanup before we switch from iptables to ipvs?

sidewinder12s commented 3 years ago

Do we need to run kube-proxy --cleanup before we switch from iptables to ipvs?

Yes, one of the blogs above mentions needing to run that (or restart the workers) if you switch from the default to IPVS. link
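
If you'd rather not recycle the workers, the cleanup can be run on each node instead (a hedged sketch; --cleanup is the upstream kube-proxy flag that flushes the iptables/IPVS rules it created, and this assumes the kube-proxy binary is reachable on the node):

# Flush kube-proxy's old iptables rules before the IPVS-mode pods start
sudo kube-proxy --cleanup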

stevehipwell commented 3 years ago

@tabern is there any official progress on this?

lucioveloso commented 3 years ago

Note: for using sed as the ipvs scheduler you need to activate the module:

sudo modprobe ip_vs_sed

debu99 commented 2 years ago

Just curious: given the AWS EKS customer base, why is this not fixed yet? Maybe their EKS use cases are quite small?

lobida commented 2 years ago

In my case, I updated the kube-proxy-config ConfigMap to mode: "ipvs", but after about 20 minutes the ConfigMap would be rolled back by the kube-proxy add-on, which can cause weird problems: some kube-proxy instances stay in iptables mode.

Christian-Schmid commented 2 years ago

In my case, I updated the kube-proxy-config ConfigMap to mode: "ipvs", but after about 20 minutes the ConfigMap would be rolled back by the kube-proxy add-on, which can cause weird problems: some kube-proxy instances stay in iptables mode.

Sadly I've got the exact same behavior. If you figure out why this is happening, me (and probably others) would be interested in the solution :-)

stevehipwell commented 2 years ago

Are you using managed or self-managed addons?

Christian-Schmid commented 2 years ago

Are you using managed or self-managed addons?

Thank you very much for the hint! I actually tried with the managed "kube-proxy" add-on of EKS.

When not using the add-on I could make the ipvs mode work and my changes did not get reverted *hooray*.
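
For anyone wanting to do the same: the managed add-on can be removed without tearing kube-proxy out of the cluster (a sketch; --preserve keeps the add-on's resources running, and the cluster name is a placeholder):

# Drop the managed add-on but keep the existing kube-proxy DaemonSet
aws eks delete-addon --cluster-name my-cluster --addon-name kube-proxy --preserve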

To make it run in that case, however, I also needed to load the necessary kernel modules on my own (as described in the previously linked medium.com article; some module names have changed since then):

sudo modprobe ip_vs 
sudo modprobe ip_vs_rr
sudo modprobe ip_vs_wrr 
sudo modprobe ip_vs_sh
sudo modprobe ip_vs_lc
sudo modprobe nf_conntrack
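
One caveat: modprobe only loads the modules until the next reboot. To make them persist (a sketch, assuming a systemd-based AMI such as Amazon Linux 2):

# Load the IPVS modules on every boot
cat <<'EOF' | sudo tee /etc/modules-load.d/ipvs.conf
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
ip_vs_lc
nf_conntrack
EOF
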
gotchipete commented 2 years ago

For anyone coming to this (and/or the aforementioned articles) like I did many times before getting it working - some steps in different articles were unnecessary for me, but thank you to everyone whose articles and comments ultimately helped me get it working!

Here are the steps, as of the date of this comment, that got ipvs-lc working for us with our EKS managed node groups. Also, not sure it matters, but I used eksctl to provision my clusters initially. I am also using Arm (Graviton 2) instances with the default AMI.

Step 1: Create a new launch template version with user data to install ipvs dependencies

In the EKS console, click your cluster name, then the Compute tab, then your node group name (nodegroup-1 for me), then the launch template name under "Launch template" (e.g. your-cluster-name-nodegroup-nodegroup-1). Modify it (Actions -> Modify template (Create new version)). Keep everything already in the template, except add a description (e.g. "install ipvs-lc dependencies") and add the following to the "User data" section (under Advanced options). Note that since this is for managed node groups, the user data has to be in MIME format, as below (this tripped me up for a while!):

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==MYBOUNDARY=="

--==MYBOUNDARY==
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash
# Install ipvsadm and load the IPVS kernel modules (including ip_vs_lc for the lc scheduler)
yum install -y ipvsadm
ipvsadm -l
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh
modprobe ip_vs_lc
modprobe nf_conntrack

--==MYBOUNDARY==--

Step 2: Apply the new launch template

Back in the EKS console (AWS EKS -> click your cluster name -> Compute tab), you'll now see (if you couldn't before) under Node groups that you can click "Change version" next to the launch template name. Change the version to the one you just created and choose the default "Rolling update" to apply the new launch template version in a no-downtime fashion. It takes a bit, maybe 10 minutes or so, depending on the number of nodes. You can continually run kubectl get nodes in the CLI to watch progress.
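
For example (standard commands; the names are placeholders):

# Watch nodes cycle as the rolling update replaces them
kubectl get nodes -w

# Or wait on the node group update from the AWS CLI
aws eks wait nodegroup-active --cluster-name my-cluster --nodegroup-name nodegroup-1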

Step 3: Edit kube-proxy-config configmap

When the above is finished, run kubectl -n kube-system edit cm kube-proxy-config and make the same changes shown in the ConfigMap earlier in this thread: set mode to "ipvs" and set ipvs.scheduler to your chosen scheduler (lc in our case).

Step 4: Apply new kube-proxy parameters by editing the daemonset for kube-proxy

Run:

kubectl -n kube-system edit ds kube-proxy

==> Change from
      containers:
      - command:
        - kube-proxy
        - --v=2
        - --config=/var/lib/kube-proxy-config/config

==> To
      containers:
      - command:
        - kube-proxy
        - --v=2
        - --proxy-mode=ipvs
        - --ipvs-scheduler=lc
        - --config=/var/lib/kube-proxy-config/config
        env:
        - name: KUBE_PROXY_MODE
          value: ipvs
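
Saving the edit rolls the DaemonSet; you can sanity-check from the cluster side before SSHing anywhere (the exact log wording may vary by version):

# kube-proxy logs should mention the IPVS proxier after the restart
kubectl -n kube-system logs -l k8s-app=kube-proxy --tail=50 | grep -i ipvs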

Step 5: Verify it's working

SSH to one of the worker nodes (click through to an EC2 instance via one of the nodes in the node group you just updated, get the IP address, and ssh ec2-user@the-nodes-ip-address) and once connected, run:

$ sudo ipvsadm -l

Output should look like:

IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  ip-172-2-0-10.us-east-2.com lc
  -> ip-10-2-2-4.us-east-2.com Masq    1      0          0         
  -> ip-10-8-9-0.us-east-2.comp Masq    1      0          0         
TCP  ip-172-2-32-23.us-east-2.c lc
  -> ip-10-1-2-3.us-east-2.comp Masq    1      0          0         
  -> ip-10-5-6-7.us-east-2.com Masq    1      0          0         
TCP  ip-172-20-6-7.us-east-2.co lc
  -> ip-10-8-7-6.us-east-2.com Masq    1      0          0         
  -> ip-10-8-6-5.us-east-2.comp Masq    1      0          0     

...
prashanth-volvocars commented 1 year ago

Hi,

Has anyone tried enabling IPVS for EKS with Fargate nodes?

z0rc commented 1 year ago

With the release of add-on configuration (https://aws.amazon.com/about-aws/whats-new/2022/12/eks-add-ons-supports-advanced-configuration/) I see that the kube-proxy add-on can be configured natively to use ipvs. Haven't tested it though.

Current config validation schema:

{
  "$ref": "#/definitions/KubeProxy",
  "$schema": "http://json-schema.org/draft-06/schema#",
  "definitions": {
    "Ipvs": {
      "additionalProperties": false,
      "properties": {
        "scheduler": {
          "type": "string"
        }
      },
      "title": "Ipvs",
      "type": "object"
    },
    "KubeProxy": {
      "additionalProperties": false,
      "properties": {
        "ipvs": {
          "$ref": "#/definitions/Ipvs"
        },
        "mode": {
          "enum": [
            "iptables",
            "ipvs"
          ],
          "type": "string"
        },
        "resources": {
          "$ref": "#/definitions/Resources"
        }
      },
      "title": "KubeProxy",
      "type": "object"
    },
    "Limits": {
      "additionalProperties": false,
      "properties": {
        "cpu": {
          "type": "string"
        },
        "memory": {
          "type": "string"
        }
      },
      "title": "Limits",
      "type": "object"
    },
    "Resources": {
      "additionalProperties": false,
      "properties": {
        "limits": {
          "$ref": "#/definitions/Limits"
        },
        "requests": {
          "$ref": "#/definitions/Limits"
        }
      },
      "title": "Resources",
      "type": "object"
    }
  }
}
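
Presumably that means it can also be applied through the CLI, something like this (untested sketch; --configuration-values is the flag for add-on advanced configuration, and the names are placeholders):

# Switch the managed kube-proxy add-on to IPVS mode
aws eks update-addon \
  --cluster-name my-cluster \
  --addon-name kube-proxy \
  --configuration-values '{"mode":"ipvs","ipvs":{"scheduler":"rr"}}'
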
z0rc commented 1 year ago

If you're running bottlerocket, wait for https://github.com/bottlerocket-os/bottlerocket/issues/2409.

bryantbiggs commented 1 year ago

This appears to be working as intended with the new add-on configuration https://github.com/clowdhaus/eks-reference-architecture/blob/ffdcf19e30c7a6177611a869914c89122f68eacd/ipvs/eks.tf#L13-L21

However, when deploying a new cluster with this enabled, I didn't see IPVS enabled on the first deploy and therefore added the user data as shown above. TBD whether this is needed, or whether it was just a coincidence and it started working after adding it, since the addition of the user data forced the node group to refresh all nodes.

sriramranganathan commented 1 year ago

The Amazon EKS team recently announced the general availability of the advanced configuration feature for managed add-ons. You can now pass advanced configuration to cluster add-ons, enabling you to customize add-on properties not handled by the default settings. Configuration can be applied to add-ons either during cluster creation or at any time after the cluster is created.

Using the advanced configuration feature, you can now configure the kube-proxy add-on to use IPVS mode instead of iptables.

To learn more about this feature, check out this blogpost - https://aws.amazon.com/blogs/containers/amazon-eks-add-ons-advanced-configuration/

Check out the Amazon EKS documentation - https://docs.aws.amazon.com/eks/latest/userguide/managing-add-ons.html

Smana commented 1 year ago

This appears to be working as intended with the new add-on configuration https://github.com/clowdhaus/eks-reference-architecture/blob/ffdcf19e30c7a6177611a869914c89122f68eacd/ipvs/eks.tf#L13-L21

However, when deploying a new cluster with this enabled, I didn't see IPVS enabled on the first deploy and therefore added the user data as shown above. TBD whether this is needed, or whether it was just a coincidence and it started working after adding it, since the addition of the user data forced the node group to refresh all nodes.

I just wanted to emphasize that adding the user-data commands is required. Otherwise ipvs is not enabled.

kanhayaKy commented 1 year ago

As mentioned in the article, this is already supported by replacing the following properties in each worker node's user-data script:

In my case this was not sufficient as the kube-proxy-config CM in kube-system took priority over arguments.

This solution works for us; just wanted to add a note: the file to edit here is kube-proxy-config, not kube-proxy.
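
They're easy to mix up since both live in kube-system; a quick way to see them side by side:

# kube-proxy holds the kubeconfig; kube-proxy-config holds the KubeProxyConfiguration
kubectl -n kube-system get cm | grep kube-proxy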

calvinbui commented 1 year ago

This appears to be working as intended with the new add-on configuration https://github.com/clowdhaus/eks-reference-architecture/blob/ffdcf19e30c7a6177611a869914c89122f68eacd/ipvs/eks.tf#L13-L21 However, when deploying a new cluster with this enabled, I didn't see IPVS enabled on the first deploy and therefore added the user data as shown above. TBD whether this is needed, or whether it was just a coincidence and it started working after adding it, since the addition of the user data forced the node group to refresh all nodes.

I just wanted to emphasize that adding the user-data commands is required. Otherwise ipvs is not enabled.

It appears that kube-proxy loads the required kernel modules itself when switched to IPVS mode. We run a combination of self-managed and Karpenter nodes, and this was the case for both when checking with lsmod before and after restarting kube-proxy with the change. ipvsadm is also already installed in the EKS optimised AMI.

kencieszykowski commented 9 months ago

Just wanted to toss in my 2¢ here, as I have been using this thread as a north star. We were looking to test out IPVS mode in our Stage cluster, but didn't want to modify the EKS optimized AMI. As mentioned above, the EKS optimized AMI now includes a package for ipvs, so it should work out of the box.

Additionally, the add-on configuration above should also cover it. As far as advanced configuration goes, we pushed the following up via Terraform:

configuration_values = jsonencode({
  mode = "ipvs"
  ipvs = {
    scheduler = "lc"
  }
})

This flipped the operating mode to IPVS, but kube-proxy reverted back to iptables because of the following error: Can't use the IPVS proxier" err="IPVS proxier will not be used because the following required kernel modules are not loaded: [ip_vs_lc]

I wasn't too familiar with the workings of IPVS, but found the following page explaining the scheduler algorithms.

Long story short, the built-in AMI includes support for the rr, wrr, and sh schedulers, but NOT lc or any of the other fun stuff. For those, you'll need to install the additional modules onto the node as outlined above.
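
To see which scheduler modules your AMI actually ships for the running kernel, you can check on a node (standard tools):

# List the IPVS scheduler modules available to load
find /lib/modules/$(uname -r) -name 'ip_vs_*'

# And the ones currently loaded
lsmod | grep ip_vs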

So now our add-on advanced config is:

configuration_values = jsonencode({
  mode = "ipvs"
  ipvs = {
    scheduler = "rr"
  }
})

The AMI team seems hesitant to preinstall other scheduler modules, so further expansion is likely through the user data as outlined above. Hope this saves somebody some time!

bryantbiggs commented 9 months ago

@kencieszykowski please open a feature request on the EKS optimized AMI repo to have those added