dawidmalina closed this issue 1 year ago.
Any update on this one?
Hi - we haven't adopted this in EKS as it's GA but not the default for Kubernetes yet. We are researching being able to enable this with EKS.
I'd like to have this sure, but I'd prefer to wait until it is the k8s default. I guess it could be an experimental option, but I am not sure AWS want to support both modes 😄
Even in k8s 1.14, IPVS is not the default (I think?), and people are still smoothing over the rough edges of IPVS for the things it breaks or doesn't support, e.g.
For what it's worth, we have been running IPVS on large clusters (managed by ourselves) for about a year without problem except for graceful termination: we kept using a version without it because it was a bit unstable. We have fixed a few issues with graceful termination in the last few months and it seems almost ready now.
A few important PRs haven't been merged yet but should be soon:
If you test it and find any issue, let us know
Not sure if this should be mentioned here or if it should comprise a separate request - would it make sense to optionally allow disabling the deployment of kube-proxy altogether?
The justification would be that kube-proxy itself is not a strict requirement with certain configurations. It would mirror the steps kubeadm took to make kube-proxy's deployment optional (for similar reasons).
@arzarif that's a topic I think that warrants a separate thread. Can you open a new issue?
Hi, could you please let me know if EKS allows switching from iptables to ipvs as per this blog, or if work is still in progress?
https://medium.com/@jeremy.i.cowan/the-problem-with-kube-proxy-enabling-ipvs-on-eks-169ac22e237e
Hi, is this coming anytime soon?
Nice blog @SankarGopal77 !
Hi, is there any progress?
+1 on this issue
> Hi, could you please let me know if EKS allows switching from iptables to ipvs as per this blog, or if work is still in progress?
> https://medium.com/@jeremy.i.cowan/the-problem-with-kube-proxy-enabling-ipvs-on-eks-169ac22e237e
As mentioned in the article, this is already supported by replacing the following properties in each worker node's user-data script:
```yaml
#cloud-config
packages:
  - ipvsadm
runcmd:
  - sudo modprobe ip_vs
  - sudo modprobe ip_vs_rr
  - sudo modprobe ip_vs_wrr
  - sudo modprobe ip_vs_sh
  - sudo modprobe nf_conntrack_ipv4
  - /var/lib/cloud/scripts/per-instance/bootstrap.al2.sh
```
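If the user-data ran, the modules should show up in the kernel's module list. A quick sanity check to run on the worker node (my own snippet, not from the thread; it reads /proc/modules, the same source lsmod uses):

```shell
# Print the IPVS-related modules the user-data should have loaded:
# expect ip_vs, ip_vs_rr, ip_vs_wrr, ip_vs_sh, nf_conntrack_ipv4.
awk '$1 ~ /^ip_vs/ || $1 == "nf_conntrack_ipv4" {print $1}' /proc/modules | sort
```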
And in the kube-proxy DaemonSet:
```yaml
containers:
  - command:
      - /bin/sh
      - -c
      - kube-proxy --v=2 --kubeconfig=/var/lib/kube-proxy/kubeconfig --proxy-mode=ipvs --ipvs-scheduler=sed
    image: 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/kube-proxy:v1.13.8
```
I believe the open issue is suggesting IPVS as the default, as opposed to iptables, which can cause contention on large-scale production systems.
We are waiting for IPVS. It's quite clear that this is important, but we did not get any reply or support from the EKS team. That's disappointing for your customers.
Is there any progress? We also need to use IPVS.
FYI: in our case, we enable the IPVS modules directly via the UserData setting on top of the latest AMI.
I've submitted this to the EKS AMI team. If it can be included in the base EKS AMI it'd be a matter of enabling it.
> As mentioned in the article, this is already supported by replacing the following properties in each worker node's user-data script:

In my case this was not sufficient, as the kube-proxy-config CM in kube-system took priority over the arguments.
```yaml
apiVersion: v1
data:
  config: |-
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    bindAddress: 0.0.0.0
    clientConnection:
      acceptContentTypes: ""
      burst: 10
      contentType: application/vnd.kubernetes.protobuf
      kubeconfig: /var/lib/kube-proxy/kubeconfig
      qps: 5
    clusterCIDR: ""
    configSyncPeriod: 15m0s
    conntrack:
      # max: 0 <-- I commented out this line as it causes a syntax error when kube-proxy is loading; maybe caused by an upgrade to 1.18?
      maxPerCore: 32768
      min: 131072
      tcpCloseWaitTimeout: 1h0m0s
      tcpEstablishedTimeout: 24h0m0s
    enableProfiling: false
    healthzBindAddress: 0.0.0.0:10256
    hostnameOverride: ""
    iptables:
      masqueradeAll: false
      masqueradeBit: 14
      minSyncPeriod: 0s
      syncPeriod: 30s
    ipvs:
      excludeCIDRs: null
      minSyncPeriod: 0s
      scheduler: "lc" # <-- set to desired scheduler
      syncPeriod: 30s
    kind: KubeProxyConfiguration
    metricsBindAddress: 127.0.0.1:10249
    mode: "ipvs" # <-- modified from "iptables"
    nodePortAddresses: null
    oomScoreAdj: -998
    portRange: ""
    udpIdleTimeout: 250ms
kind: ConfigMap
metadata:
  labels:
    eks.amazonaws.com/component: kube-proxy
    k8s-app: kube-proxy
  name: kube-proxy-config
  namespace: kube-system
```
Do we need to run `kube-proxy --cleanup` before we switch from iptables to ipvs?
> Do we need to run `kube-proxy --cleanup` before we switch from iptables to ipvs?
Yes, one of the blogs above mentions needing to run that, or restart the workers, if you switch from the default to IPVS (link).
@tabern is there any official progress on this?
Note: to use `sed` as the ipvs scheduler, you need to load the module: `sudo modprobe ip_vs_sed`
Just curious: given the AWS EKS customer base, why is this not fixed yet? Maybe their EKS use cases are quite small?
In my case, I updated the kube-proxy-config ConfigMap to mode: "ipvs", but after about 20 minutes the ConfigMap would be rolled back by the kube-proxy addon, which can cause weird problems: some kube-proxy instances stay in iptables mode.
> In my case, I updated the kube-proxy-config ConfigMap to mode: "ipvs", but after about 20 minutes the ConfigMap would be rolled back by the kube-proxy addon, which can cause weird problems: some kube-proxy instances stay in iptables mode.
Sadly, I've got the exact same behavior. If you figure out why this is happening, me (and probably others) would be interested in the solution :-)
Are you using managed or self-managed addons?
> Are you using managed or self-managed addons?
Thank you very much for the hint! I actually tried with the managed "kube-proxy" add-on of EKS.
When not using the addon, I could make ipvs mode work and my changes did not get reverted *hooray*.
To make it run in that case, however, I also needed to load the necessary kernel modules on my own (as described in the previously linked medium.com article; some package names have changed since then):
```shell
sudo modprobe ip_vs
sudo modprobe ip_vs_rr
sudo modprobe ip_vs_wrr
sudo modprobe ip_vs_sh
sudo modprobe ip_vs_lc
sudo modprobe nf_conntrack
```
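Note that modprobe only lasts until the next reboot. To persist the modules, you could also drop a modules-load.d file (which systemd reads at boot) from the same user data. A sketch; the file name ipvs.conf and the assumption of running as root are mine, not from the thread:

```shell
# Persist the IPVS modules across reboots via systemd's modules-load.d.
# MODULES_D can be overridden for testing; default is the real path (run as root).
dest="${MODULES_D:-/etc/modules-load.d}"
mkdir -p "$dest"
printf '%s\n' ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh ip_vs_lc nf_conntrack > "$dest/ipvs.conf"
```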
For anyone coming to this (and/or the aforementioned articles) like I did many times before getting it working: some steps in the different articles were unnecessary for me, but thank you to everyone for the articles and comments that ultimately helped me get it working...
...here are the steps, as of the date of this comment, that got ipvs-lc working for us with our EKS managed node groups. Also, not sure it matters, but I used eksctl to provision my clusters initially. I am also using Arm (Graviton 2) instances with the default AMI.
In the EKS console, click your cluster name, click the Compute tab, then click your nodegroup name (nodegroup-1 for me) and the launch template name under "Launch template" (e.g. your-cluster-name-nodegroup-nodegroup-1). Then modify it (Actions -> Modify template (Create new version)). Keep everything already in the template, except add a description (e.g. "install ipvs-lc dependencies") and add the following to the "User data" section (under Advanced options). Note that since this is for managed node groups, the user data has to be in MIME format, as below (this tripped me up for a bit!):
```
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==MYBOUNDARY=="

--==MYBOUNDARY==
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash
yum install -y ipvsadm
ipvsadm -l
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh
modprobe ip_vs_lc
modprobe nf_conntrack

--==MYBOUNDARY==--
```
Back in the EKS console (AWS EKS -> click cluster name -> Compute tab), you'll now see (if you couldn't before) under Node groups that you can click "Change version" next to the launch template name. Change to the version you just created and choose the default "rolling update" to apply the new launch template version with no downtime. It takes a bit, maybe 10 minutes or so, depending on the number of nodes. You can run `kubectl get nodes` in the CLI to watch progress.
When the above is finished, run:
```shell
kubectl -n kube-system edit cm kube-proxy-config
```
and make the following changes: set `mode` from `iptables` to `ipvs`, and `scheduler` from `""` to `"lc"`.
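The two edits can also be scripted instead of done interactively. A sketch, shown against an inline sample so it is runnable anywhere; it assumes the ConfigMap still carries the stock `mode: "iptables"` and `scheduler: ""` values (adjust the patterns if yours differ):

```shell
# Flip the two fields with sed. In the cluster you would pipe
#   kubectl -n kube-system get cm kube-proxy-config -o yaml
# through the same sed commands and into: kubectl apply -f -
sed -e 's/mode: "iptables"/mode: "ipvs"/' \
    -e 's/scheduler: ""/scheduler: "lc"/' <<'EOF'
    scheduler: ""
    mode: "iptables"
EOF
```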
Then run:
```shell
kubectl -n kube-system edit ds kube-proxy
```
and change the container spec from:
```yaml
containers:
  - command:
      - kube-proxy
      - --v=2
      - --config=/var/lib/kube-proxy-config/config
```
to:
```yaml
containers:
  - command:
      - kube-proxy
      - --v=2
      - --proxy-mode=ipvs
      - --ipvs-scheduler=lc
      - --config=/var/lib/kube-proxy-config/config
    env:
      - name: KUBE_PROXY_MODE
        value: ipvs
```
SSH to one of the worker nodes (click through to an EC2 instance via one of the nodes in the nodegroup you just updated, get the IP address, and `ssh ec2-user@the-nodes-ip-address`) and, once connected, run:
```shell
sudo ipvsadm -l
```
The output should look like:
```
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port          Forward Weight ActiveConn InActConn
TCP  ip-172-2-0-10.us-east-2.com lc
  -> ip-10-2-2-4.us-east-2.com   Masq    1      0          0
  -> ip-10-8-9-0.us-east-2.comp  Masq    1      0          0
TCP  ip-172-2-32-23.us-east-2.c  lc
  -> ip-10-1-2-3.us-east-2.comp  Masq    1      0          0
  -> ip-10-5-6-7.us-east-2.com   Masq    1      0          0
TCP  ip-172-20-6-7.us-east-2.co  lc
  -> ip-10-8-7-6.us-east-2.com   Masq    1      0          0
  -> ip-10-8-6-5.us-east-2.comp  Masq    1      0          0
...
```
Hi,
Has anyone tried enabling IPVS for EKS with Fargate nodes?
With the release of add-on configuration (https://aws.amazon.com/about-aws/whats-new/2022/12/eks-add-ons-supports-advanced-configuration/), I see that the kube-proxy addon can be configured natively to use ipvs. Haven't tested it though.
Current config validation schema:
```json
{
  "$ref": "#/definitions/KubeProxy",
  "$schema": "http://json-schema.org/draft-06/schema#",
  "definitions": {
    "Ipvs": {
      "additionalProperties": false,
      "properties": {
        "scheduler": {
          "type": "string"
        }
      },
      "title": "Ipvs",
      "type": "object"
    },
    "KubeProxy": {
      "additionalProperties": false,
      "properties": {
        "ipvs": {
          "$ref": "#/definitions/Ipvs"
        },
        "mode": {
          "enum": [
            "iptables",
            "ipvs"
          ],
          "type": "string"
        },
        "resources": {
          "$ref": "#/definitions/Resources"
        }
      },
      "title": "KubeProxy",
      "type": "object"
    },
    "Limits": {
      "additionalProperties": false,
      "properties": {
        "cpu": {
          "type": "string"
        },
        "memory": {
          "type": "string"
        }
      },
      "title": "Limits",
      "type": "object"
    },
    "Resources": {
      "additionalProperties": false,
      "properties": {
        "limits": {
          "$ref": "#/definitions/Limits"
        },
        "requests": {
          "$ref": "#/definitions/Limits"
        }
      },
      "title": "Resources",
      "type": "object"
    }
  }
}
```
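An untested sketch of feeding such configuration to the managed add-on: build a payload the schema above accepts and pass it via the AWS CLI's `--configuration-values` flag. The cluster name is a placeholder:

```shell
# Configuration accepted by the schema above: ipvs mode with the rr scheduler.
payload='{"mode":"ipvs","ipvs":{"scheduler":"rr"}}'
echo "$payload" | python3 -m json.tool   # sanity-check that it is valid JSON
# In a real environment (placeholder cluster name):
# aws eks update-addon --cluster-name my-cluster \
#   --addon-name kube-proxy --configuration-values "$payload"
```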
If you're running bottlerocket, wait for https://github.com/bottlerocket-os/bottlerocket/issues/2409.
This appears to be working as intended with the new addon configuration https://github.com/clowdhaus/eks-reference-architecture/blob/ffdcf19e30c7a6177611a869914c89122f68eacd/ipvs/eks.tf#L13-L21
However, when deploying a new cluster with this enabled, I didn't see IPVS enabled on the first deploy and therefore added the user data as shown above. TBD whether that is actually needed, or whether it was just a coincidence and it started working because adding the user data forced the node group to refresh all nodes.
The Amazon EKS team recently announced the general availability of the advanced configuration feature for managed add-ons. You can now pass advanced configuration to cluster add-ons, enabling you to customize add-on properties not covered by the default settings. Configuration can be applied to add-ons either during cluster creation or at any time after the cluster is created.
Using the advanced configuration feature, you can now configure the kube-proxy add-on to use IPVS mode instead of IPTABLES.
To learn more about this feature, check out this blogpost - https://aws.amazon.com/blogs/containers/amazon-eks-add-ons-advanced-configuration/
Check out the Amazon EKS documentation - https://docs.aws.amazon.com/eks/latest/userguide/managing-add-ons.html
> This appears to be working as intended with the new addon configuration https://github.com/clowdhaus/eks-reference-architecture/blob/ffdcf19e30c7a6177611a869914c89122f68eacd/ipvs/eks.tf#L13-L21
> However, when deploying a new cluster with this enabled, I didn't see IPVS enabled on the first deploy and therefore added the user data as shown above. TBD whether that is actually needed, or whether it was just a coincidence since adding the user data forced the node group to refresh all nodes.
I just wanted to emphasize that adding the user-data commands is required. Otherwise ipvs is not enabled.
> As mentioned in the article, this is already supported by replacing the following properties in each worker node's user-data script:
>
> In my case this was not sufficient, as the kube-proxy-config CM in kube-system took priority over the arguments. [full ConfigMap quoted in the earlier comment]
This solution works for us; just wanted to add a note: the file to edit here is kube-proxy-config, not kube-proxy.
> I just wanted to emphasize that adding the user-data commands is required. Otherwise ipvs is not enabled.
It appears that enabling kube-proxy loads the kernel modules itself. We run a combination of self-managed and Karpenter nodes, and that was the case for both when checking with `lsmod` before and after restarting kube-proxy with the change. `ipvsadm` is also already installed in the EKS optimised AMI.
Just wanted to toss in my 2¢ here, as I have been using this thread as a north star. We were looking to test out IPVS mode in our stage cluster but didn't want to modify the EKS optimized AMI. As mentioned above, the EKS optimized AMI now includes a package for ipvs, so it should work out of the box.
Additionally, the add-on configuration above should also include it. As far as advanced configuration goes, we pushed the following up via Terraform:
```hcl
{
  mode = "ipvs"
  ipvs = {
    scheduler = "lc"
  }
}
```
This flipped the operating mode to IPVS, but `kube-proxy` reverted back to `iptables` because of the following issue:
```
"Can't use the IPVS proxier" err="IPVS proxier will not be used because the following required kernel modules are not loaded: [ip_vs_lc]"
```
I wasn't too familiar with the workings of IPVS, but found the following page explaining the scheduler algorithms.
Long story short: the built-in AMI includes support for the `rr`, `wrr`, and `sh` schedulers, but NOT `lc` or any of the other fun stuff. For that, you'll need to install the additional packages onto the node as outlined above.
So now our add-on advanced config is:
```hcl
{
  mode = "ipvs"
  ipvs = {
    scheduler = "rr"
  }
}
```
The AMI team seems hesitant to preinstall the kernel modules for other modes, so further expansion is likely through the user data as outlined above. Hope this saves somebody some time!
@kencieszykowski please open a feature request on the EKS optimized AMI repo to have those added
Tell us about your request
It would be nice to allow switching from the default `iptables` kube-proxy mode to `ipvs`.

Which service(s) is this request for?
EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
Better load balancing across all pods covered by a particular service.