aws / amazon-vpc-cni-k8s

Networking plugin repository for pod networking in Kubernetes using Elastic Network Interfaces on AWS
Apache License 2.0

Can VPC-CNI with Security Group for Pods work with kube-proxy in IPVS mode? #2982

Open AhmadMS1988 opened 2 months ago

AhmadMS1988 commented 2 months ago

What happened: We need to run kube-proxy in IPVS mode together with the VPC CNI because of our increased pod counts, and to get per-request load balancing through Kubernetes Services.

We noticed that for pods that have security groups (and therefore branch network interfaces), both ingress and egress traffic stops and never recovers until we switch back to iptables mode and refresh all nodes.

We need to know whether the VPC CNI supports security groups for pods together with kube-proxy in IPVS mode.

Environment:

Tested on both arm64 and amd64

orsenthil commented 1 month ago

> Can VPC-CNI with Security Group for Pods work with kube-proxy in IPVS mode?

Yes, there is no limitation that we know of that would prevent this from working. kube-proxy is the service proxy layer, and its mode (ipvs or iptables) shouldn't interfere with the SGP functionality of pods.

I will set up an IPVS cluster, verify, and share my output.

yash97 commented 1 month ago

Hi @AhmadMS1988,

I set up the cluster and switched kube-proxy to IPVS mode. I also created an Nginx service with a security group and tested in-cluster pod-to-pod connectivity to Nginx, which worked fine. Could you please provide more details on the specific networking test that is failing? TIA!
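For reference, attaching a security group to pods is done through the VPC resource controller's `SecurityGroupPolicy` CRD. A minimal sketch of such a policy (the name, namespace, label selector, and security group ID below are placeholders, not taken from this thread):

```yaml
apiVersion: vpcresources.k8s.aws/v1beta1
kind: SecurityGroupPolicy
metadata:
  name: my-security-group-policy   # hypothetical name
  namespace: my-namespace
spec:
  podSelector:
    matchLabels:
      app: my-app                  # pods matching this label get a branch ENI
  securityGroups:
    groupIds:
      - sg-0123456789abcdef0       # placeholder security group ID
```

Pods matching the selector are launched on a branch ENI with the listed security groups applied.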

AhmadMS1988 commented 1 month ago

Hi all. @orsenthil, please let me know your findings. @yash97, are you sure that you replaced the worker nodes after switching kube-proxy to ipvs mode? I opened a ticket with the AWS EKS team and they confirmed that they can replicate the issue. Thanks all

orsenthil commented 1 month ago

@AhmadMS1988 - I was able to verify SGP working with kube-proxy in IPVS mode.

Followed these public docs.

  1. Followed https://docs.aws.amazon.com/eks/latest/userguide/security-groups-for-pods.html - Created `my-app` and `my-deployment` exactly as specified. Manually created the security group with ALL ingress/egress from anywhere for testing purposes, associated the security group, and verified that new pods had it attached.
  2. Created a curl client to test connectivity to `my-app`:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: curl-pod
  namespace: my-namespace
spec:
  containers:
  - name: curl-container
    image: curlimages/curl
    command: ['sh', '-c', 'while true; do sleep 30; done;']
```
  3. Verified that connecting to the service with kube-proxy in iptables mode, with SGP, works fine:

```
kubectl exec -it curl-pod -n my-namespace -- sh
~ $ curl my-app
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
```


### Switched to IPVS Mode.

1. Followed https://repost.aws/knowledge-center/eks-configure-ipvs-kube-proxy

2. Edited kube-proxy to ipvs mode with rr scheduling: `kubectl edit cm kube-proxy-config -n kube-system`

3. Scaled my node group down and up for the ipvs change to take effect.

4. Verified that my service is backed by pod IPs in IPVS mode:

```
sudo ipvsadm -L
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  ip-10-100-0-1.us-west-2.comp rr
  -> ip-192-168-157-183.us-west-2 Masq    1      1          0
  -> ip-192-168-172-5.us-west-2.c Masq    1      2          0
TCP  ip-10-100-0-10.us-west-2.com rr
  -> ip-192-168-8-160.us-west-2.c Masq    1      0          0
  -> ip-192-168-64-32.us-west-2.c Masq    1      0          0
TCP  ip-10-100-166-45.us-west-2.c rr
  -> ip-192-168-20-65.us-west-2.c Masq    1      0          0
  -> ip-192-168-30-168.us-west-2. Masq    1      0          0
  -> ip-192-168-72-39.us-west-2.c Masq    1      0          0
  -> ip-192-168-78-170.us-west-2. Masq    1      0          0
UDP  ip-10-100-0-10.us-west-2.com rr
  -> ip-192-168-8-160.us-west-2.c Masq    1      0          0
  -> ip-192-168-64-32.us-west-2.c Masq    1      0          0
```


   Note the `my-app` service and `my-deployment` pods (they match the entries in `ipvsadm -L`):

```
$ kubectl get svc -o wide -A
NAMESPACE      NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE   SELECTOR
default        kubernetes   ClusterIP   10.100.0.1      <none>        443/TCP         9d    <none>
kube-system    kube-dns     ClusterIP   10.100.0.10     <none>        53/UDP,53/TCP   9d    k8s-app=kube-dns
my-namespace   my-app       ClusterIP   10.100.166.45   <none>        80/TCP          21m   app=my-app

$ kubectl get pods -o wide -n my-namespace
NAME                             READY   STATUS    RESTARTS   AGE     IP               NODE                                          NOMINATED NODE   READINESS GATES
curl-pod                         1/1     Running   0          2m58s   192.168.64.33    ip-192-168-78-86.us-west-2.compute.internal   <none>           <none>
my-deployment-5bf4d6bfff-4q7s9   1/1     Running   0          11m     192.168.72.39    ip-192-168-78-86.us-west-2.compute.internal   <none>           <none>
my-deployment-5bf4d6bfff-f4trw   1/1     Running   0          11m     192.168.30.168   ip-192-168-6-104.us-west-2.compute.internal   <none>           <none>
my-deployment-5bf4d6bfff-s7v47   1/1     Running   0          11m     192.168.78.170   ip-192-168-78-86.us-west-2.compute.internal   <none>           <none>
my-deployment-5bf4d6bfff-x4rlx   1/1     Running   0          11m     192.168.20.65    ip-192-168-6-104.us-west-2.compute.internal   <none>           <none>
```


5. Ensured that the connectivity test is successful:

```
$ kubectl exec -it curl-pod -n my-namespace -- sh
~ $ curl my-app
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
...
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
```
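For completeness, the kube-proxy change in step 2 above amounts to editing the `config.yaml` key of the `kube-proxy-config` ConfigMap. A sketch of the relevant fragment (field names follow the upstream `KubeProxyConfiguration` type; the rest of the file is elided):

```yaml
# Fragment of config.yaml inside the kube-proxy-config ConfigMap
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
mode: "ipvs"        # switch from the default "iptables"
ipvs:
  scheduler: "rr"   # round-robin scheduling, as used in the test above
```

Note that, as described above, existing nodes keep their old proxy rules until they are replaced or restarted, which is why the node group was scaled down and up.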

AhmadMS1988 commented 1 month ago

Hi @orsenthil, I confirmed the behaviour you saw for inbound requests. Can you exec into one of the my-deployment pods and do any outbound curl, for example to google.com? Or, alternatively, give curl-pod a security group and try to curl the service again: does it still work in IPVS mode? If it does not work in IPVS, can you disable it, refresh the instances, and try the same setup again? Please also note that I have confirmed the behaviour on AL2023 and Bottlerocket; I do not know whether it occurs on AL2 or Ubuntu. AWS support has replicated the behaviour, so if you want, and depending on your availability, we can jump on a call with them to discuss it. Thanks

yash97 commented 1 month ago

Hi @AhmadMS1988, I did reproduce your issue. Yes, it indeed does not work. Here's why: when you create a pod with a security group, its traffic leaves via the branch ENI (eth1) and uses a separate routing table; you can execute `ip rule list` to verify. For IPVS to work, the packet has to pass through the `kube-ipvs0` interface device; you can run `ip route show table all` to see all routes on the node. So a packet from a pod with a security group does go out of the node, but it never gets DNATed, and when it arrives at the parent ENI its destination is still the service IP instead of an endpoint IP.
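A quick way to see this split routing on a node is with generic iproute2 commands (rule priorities, table numbers, and interface names will differ per node; the comments describe the SGP setup, not output these commands produce everywhere):

```shell
# List policy routing rules: branch-ENI (SGP) pods add per-pod rules that
# steer their traffic into dedicated route tables.
ip rule list

# Dump every routing table: the kube-ipvs0 device, which holds the service
# VIPs, appears in the main/local tables but not in the branch-ENI tables,
# so SGP pod traffic bypasses the IPVS DNAT path.
ip route show table all
```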

AhmadMS1988 commented 1 month ago

Thank you. So can this be fixed at the CNI level? Or is it a hard limitation that security groups for pods can't work with IPVS?

orsenthil commented 1 month ago

@AhmadMS1988 - Confirming the observation from @yash97. This seems problematic only in IPVS mode: pods with SGP cannot connect to a Service IP when kube-proxy runs IPVS. We are looking into whether a fix can be made at the CNI level.

orsenthil commented 1 month ago

@AhmadMS1988 - We are going to call this out as a limitation for now. We have had a known issue with pods with security groups not working in IPVS mode for some time now.