Closed: bhargavamin closed this issue 2 years ago.
Hi, the CNI is only responsible for setting up routing for pod-to-pod communication, and no CNI setting would affect this behaviour, so this doesn't look like a CNI issue.
Since you already have a support case open, we will look into why kubectl is not working for the additional CIDRs and check whether the CIDR is allow-listed. Will close this issue for now.
Comments on closed issues are hard for our team to see. If you need more assistance, please open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.
I was able to fix the issue. It was a node AMI config issue.
When you have self-managed nodes, you need to explicitly pass some parameters to the /etc/eks/bootstrap.sh script so that it can support an IPv6 EKS cluster.
Adding `bootstrap_extra_args: "--ip-family ipv6 --service-ipv6-cidr fc00::/7"`
fixed the issue.
Ref: https://github.com/terraform-aws-modules/terraform-aws-eks/issues/1958
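For anyone hitting the same thing, here is a minimal sketch of what those flags look like when the node user data invokes the bootstrap script; the cluster name is a placeholder, and with the terraform-aws-eks module this is effectively what `bootstrap_extra_args` gets appended to:

```sh
#!/bin/bash
# Minimal user-data sketch for a self-managed node joining an IPv6 EKS cluster.
# "my-ipv6-cluster" is a placeholder cluster name; the two extra flags are the
# ones that fixed the issue above.
set -o errexit

/etc/eks/bootstrap.sh my-ipv6-cluster \
  --ip-family ipv6 \
  --service-ipv6-cidr fc00::/7
```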
I'm setting up a dual-stack VPC with multiple CIDRs, in which I created an EKS cluster using the Terraform module and self-managed nodes.
I'm facing a very peculiar issue where `kubectl exec` and `kubectl logs` commands only succeed against pods on nodes from a single CIDR range out of the 3 CIDRs attached to the VPC. I suspect that this issue could be related to iptables rules or VPC CNI settings.
CIDR ranges attached to the VPC:
The cluster and nodes are launched in private subnets. IPv6 internet traffic goes out through an egress-only internet gateway and IPv4 traffic goes out through a NAT gateway.
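For context, these are the kinds of commands that only work when the target pod lands on a node from one specific CIDR (pod name below is a placeholder):

```sh
# Placeholder pod name; both commands succeed only for pods on nodes in one
# CIDR range and fail for pods on nodes in the other two ranges.
kubectl exec -it demo-pod -- /bin/sh
kubectl logs demo-pod
```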
I have checked the following things:
- Switched the EKS cluster endpoint between public and public-and-private; both gave the same results.
- VPC, NACLs, security groups, and route tables for the EKS cluster were configured properly.
- VPC CNI configs for IPv6 are set and the IPv4 settings are disabled.
- The VPC CNI was installed by default at version 1.10.1; we updated it to v1.11.2 and still experienced the issue.
- Port 10250 is listening on the nodes in the affected CIDRs; confirmed it with telnet and netstat.
- Telnet to port 10250 from node to node works.
- Confirmed that we could retrieve logs from containers running on the nodes in the affected CIDRs using `docker logs` (see the sketch of these checks after this list).
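A rough sketch of those node-level checks, with placeholders for the node IP and container ID:

```sh
# Run on / against the affected nodes. <node-ip> and <container-id> are placeholders.
sudo netstat -tlnp | grep 10250     # kubelet is listening on 10250
telnet <node-ip> 10250              # node-to-node reachability on the kubelet port
sudo docker ps                      # list containers on the node
sudo docker logs <container-id>     # logs are retrievable directly from the runtime
```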
I ran the kubectl command in debug mode and got the error below:
While executing the command above, we were able to see the error below in CloudWatch Logs:
Versions:
- Kubernetes: v1.21
- VPC CNI: v1.11.2
- OS: default
- Kernel: default
- terraform-aws-eks module: v18.21.0
AWS Support Case ID: 10315548691
If required, I can provide the debug output of the VPC CNI troubleshooting script.
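For reference, this is how that output would typically be collected on an affected node; the script path assumes the EKS-optimized AMI:

```sh
# Collects VPC CNI / node diagnostics on an EKS-optimized AMI.
# The script bundles the results into a tarball under /var/log.
sudo bash /opt/cni/bin/aws-cni-support.sh
```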