Open ArtemProskochylo opened 5 months ago
@ArtemProskochylo How did you upgrade the VPC CNI version? It appears that you're missing the required permissions for the aws-node pod. Did you apply the corresponding version-specific manifest?
Facing the same issue after upgrading to EKS 1.29 with CNI 1.18.0. @achevuru I upgraded the addon directly from AWS using Terraform. I checked the ClusterRole configuration and it has the permissions you referred to:
Seems like a bug.
@danielap-ma If you're seeing the same error as above - then either the permissions are missing (please check if CNI pods have correct SA in place) or there are connectivity issues with your API Server. I quickly tried it and I don't see any such issue(s) on my end.
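For anyone who wants to rule out the permissions side before looking at API server connectivity, here is a minimal sketch of that check. It assumes the rules were exported from the cluster as JSON (for example with `kubectl get clusterrole aws-node -o json`) and that the agent needs `get`, `list` and `watch` on `policyendpoints` in the `networking.k8s.aws` API group. The ClusterRole name, the group name, the verb list and the `EXAMPLE_RULES` data are assumptions for illustration, not values confirmed in this thread.

```python
def _matches(values, wanted):
    """True if an RBAC rule field lists the wanted value or the '*' wildcard."""
    values = values or []
    return "*" in values or wanted in values


def role_allows(rules, api_group, resource, verb):
    """Check whether any rule in a ClusterRole grants verb on group/resource."""
    for rule in rules:
        if (
            _matches(rule.get("apiGroups"), api_group)
            and _matches(rule.get("resources"), resource)
            and _matches(rule.get("verbs"), verb)
        ):
            return True
    return False


def missing_verbs(
    rules,
    api_group="networking.k8s.aws",   # assumed API group of PolicyEndpoint
    resource="policyendpoints",       # assumed plural resource name
    verbs=("get", "list", "watch"),   # assumed verbs the agent needs
):
    """Return the verbs that no rule grants for the given resource."""
    return [v for v in verbs if not role_allows(rules, api_group, resource, v)]


# Hypothetical rules, shaped like the "rules" array of a ClusterRole in JSON.
EXAMPLE_RULES = [
    {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list", "watch"]},
    {
        "apiGroups": ["networking.k8s.aws"],
        "resources": ["policyendpoints"],
        "verbs": ["get", "list", "watch"],
    },
]

if __name__ == "__main__":
    print("missing verbs:", missing_verbs(EXAMPLE_RULES))
```

An empty result only says the role itself looks fine. It does not prove that the role is bound to the service account the aws-node pods actually run with, so the binding still needs a separate look.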
Hi @achevuru Sorry for the late response. It was also updated through Terraform. In my case, though, only the add-on version was set through Terraform; the configmaps, daemonset and other resources are managed by AWS. I have checked the RBAC rules for vpc-cni v1.17.1 and the required permissions are present there: `- apiGroups:
But I still see the following error in logs for v1.17.1: W0509 03:34:41.481449 1 reflector.go:462] pkg/mod/k8s.io/client-go@v0.29.1/tools/cache/reflector.go:229: watch of *v1alpha1.PolicyEndpoint ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
In another cluster running the updated version v1.18.1, I do not see those errors. I suppose it is a version-specific issue.
I hope the provided info is useful to you.
Thanks
Hey @achevuru, Working with @danielap-ma on this issue. We still see these errors even though the CNI pods have the right SA, as Daniel wrote in the above comment. Anything we can do to overcome these errors?
Hi @omfurman-ma @danielap-ma, can you please ensure you have the eks:addon-cluster-admin ClusterRoleBinding deployed in your cluster? If not, please follow the solution provided at https://repost.aws/questions/QUEAwOTFmCTLG-SzJQOhkx3w/accessdenied-when-create-ebs-csi-driver
What happened: After upgrading the vpc-cni plugin to v1.17.1 and v1.18.0, I see a lot of errors from the aws-network-policy-agent container (v1.1.0). The issue occurs even on fresh EKS installations where we are not using Network Policies.
Attach logs
W0424 08:27:34.397257 1 reflector.go:462] pkg/mod/k8s.io/client-go@v0.29.1/tools/cache/reflector.go:229: watch of *v1alpha1.PolicyEndpoint ended with: an error on the server ("unable to decode an event from the watch stream: context canceled") has prevented the request from succeeding
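The warnings reported in this thread share one shape and differ only in the cause string at the end ("context canceled" here, "http2: client connection lost" in a later comment). As a rough sketch for comparing clusters or versions, the snippet below tallies such lines by watched kind and cause. The regular expression is derived only from the two log lines quoted in this issue, and the function name and `SAMPLE_LINES` are illustrative, so other variants of the message may not match.

```python
import re
from collections import Counter

# Pattern built from the two reflector warnings quoted in this issue.
WATCH_ERROR = re.compile(
    r"watch of \*(?P<kind>\S+) ended with: an error on the server "
    r'\("unable to decode an event from the watch stream: (?P<cause>[^"]+)"\)'
)


def tally_watch_errors(lines):
    """Count reflector watch warnings, keyed by (watched kind, cause)."""
    counts = Counter()
    for line in lines:
        match = WATCH_ERROR.search(line)
        if match:
            counts[(match.group("kind"), match.group("cause"))] += 1
    return counts


# The two log lines as they appear in this thread.
SAMPLE_LINES = [
    'W0424 08:27:34.397257 1 reflector.go:462] pkg/mod/k8s.io/client-go@v0.29.1/'
    'tools/cache/reflector.go:229: watch of *v1alpha1.PolicyEndpoint ended with: '
    'an error on the server ("unable to decode an event from the watch stream: '
    'context canceled") has prevented the request from succeeding',
    'W0509 03:34:41.481449 1 reflector.go:462] pkg/mod/k8s.io/client-go@v0.29.1/'
    'tools/cache/reflector.go:229: watch of *v1alpha1.PolicyEndpoint ended with: '
    'an error on the server ("unable to decode an event from the watch stream: '
    'http2: client connection lost") has prevented the request from succeeding',
]

if __name__ == "__main__":
    for (kind, cause), count in tally_watch_errors(SAMPLE_LINES).items():
        print(f"{kind}: {cause} x{count}")
```

Feeding it the saved agent logs from each cluster gives a quick view of whether one cause dominates on a given add-on version.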
What you expected to happen: No error messages.
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
Environment:
Kubernetes version (use `kubectl version`): Client Version: v1.29.1 Server Version: v1.29.1-eks-b9c9ed7
OS (e.g. `cat /etc/os-release`): Bottlerocket OS 1.19.2 (aws-k8s-1.29)
Kernel (e.g. `uname -a`): 6.1.77