Closed mahasiva-amazon closed 8 months ago
I've faced the same issue, and the documentation is not clear on this point. Looking at /var/log/aws-routed-eni/ipamd.log on the node, it appears to be an authorization issue:
{"level":"error","ts":"2023-12-06T13:51:53.616Z","caller":"ipamd/ipamd.go:457","msg":"Failed to call ec2:DescribeNetworkInterfaces for [eni-03****** eni-07********]: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: df464bdf-eb18-4b85-*******"}
{"level":"error","ts":"2023-12-06T13:51:53.727Z","caller":"aws-k8s-agent/main.go:32","msg":"Initialization failure: ipamd init: failed to retrieve attached ENIs info: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: df464bdf-****"}
I resolved it by attaching the AmazonEKS_CNI_Policy permissions to my role.
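For readers following along, attaching that policy could look roughly like the sketch below. The role name `my-vpc-cni-role` is a placeholder and not something stated in this thread; the policy ARN is the one quoted further down. The `run` helper is only an illustration device: it prints each command unless `APPLY=1` is set.

```shell
# Hypothetical role name: replace with the role used by the aws-node service account.
ROLE_NAME="my-vpc-cni-role"
POLICY_ARN="arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"

# Print the command by default; set APPLY=1 to execute it for real.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "dry-run: $*"; fi
}

# Attach the managed CNI policy to the role.
run aws iam attach-role-policy --role-name "$ROLE_NAME" --policy-arn "$POLICY_ARN"

# List the policies attached to the role to confirm the result.
run aws iam list-attached-role-policies --role-name "$ROLE_NAME"
```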
Hi @ariary, can you give more details on how you ended up with this issue? Which role did you end up adding the permission to (the node role or the CNI add-on role)? The issue above happened because the create/update add-on call did not pass the service account role ARN to use for the CNI.
I created a dedicated role with the policy I mentioned above plus the one defined in the documentation (for CloudWatch logs).
For this role, I checked that the aws-node service account can assume it (see the trust relationship in the UI).
Then you can update your add-on by specifying the role ARN (--service-account-role-arn).
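As a rough sketch of those two steps with the AWS CLI: the cluster name, role name, and the account ID `111122223333` below are placeholders, not values from this thread. The `run` helper prints each command unless `APPLY=1` is set.

```shell
EKS_CLUSTER_NAME="my-cluster"                            # hypothetical cluster name
ROLE_NAME="my-vpc-cni-role"                              # hypothetical role name
ROLE_ARN="arn:aws:iam::111122223333:role/${ROLE_NAME}"   # placeholder account ID

# Print the command by default; set APPLY=1 to execute it for real.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "dry-run: $*"; fi
}

# 1. Inspect the trust policy. It should allow sts:AssumeRoleWithWebIdentity
#    for the kube-system/aws-node service account via the cluster's OIDC provider.
run aws iam get-role --role-name "$ROLE_NAME" --query Role.AssumeRolePolicyDocument

# 2. Point the VPC CNI add-on at that role.
run aws eks update-addon --cluster-name "$EKS_CLUSTER_NAME" --addon-name vpc-cni \
  --service-account-role-arn "$ROLE_ARN"
```

If the trust policy does not cover the aws-node service account, the add-on is likely to fail with the same AccessDenied error shown in the logs above.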
Note also that, to get logs, you need "enablePolicyEventLogs": "true" in your node agent configuration.
I have created a specific role with permissions for the policy I mentioned above + the one which is defined in the documentation
So, if I understand this correctly: you created a new role and added the CloudWatch Logs policy to it for network policy logs. The CNI then complained about not having the right authorization, which is when you added the AmazonEKS_CNI_Policy?
Exactly
Thanks for the details. We recommend adding the CloudWatch Logs policy to the existing CNI IAM role (which would already have the AmazonEKS_CNI_Policy attached). This is also called out in the prerequisites section of the docs here:
https://docs.aws.amazon.com/eks/latest/userguide/cni-network-policy.html#network-policies-troubleshooting
Add the following permissions as a stanza or separate policy to the IAM role that you are using for the VPC CNI.
Let me know if this helps
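A minimal sketch of adding the CloudWatch Logs permissions as an inline policy on the existing CNI role is shown below. The four `logs:` actions are an assumption based on the EKS user guide page linked above and should be checked against it; the role name and policy name are placeholders. The `run` helper prints the command unless `APPLY=1` is set.

```shell
ROLE_NAME="my-vpc-cni-role"                 # hypothetical role name
POLICY_NAME="vpc-cni-network-policy-logs"   # hypothetical inline policy name
POLICY_FILE="$(mktemp)"

# Assumed permission set; verify against the linked documentation.
cat > "$POLICY_FILE" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:DescribeLogGroups",
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "*"
    }
  ]
}
EOF

# Print the command by default; set APPLY=1 to execute it for real.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "dry-run: $*"; fi
}

# Add the statement to the role that already carries AmazonEKS_CNI_Policy.
run aws iam put-role-policy --role-name "$ROLE_NAME" \
  --policy-name "$POLICY_NAME" --policy-document "file://$POLICY_FILE"
```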
@jaydeokar Indeed! It might be helpful to specify which role the docs are referring to: with the "default" configuration, the add-on shows "Service account role: Inherited from node", which leads users to create a new role with only the policy mentioned.
I am experiencing the same issue: I cannot enable CloudWatch logs. The aws-node-agent goes into CrashLoopBackOff.
My VPC-CNI configs
{"enableNetworkPolicy":"true","nodeAgent":{"enableCloudWatchLogs":"true"}}
My VPC-CNI version
v1.15.0-eksbuild.2
My EKS version
1.28
I tried assigning the IAM permissions both directly to the add-on and via the role inherited from the Kubernetes nodes, with the same result. I used arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy.
This is the only log message I get in the aws-eks-nodeagent container:
{"level":"info","ts":"2024-01-03T17:16:14Z","msg":"version","GitVersion":"","GitCommit":"","BuildDate":""}
When I manually disable CloudWatch logging by editing the aws-node daemonset and overriding the CloudWatch switch, it starts working:
--enable-cloudwatch-logs=false
Here is the generated manifest for the vpc-cni driver:
aws-node.yaml.txt
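To compare a setup against the report above, the node agent's arguments and its crash output can be inspected with kubectl. This is a sketch that assumes the default names seen in this thread (the `aws-node` daemonset in `kube-system` and the `aws-eks-nodeagent` container). The `run` helper prints each command unless `APPLY=1` is set.

```shell
# Print the command by default; set APPLY=1 to execute it for real.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "dry-run: $*"; fi
}

# Show the arguments passed to the node agent container,
# including the --enable-cloudwatch-logs switch.
run kubectl -n kube-system get daemonset aws-node \
  -o jsonpath='{.spec.template.spec.containers[?(@.name=="aws-eks-nodeagent")].args}'

# Logs from the previous (crashed) instance of the container in one of the pods.
run kubectl -n kube-system logs daemonset/aws-node -c aws-eks-nodeagent --previous

# Daemonset details and events, including probe failures.
run kubectl -n kube-system describe daemonset aws-node
```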
Hi @Mihail-blip
The accept/deny logs should be available in the /aws/eks/<cluster-name>/cluster log group in CloudWatch. We don't log anything to stdout for the aws-eks-nodeagent container. Also make sure you have { "nodeAgent": {"enablePolicyEventLogs": "true"} } set so that the agent starts logging the accept/deny decisions.
There do not seem to be any open items on this issue, so closing as resolved
@Mihail-blip , you need to include the CloudWatch permissions in your IAM role (https://docs.aws.amazon.com/eks/latest/userguide/cni-iam-role.html#cni-iam-role-create-role) or in the IAM role for EKS nodes. Additionally, make sure to configure { "nodeAgent": {"enablePolicyEventLogs": "true"} } (https://github.com/aws/aws-network-policy-agent/issues/129).
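Putting the configuration keys from this thread together, an add-on update that enables both policy event logs and CloudWatch delivery might look like the sketch below. The cluster name is a placeholder, and the IAM permissions discussed above are assumed to be in place already. The `run` helper prints each command unless `APPLY=1` is set.

```shell
EKS_CLUSTER_NAME="my-cluster"   # hypothetical cluster name

# Configuration keys as quoted in this thread.
CONFIG='{"enableNetworkPolicy":"true","nodeAgent":{"enableCloudWatchLogs":"true","enablePolicyEventLogs":"true"}}'

# Print the command by default; set APPLY=1 to execute it for real.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "dry-run: $*"; fi
}

# Apply the configuration to the VPC CNI add-on.
run aws eks update-addon --cluster-name "$EKS_CLUSTER_NAME" --addon-name vpc-cni \
  --configuration-values "$CONFIG"

# Check the add-on status after the update.
run aws eks describe-addon --cluster-name "$EKS_CLUSTER_NAME" --addon-name vpc-cni \
  --query addon.status
```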
What happened:
aws eks update-addon --cluster-name ${EKS_CLUSTER_NAME} --addon-name "vpc-cni" --configuration-values '{"env":{"ENABLE_PREFIX_DELEGATION":"true", "ENABLE_POD_ENI":"true", "POD_SECURITY_GROUP_ENFORCING_MODE":"standard"},"enableNetworkPolicy": "true", "nodeAgent": { "enableCloudWatchLogs": "true", "healthProbeBindAddr": "8163", "metricsBindAddr": "8162"}}'
Attach logs
Normal Scheduled 52s default-scheduler Successfully assigned kube-system/aws-node-45nmc to ip-XXXX.us-west-2.compute.internal
Normal Pulling 52s kubelet Pulling image "XXXX.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni-init:v1.14.1-eksbuild.1"
Normal Pulled 49s kubelet Successfully pulled image "XXXX.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni-init:v1.14.1-eksbuild.1" in 2.696970025s (2.696982854s including waiting)
Normal Created 49s kubelet Created container aws-vpc-cni-init
Normal Started 49s kubelet Started container aws-vpc-cni-init
Normal Pulling 48s kubelet Pulling image "XXXX.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni:v1.14.1-eksbuild.1"
Normal Pulled 46s kubelet Successfully pulled image "XXXX.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni:v1.14.1-eksbuild.1" in 1.550764534s (1.550796824s including waiting)
Normal Created 46s kubelet Created container aws-node
Normal Started 46s kubelet Started container aws-node
Normal Pulling 46s kubelet Pulling image "602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon/aws-network-policy-agent:v1.0.2-eksbuild.1"
Normal Pulled 33s kubelet Successfully pulled image "XXXX.dkr.ecr.us-west-2.amazonaws.com/amazon/aws-network-policy-agent:v1.0.2-eksbuild.1" in 13.02422571s (13.02424398s including waiting)
Normal Created 33s kubelet Created container aws-eks-nodeagent
Normal Started 33s kubelet Started container aws-eks-nodeagent
Warning Unhealthy 28s kubelet Readiness probe failed: {"level":"info","ts":"2023-11-21T18:27:18.910Z","caller":"/root/sdk/go1.20.4/src/runtime/proc.go:250","msg":"timeout: failed to connect service \":50051\" within 5s"}
Warning Unhealthy 23s kubelet Readiness probe failed: {"level":"info","ts":"2023-11-21T18:27:23.969Z","caller":"/root/sdk/go1.20.4/src/runtime/proc.go:250","msg":"timeout: failed to connect service \":50051\" within 5s"}
Warning Unhealthy 17s kubelet Readiness probe failed: {"level":"info","ts":"2023-11-21T18:27:29.021Z","caller":"/root/sdk/go1.20.4/src/runtime/proc.go:250","msg":"timeout: failed to connect service \":50051\" within 5s"}
Warning Unhealthy 12s kubelet Readiness probe failed: {"level":"info","ts":"2023-11-21T18:27:34.077Z","caller":"/root/sdk/go1.20.4/src/runtime/proc.go:250","msg":"timeout: failed to connect service \":50051\" within 5s"}
Warning Unhealthy 7s kubelet Readiness probe failed: {"level":"info","ts":"2023-11-21T18:27:39.591Z","caller":"/root/sdk/go1.20.4/src/runtime/proc.go:250","msg":"timeout: failed to connect service \":50051\" within 5s"}
What you expected to happen:
kubectl version: 1.27
cat /etc/os-release: Amazon Linux
uname -a: Linux ..... 5.10.186-179.751.amzn2.x86_64 #1 SMP Tue Aug 1 20:51:38 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux