Closed rtomadpg closed 10 months ago
@rtomadpg just curious, did you notice the comment with:
For Network Policy issues, please file at https://github.com/aws/aws-network-policy-agent/issues
when you opened this issue? We are trying to improve the experience here with triaging Network Policy agent issues, so I am wondering if you think there is a better way this could have been noticed.
As for this issue, it is the same as https://github.com/aws/aws-network-policy-agent/issues/103. The error log is harmless, and a fix is in progress.
Ouch, so sorry! I checked the new bug flow and indeed that comment is there. Very clearly. I guess I was too eager to file the bug (end of work day here) and I overlooked that part.
@jdn5126 maybe a suggestion: when errors are logged by a container named `aws-eks-nodeagent`, it's not immediately clear that they relate to "Network Policy issues" or `aws-network-policy-agent`. Maybe mentioning `aws-eks-nodeagent` in that comment would reduce wrongly filed issues?
Oh no worries, I was just curious if there was a better setup through GitHub. Good call, I can expand the comment
Hi everyone, sorry for jumping in on a closed thread.
I'm facing the same issue, but without the network policy error mentioned here. I'm trying to upgrade a managed worker group to 1.25, but the aws-node DaemonSet keeps failing in the `aws-eks-nodeagent` container, causing the pod to restart.
Any ideas? The VPC CNI plugin version is v1.15.1-eksbuild.1.
@lsabreu96 the error log from this issue is harmless. If you are seeing the `aws-eks-nodeagent` container crashing, please file a new issue with the logs from the crash, which you can find in `/var/log/aws-routed-eni/network-policy-agent.log` on the affected node.
For anyone reaching this thread because the `aws-eks-nodeagent` container is crashing with `UTC Logger.check error: failed to get caller`: for me, the issue was mixing EKS Kubernetes version 1.24 with `aws-network-policy-agent:v1.0.4-eksbuild.1` and `amazon-k8s-cni:v1.15.1-eksbuild.1` (these versions were automatically provisioned by EKS). Upgrading to Kubernetes version 1.25 fixes the crash loop, as mentioned in the README of this repo ("You'll need a Kubernetes cluster version 1.25+ to run against.").
I'm not commenting to reopen this issue, just to provide information in case anyone still running 1.24 lands here!
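If you want to check which versions your cluster is actually running before upgrading, something like the following works (a sketch assuming the standard add-on layout, with the `aws-node` DaemonSet in the `kube-system` namespace):

```shell
# List each container in the aws-node DaemonSet with its image, which
# shows the amazon-k8s-cni and aws-eks-nodeagent versions in use.
kubectl -n kube-system get daemonset aws-node \
  -o jsonpath='{range .spec.template.spec.containers[*]}{.name}{"\t"}{.image}{"\n"}{end}'

# Confirm the cluster's server version (the network policy agent needs 1.25+).
kubectl version
```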
What happened:
After upgrading VPC-CNI from `v1.14.1-eksbuild.1` to `v1.15.4-eksbuild.1`, all the `aws-eks-nodeagent` containers logged:
And, when I delete a random aws-node pod, I see this:
I believe these errors come from the `uber-go/zap` dependency, see https://github.com/uber-go/zap/blob/5acd569b6a5264d4c7433cbb278a8336d491715c/logger.go#L398. As I am unsure whether this error signals that something is (really) wrong, and it has not been reported in this project yet, I created this bug.
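The mechanism behind that zap message can be reproduced with the standard library alone. A minimal sketch (not zap's actual code, just the underlying behavior): zap resolves the call site of each log entry via `runtime.Caller` with a configured frame-skip, and when the skip count exceeds the real stack depth the lookup fails, which is when zap prints the `Logger.check error: failed to get caller` line to its error output.

```go
package main

import (
	"fmt"
	"runtime"
)

// callerLookupFails reports whether runtime.Caller fails when asked to
// skip the given number of stack frames. zap uses runtime.Caller the same
// way to annotate log entries with their call site; if the configured skip
// is deeper than the actual stack, the lookup fails and zap emits
// "Logger.check error: failed to get caller". The message is cosmetic:
// the log entry itself is still written, only the caller field is missing.
func callerLookupFails(skip int) bool {
	_, _, _, ok := runtime.Caller(skip)
	return !ok
}

func main() {
	fmt.Println(callerLookupFails(0))   // false: the current frame exists
	fmt.Println(callerLookupFails(100)) // true: deeper than the real stack
}
```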
Attach logs
Let me know if needed.
What you expected to happen:
No errors getting logged.
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
Environment:
- `kubectl version`: v1.27.7-eks-4f4795d
- `cat /etc/os-release`: Amazon Linux 2
- `uname -a`: