Closed by tl-alex-nicot 3 months ago
It shouldn't cause any functionality impact but we will clean it up, the error is from the Uber/zap logger.
Hey!
Using VPC CNI v1.16.0 I'm still facing this issue, which makes `enable-policy-event-logs` and the logging feature pretty useless. The cluster runs EKS 1.27 with managed node groups (1.27.9-20240117): any chance to get a pointer on how to fix this and get my hands on policy event logs?
@Jufik - Are you checking the pod logs for policy logs when you enable `enable-policy-event-logs`? When the knob is enabled, the decision/access logs are redirected to /var/log/aws-routed-eni/network-policy-agent.log. The same log file will also contain node agent logs.
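To check what lands in that file you need access to the node (e.g. via SSM or SSH). The snippet below simulates one policy-event-style log line and filters it the way you would on a real node; the field names are illustrative, not the agent's exact format.

```shell
# Simulated example: on a real node the file is
# /var/log/aws-routed-eni/network-policy-agent.log and reading it
# typically needs sudo. The JSON line below is only a stand-in for
# the agent's actual verdict entries.
LOG=$(mktemp)
echo '{"level":"info","msg":"Flow Info","verdict":"ACCEPT"}' > "$LOG"
grep -c '"verdict"' "$LOG"
rm -f "$LOG"
```

On a real node, replace the temp file with the log path above and grep for the verdict entries you care about.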
Am I doing something wrong? I am using 1.16.3 and still see `Logger.check error: failed to get caller` logged constantly when policy event logs are enabled. I'd rather not log redundant, unimportant lines on every single pod on each node. Is it possible to change the log level so I don't have to log these to disk at all?
The problem persists, and I'm encountering it with aws-network-policy-agent:v1.0.8-eksbuild.1 and eks 1.27. Can we consider reopening the issue? @jayanthvn @jdn5126
Additionally, when I enable both the enable-cloudwatch-logs and enable-policy-event-logs parameters, the pods get stuck in a crashloopbackoff state with exit code 1, and no logs are generated.
I can confirm that this issue still persists with EKS 1.28 / aws-network-policy-agent:v1.1.0-eksbuild.1 Can this issue be /reopen 'ed?
I'm seeing a similar issue with EKS 1.29 and CNI v1.16.2-eksbuild.1. Looks like the issue needs to be reopened, unless we're missing something?
```
{"level":"info","ts":"2024-03-26T18:30:07.320Z","caller":"runtime/asm_amd64.s:1650","msg":"version","GitVersion":"","GitCommit":"","BuildDate":""}
2024-03-26 18:30:07.336116596 +0000 UTC Logger.check error: failed to get caller
```
I was getting this error with EKS 1.29 and VPC CNI v1.17.1, but it went away after commenting out the `serviceAccountRoleArn` property in my CDK resource:
```ts
new eks.CfnAddon(this, "VpcCniAddon", {
  clusterName: cluster.clusterName,
  addonName: "vpc-cni",
  addonVersion: "v1.17.1-eksbuild.1",
  resolveConflicts: "PRESERVE",
  // serviceAccountRoleArn: cluster.role.roleArn,
  configurationValues: JSON.stringify({
    env: {
      ENABLE_PREFIX_DELEGATION: "true",
      WARM_PREFIX_TARGET: "1",
    },
  }),
});
```
That potentially indicates that the role is missing some permissions that are present on the node group's role.
Using VPC CNI addon 1.17.1, based on observing log data, this error seems to occur only when a policy verdict is being made. So the more verdicts you have, the more spammy this log is.
If it helps anything, our settings are:
```json
{
  "enableNetworkPolicy": "true",
  "env": {
    "AWS_VPC_ENI_MTU": "1480",
    "AWS_VPC_K8S_CNI_LOG_FILE": "stdout",
    "AWS_VPC_K8S_PLUGIN_LOG_FILE": "stderr",
    "AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG": "true"
  },
  "nodeAgent": {
    "enablePolicyEventLogs": "true",
    "enableCloudWatchLogs": "true"
  }
}
```
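For anyone wiring this up in CDK like the snippet earlier in the thread, the same settings can be passed as the addon's `configurationValues` string. A minimal sketch, where the keys mirror the config quoted here and the values are examples, not recommendations:

```typescript
// Sketch: build the vpc-cni addon configurationValues JSON enabling the
// node agent's policy event logs and CloudWatch delivery. Values are
// illustrative; trim env settings to what your cluster actually needs.
const configurationValues = JSON.stringify({
  enableNetworkPolicy: "true",
  env: {
    AWS_VPC_K8S_CNI_LOG_FILE: "stdout",
  },
  nodeAgent: {
    enablePolicyEventLogs: "true",
    enableCloudWatchLogs: "true",
  },
});

// Sanity-check: the string must round-trip as valid JSON with the
// nodeAgent knobs set (a trailing comma in hand-written JSON would not).
const parsed = JSON.parse(configurationValues);
console.log(parsed.nodeAgent.enablePolicyEventLogs); // "true"
```

Building the object in code and serializing it avoids the hand-written-JSON pitfalls (like the trailing comma) entirely.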
We also have some IRSA policy attached to deliver the cloudwatch logs (which are working, by the way -- we see policy verdicts in CW)
The fix for the original issue of logs showing `failed to get caller` is released with network policy agent v1.1.2, shipped in VPC CNI v1.18.2: https://github.com/aws/amazon-vpc-cni-k8s/releases/tag/v1.18.2. Please test and let us know if there are any issues.
Looks like with v1.15.1 of the AWS VPC CNI, the output for the aws-eks-nodeagent just says