Open luisamador opened 3 years ago
I'm also running into this issue.
The plugin used to only set the log retention on new log groups. This means if you have run the same config before then the log group might already exist, and the plugin will not update the retention.
We updated this recently and released it in AWS for Fluent Bit 2.10.0 for the cloudwatch plugin: https://github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/issues/121
I also have the same issue for both existing and new log groups.
In my case, an action was missing from the AWS IAM policy used by the Fluent Bit pods. Adding "logs:putRetentionPolicy" solved the problem.
I have the same problem in version 2.21.4. Retention for new and existing log groups is always set to Never.
The problem was indeed the missing logs:putRetentionPolicy permission. I use eksctl to manage my EKS cluster, and all my nodes have this IAM configuration (ref.: https://eksctl.io/usage/iam-policies/#supported-iam-add-on-policies):
nodeGroups:
  - ...
    iam:
      withAddonPolicies:
        cloudWatch: true
In practice, nodes have this policy: arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy. It contains the following:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "cloudwatch:PutMetricData",
                "ec2:DescribeVolumes",
                "ec2:DescribeTags",
                "logs:PutLogEvents",
                "logs:DescribeLogStreams",
                "logs:DescribeLogGroups",
                "logs:CreateLogStream",
                "logs:CreateLogGroup"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "ssm:GetParameter"
            ],
            "Resource": "arn:aws:ssm:*:*:parameter/AmazonCloudWatch-*"
        }
    ]
}
And that's the problem: these permissions are insufficient, because logs:PutRetentionPolicy is not among the allowed actions.
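For anyone who wants to check their own role before redeploying, here is a minimal Python sketch that tests whether a policy document lists a given action. The helper name `policy_allows` is hypothetical, and the check is deliberately simplified: it only looks at `Allow` statements and their `Action` patterns, and ignores `Resource`, `Condition`, `NotAction`, and explicit `Deny`, so treat it as a quick sanity check rather than a replacement for IAM's own evaluation.

```python
from fnmatch import fnmatchcase


def policy_allows(policy: dict, action: str) -> bool:
    """Return True if any Allow statement has an Action pattern matching `action`.

    Simplified on purpose: Resource, Condition, NotAction and Deny are ignored.
    """
    wanted = action.lower()  # IAM action names are matched case-insensitively
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be given as an object
        statements = [statements]
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # "Action" may be a string or a list
            actions = [actions]
        # fnmatchcase handles the "*" and "?" wildcards used in action patterns
        if any(fnmatchcase(wanted, pattern.lower()) for pattern in actions):
            return True
    return False


# The CloudWatchAgentServerPolicy document quoted above
node_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "cloudwatch:PutMetricData",
                "ec2:DescribeVolumes",
                "ec2:DescribeTags",
                "logs:PutLogEvents",
                "logs:DescribeLogStreams",
                "logs:DescribeLogGroups",
                "logs:CreateLogStream",
                "logs:CreateLogGroup",
            ],
            "Resource": "*",
        },
        {
            "Effect": "Allow",
            "Action": ["ssm:GetParameter"],
            "Resource": "arn:aws:ssm:*:*:parameter/AmazonCloudWatch-*",
        },
    ],
}

print(policy_allows(node_policy, "logs:CreateLogGroup"))      # True
print(policy_allows(node_policy, "logs:PutRetentionPolicy"))  # False
```

The second call returning False matches what the thread found: the plugin can create the log group but cannot set its retention.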
@illagrenan, exactly. The AWS documentation needs to be updated.
I have been seeing a similar issue where "log_retention_days" was not being set on our "additionalOutputs", and the retention stayed at "Never expire".
Versions,
Extract from aws-for-fluentbit-values.yaml
***etc***
additionalOutputs: |
  [OUTPUT]
      Name                cloudWatch
      Enabled             true
      Match               ebs-csi.*
      Region              ap-southeast-2
      Log_Group_Name      /aws/eks/container-workload/xxxxx-ebs-csi
      Log_Stream_Prefix   fluentbit-
      Log_Retention_Days  14
      Auto_Create_Group   true
  [OUTPUT]
      Name                cloudWatch
      Enabled             true
      Match               xxxxx-sm.*
      Region              ap-southeast-2
      Log_Group_Name      /aws/eks/container-workload/xxxxx-xxxxx-sm
      Log_Stream_Prefix   fluentbit-
      Log_Retention_Days  14
      Auto_Create_Group   true
***etc***
After reading the previous posts, I observed that if the missing permission "logs:PutRetentionPolicy" is added manually (it is not there by default) and I rerun the pipeline, the permission is removed again. It should be added to the permanent list.
Error from the logs when trying to set log_retention_days:
time="2022-06-08T05:27:14Z" level=error msg="AccessDeniedException: User: arn:aws:sts::************:assumed-role/container-workload-aws-for-fluent-bit-sa-irsa/1654666034225831554 is not authorized to perform: logs:PutRetentionPolicy on resource: arn:aws:logs:ap-southeast-2:************:log-group:/aws/eks/container-workload/*****-cert-manager:log-stream: because no identity-based policy allows the logs:PutRetentionPolicy action\n\tstatus code: 400, request id: c2209996-****-4ae8-*****-03d4272f16f6" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:340"
I manually added the missing permission back, deleted the log groups so they would be forced to be recreated, and restarted the Fluent Bit daemonset, which recreates the log groups. The log retention is now set correctly.
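If deleting the log groups is not an option, the retention can probably also be set on the existing groups directly. Below is a rough Python sketch of that idea, not something taken from the plugin. It assumes a boto3 CloudWatch Logs client is passed in and that the calling identity itself has logs:DescribeLogGroups and logs:PutRetentionPolicy. The function name `apply_retention` is hypothetical; the prefix and retention value in the usage comment are taken from the config above.

```python
def apply_retention(logs_client, name_prefix: str, days: int) -> list:
    """Set `days` of retention on every existing log group under `name_prefix`.

    `logs_client` is expected to behave like boto3.client("logs").
    Returns the names of the log groups that were updated.
    """
    updated = []
    paginator = logs_client.get_paginator("describe_log_groups")
    for page in paginator.paginate(logGroupNamePrefix=name_prefix):
        for group in page.get("logGroups", []):
            # "retentionInDays" is absent when the group is set to "Never expire"
            if group.get("retentionInDays") == days:
                continue  # already correct, nothing to do
            logs_client.put_retention_policy(
                logGroupName=group["logGroupName"],
                retentionInDays=days,
            )
            updated.append(group["logGroupName"])
    return updated


# Example usage (assumes boto3 is installed and credentials are configured):
#
#   import boto3
#   client = boto3.client("logs", region_name="ap-southeast-2")
#   print(apply_retention(client, "/aws/eks/container-workload/", 14))
```

Passing the client in as a parameter keeps the function easy to try against a stub before pointing it at a real account.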
Describe the bug
The option "cloudWatch.logRetentionDays" doesn't set the log retention days setting of the resulting CloudWatch log group.
Steps to reproduce
Expected outcome
The resulting log group should have a retention policy of 3 days. However, it is set with a "Never expire" retention policy.
Environment