mukshe01 opened 7 months ago
In the ConfigMap aws-for-fluent-bit, I had to add `auto_create_group true` to the bottom of the [OUTPUT] section and restart the pod; after that it worked:
```
[OUTPUT]
    Name              cloudwatch_logs
    Match             *
    region            ap-northeast-1
    log_group_name    /aws/eks/ca-prod/aws-fluentbit-logs
    log_stream_prefix fluentbit-
```
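Concretely, the [OUTPUT] block after the change (same values as above, with the new line appended at the bottom):

```
[OUTPUT]
    Name              cloudwatch_logs
    Match             *
    region            ap-northeast-1
    log_group_name    /aws/eks/ca-prod/aws-fluentbit-logs
    log_stream_prefix fluentbit-
    auto_create_group true
```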
Have you inspected the fluent-bit containers' CPU usage, or considered increasing the resource settings in the chart?
Hi Team,
We are running Fluent Bit to push application logs from our Kubernetes cluster (an EKS cluster with EC2 machines as the k8s nodes) to CloudWatch. Recently we observed that some log entries are missing in CloudWatch when the system is under high load.
Below is our Fluent Bit config (fluent-bit.conf):

```
[SERVICE]
    HTTP_Server            On
    HTTP_Listen            0.0.0.0
    HTTP_PORT              2020
    Health_Check           On
    HC_Errors_Count        5
    HC_Retry_Failure_Count 5
    HC_Period              5
    Parsers_File           /fluent-bit/parsers/parsers.conf

[INPUT]
    Name              tail
    Tag               kube.*
    Path              /var/log/containers/*.log
    DB                /var/log/flb_kube.db
    Parser            docker
    Docker_Mode       On
    Mem_Buf_Limit     5MB
    Skip_Long_Lines   On
    Refresh_Interval  10

[FILTER]
    Name                kubernetes
    Match               kube.*
    Kube_URL            https://kubernetes.default.svc.cluster.local:443
    Merge_Log           On
    Merge_Log_Key       data
    Keep_Log            On
    K8S-Logging.Parser  On
    K8S-Logging.Exclude On
    Buffer_Size         2048k

[OUTPUT]
    Name                cloudwatch_logs
    Match               *
    region              us-east-1
    log_group_name      /aws/containerinsights/one-source-qa-n5p1P1d1/application-new
    log_stream_prefix   fluentbit-
    log_stream_template $kubernetes['namespace_name'].$kubernetes['container_name']
    auto_create_group   true
```
We installed Fluent Bit in our k8s cluster using the Helm chart https://github.com/aws/eks-charts/tree/master/stable/aws-for-fluent-bit (Fluent Bit appVersion 2.31.11, Helm chart version 0.1.28).
We are seeing two types of errors in the Fluent Bit logs:
```
2024-03-19T11:32:29.235887301Z stderr F [2024/03/19 11:32:29] [ info] [input:tail:tail.0] inode=26222862 handle rotation(): /var/log/containers/rest-api-qa-954d864f9-smkv5_participant1-qa_rest-api-c5dac2e011fe0f093560b815135fff49dfade0835e22fd71c88aed4fa4d86439.log => /var/log/pods/participant1-qa_rest-api-qa-954d864f9-smkv5_319b4e14-e50c-44c6-86ff-558547bbcb3c/rest-api/0.log.20240319-113228
2024-03-19T11:32:29.488386964Z stderr F [2024/03/19 11:32:29] [ info] [input] tail.0 resume (mem buf overlimit)
2024-03-19T11:32:49.909327531Z stderr F [2024/03/19 11:32:49] [ info] [input] tail.0 resume (mem buf overlimit)
2024-03-19T11:32:49.911154349Z stderr F [2024/03/19 11:32:49] [error] [plugins/in_tail/tail_file.c:1432 errno=2] No such file or directory
2024-03-19T11:32:49.911160979Z stderr F [2024/03/19 11:32:49] [error] [plugins/in_tail/tail_fs_inotify.c:147 errno=2] No such file or directory
2024-03-19T11:32:49.911163819Z stderr F [2024/03/19 11:32:49] [error] [input:tail:tail.0] inode=26222863 cannot register file /var/log/containers/rest-api-qa-954d864f9-smkv5_participant1-qa_rest-api-c5dac2e011fe0f093560b815135fff49dfade0835e22fd71c88aed4fa4d86439.log
```
We also see many occurrences of the following (our memory buffer setting is Mem_Buf_Limit 5MB) when the system is under high load:
```
2024-03-20T13:29:12.624465969Z stderr F [2024/03/20 13:29:12] [ warn] [input] tail.0 paused (mem buf overlimit)
2024-03-20T13:29:12.915368764Z stderr F [2024/03/20 13:29:12] [ info] [input] tail.0 resume (mem buf overlimit)
2024-03-20T13:29:12.923306843Z stderr F [2024/03/20 13:29:12] [ warn] [input] tail.0 paused (mem buf overlimit)
2024-03-20T13:29:12.954591621Z stderr F [2024/03/20 13:29:12] [ info] [input] tail.0 resume (mem buf overlimit)
2024-03-20T13:29:12.956495689Z stderr F [2024/03/20 13:29:12] [ warn] [input] tail.0 paused (mem buf overlimit)
2024-03-20T13:29:13.527593998Z stderr F [2024/03/20 13:29:13] [ info] [input] tail.0 resume (mem buf overlimit)
```
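Not an answer from this thread, but a common mitigation for `paused (mem buf overlimit)` drops is to back the tail input with Fluent Bit's filesystem storage, so chunks spill to disk instead of pausing the input when the memory buffer fills. A minimal sketch; the storage path is illustrative and must be a writable hostPath on the node:

```
[SERVICE]
    # Disk-backed buffering so chunks survive input pauses
    # (path is an example; mount a writable hostPath there)
    storage.path          /var/log/flb-storage/
    storage.sync          normal
    storage.checksum      off
    storage.max_chunks_up 128

[INPUT]
    Name              tail
    Tag               kube.*
    Path              /var/log/containers/*.log
    # Buffer chunks to filesystem instead of memory only
    storage.type      filesystem
    Mem_Buf_Limit     5MB
```

With `storage.type filesystem`, Mem_Buf_Limit no longer causes the input to pause; it only bounds how much of the buffer is held in memory at once.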
FYI: Kubernetes rotates container logs when they reach 10 MB, and when the system is under high load the rotation is very frequent.
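Since frequent rotation seems related to the `cannot register file` errors, one knob worth noting (an assumption on my part, not something confirmed in this thread) is the tail plugin's Rotate_Wait option, which controls how many seconds Fluent Bit keeps monitoring a rotated file before giving up on it (the default is 5):

```
[INPUT]
    Name          tail
    Tag           kube.*
    Path          /var/log/containers/*.log
    # Keep watching rotated files longer before dropping them;
    # 5s is the default, which can be too short under fast rotation
    Rotate_Wait   30
```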
Could you check our config and let us know how we can avoid missing logs in CloudWatch? Please let us know if you need any more info from us.
Regards,
Shekhar