aws-samples / amazon-cloudwatch-container-insights

CloudWatch Agent Dockerfile and K8s YAML templates for CloudWatch Container Insights.
MIT No Attribution

reduce default log level from info to error to reduce unnecessary logs being published #155

Closed chadpatel closed 8 months ago

chadpatel commented 8 months ago

Description of the issue

Fluent Bit's default log level is set to info, which is quite chatty. This causes customers to pay unnecessarily large log bills, particularly when their logs are not in the "kubernetes" format, which results in this log message:

[2022/06/30 06:09:29] [ warn] [record accessor] translation failed, root key=kubernetes

https://docs.fluentbit.io/manual/v/1.9-pre/pipeline/outputs/cloudwatch

If the kubernetes structure is not found in the log record, then the log_group_name and log_stream_prefix will be used instead, and Fluent Bit will log an error like:
...

Other customers have reported a large volume of this unnecessary warning:

2023-09-21T20:14:51.860592207Z stderr F [2023/09/21 20:14:51] [ warn] [parser:_ml_cri] invalid time format %Y-%m-%dT%H:%M:%S.%L%z for '2023-09-21T20:14:51.860487043Z stderr F [2023/09/21 20:14:51] [ warn] [parser:_ml_cri] invalid time format %Y-%m-%dT%H:%M:%S.%L%z for '2023-09-21T20:14:51.860372704Z'

Description of changes

Change the default log level in our samples from info to error
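
As a rough sketch of what this change amounts to, assuming the samples set the level through the Log_Level key in the [SERVICE] section of the Fluent Bit configuration (the exact file and the surrounding keys are not shown in this description and are omitted here):

```
[SERVICE]
    # Assumed placement; other [SERVICE] keys omitted.
    # Before this change the samples used: Log_Level info
    Log_Level    error
```

With the level at error, Fluent Bit no longer emits its own info and warn messages, which covers both warnings quoted above.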

License

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Tests

I spun up a new cluster and deployed the enhanced Container Insights (CI) YAML:

ClusterName=fbtest
RegionName=us-east-1
FluentBitHttpPort='2020'
FluentBitReadFromHead='Off'
[[ ${FluentBitReadFromHead} = 'On' ]] && FluentBitReadFromTail='Off'|| FluentBitReadFromTail='On'
[[ -z ${FluentBitHttpPort} ]] && FluentBitHttpServer='Off' || FluentBitHttpServer='On'
curl https://raw.githubusercontent.com/chadpatel/amazon-cloudwatch-container-insights/3d1d46afc3f1d5eec419bdbf40ab365d0653c228/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/quickstart/cwagent-fluent-bit-quickstart-enhanced.yaml | sed 's/{{cluster_name}}/'${ClusterName}'/;s/{{region_name}}/'${RegionName}'/;s/{{http_server_toggle}}/"'${FluentBitHttpServer}'"/;s/{{http_server_port}}/"'${FluentBitHttpPort}'"/;s/{{read_from_head}}/"'${FluentBitReadFromHead}'"/;s/{{read_from_tail}}/"'${FluentBitReadFromTail}'"/' | kubectl apply -f - 

After 30 minutes, the new cluster had no logs in /application coming from the Fluent Bit application.

This is what I expect, because the fluent-bit pod itself is also not logging:

➜  amazon-cloudwatch-container-insights git:(fluentbit-error-logs) kubectl logs fluent-bit-nw46d -n amazon-cloudwatch
amazon-cloudwatch   fluent-bit-nw46d           1/1     Running   0             34m
Fluent Bit v1.9.10
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io
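
To put a number on this check rather than eyeballing the pod output, one option is to count Fluent Bit's own log lines by level. The helper below is a minimal sketch: the function name count_fb_level is hypothetical, and it assumes the level tag appears right after the timestamp, as in the "[ warn]" lines quoted earlier.

```shell
# count_fb_level LEVEL
# Reads Fluent Bit log text on stdin and prints how many lines carry the
# given level tag right after the timestamp, e.g. "[ warn]" or "[error]".
# (Hypothetical helper, not part of this repository.)
count_fb_level() {
  level="$1"
  # grep -c prints the match count; it exits non-zero when the count is 0,
  # so mask that status to keep the function usable under `set -e`.
  grep -c -E "\] \[ *${level}\] " || true
}
```

For example, `kubectl logs fluent-bit-nw46d -n amazon-cloudwatch | count_fb_level warn` should print 0 for the pod shown above, since only the startup banner is logged.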

Requirements

Before committing the code, please verify the following:

Yes, this will have an impact on existing customer behavior. The volume of fluent-bit "application" logs will decrease dramatically, which will lower customers' bills.