As a developer/operator of Notify, I would like my Kubernetes logging to be as straightforward and hands-off as possible so that logs are stored consistently. Since switching from Fluentd to Fluent Bit, we have had issues with logs not being stored consistently and with multiline parsing breaking.
WHY are we building?
Amazon has switched from Fluentd to Fluent Bit for their CloudWatch Kubernetes configuration.
A lot of time is being spent debugging and maintaining Fluent Bit, which eats into the development time available to address OKRs.
WHAT are we building?
Determine whether Fluentd can still be used with the existing AWS CloudWatch agents, and implement it as necessary.
VALUE created by our solution
A more robust logging solution for Kubernetes will increase stability and free up time for working on tasks related to OKRs.
Acceptance Criteria
[ ] Determine whether Fluentd can still be used with the AWS CloudWatch agent
[ ] Implement Fluentd
[ ] Validate that the issues we are experiencing with Fluent Bit are no longer present with Fluentd
QA Steps
[ ] Verify all K8s logs are in CloudWatch
[ ] Restart all pods
[ ] Verify new pod logs are in CloudWatch
[ ] Use the new debug endpoint to create multiline errors on api and admin
[ ] Validate that multiline errors are parsed as expected in CloudWatch
[ ] Restart all pods
[ ] Validate that logs are still being sent to CloudWatch
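
The two verification steps above could be partly scripted. The following is a minimal sketch, not the team's tooling. It assumes that each pod's CloudWatch log stream name contains the pod name, and that a multiline error counts as "working" when its first and last lines arrive inside a single log event. The helper names `pods_missing_logs` and `is_single_event`, and all sample pod, stream, and error values, are hypothetical.

```python
from typing import Iterable, List


def pods_missing_logs(pod_names: Iterable[str], stream_names: Iterable[str]) -> List[str]:
    """Return the pods for which no log stream name contains the pod name.

    Assumption: the log shipper embeds the pod name in the stream name.
    """
    streams = list(stream_names)
    return sorted(p for p in pod_names if not any(p in s for s in streams))


def is_single_event(events: Iterable[str], first_line: str, last_line: str) -> bool:
    """Return True if one log event holds both the first and last line of an error.

    If multiline parsing is broken, the two lines land in separate events.
    """
    return any(first_line in e and last_line in e for e in events)


# Hypothetical sample data, standing in for real pod and stream listings.
pods = ["api-7d9f", "admin-5c2b"]
streams = ["kubernetes.var.log.containers.api-7d9f_notify_api-abc.log"]
missing = pods_missing_logs(pods, streams)

# One joined event versus the same error split across two events.
joined = ["Traceback (most recent call last):\n  File \"app.py\"\nValueError: debug"]
split = ["Traceback (most recent call last):", "ValueError: debug"]
joined_ok = is_single_event(joined, "Traceback", "ValueError: debug")
split_ok = is_single_event(split, "Traceback", "ValueError: debug")
```

In practice the pod list might come from `kubectl get pods --all-namespaces` and the stream names from `aws logs describe-log-streams`, run once before and once after each pod restart. How stream names are actually formed depends on the shipper configuration, so the substring match should be checked against the real streams first.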