We are using aws-fluent-bit to route the logs from our main container to Datadog. I was recently troubleshooting an issue where an ECS Fargate task was exiting because one of its essential containers exited, but I couldn't find any logs in Datadog indicating a failure. I then disabled fluent-bit logging so that the task logs went to CloudWatch instead. After that, when the task exited, I was able to see the application log messages in the ECS Fargate console indicating the errors (required environment variables were missing). I suspect the main container exited so quickly that fluent-bit either did not receive the logs or did not ship them to Datadog before the task was terminated. How can I prevent this from happening?
Could you try the changes in Wesley's PR, which changes the grace-period behavior during shutdown (https://github.com/aws/aws-for-fluent-bit/pull/829)? That might be what kept the logs from getting ingested.
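Until that change is available, one mitigation worth trying (a sketch under assumptions, not a verified fix for this issue) is to raise Fluent Bit's shutdown grace period with a custom configuration file, so that buffered chunks have more time to flush to the Datadog output before the process exits. The `Grace` key in the `[SERVICE]` section controls how long Fluent Bit waits for pending flushes on shutdown (it defaults to a few seconds):

```
[SERVICE]
    # How long (in seconds) Fluent Bit waits on shutdown
    # for outputs to flush pending chunks before exiting.
    Grace    30
```

With FireLens, a custom file like this can be layered on top of the generated configuration (for example via the image's config-file support, or Copilot's logging configuration options); the exact wiring depends on how your task is deployed, so treat the above as the setting to apply rather than a complete deployment recipe.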
Configuration
Logging is configured via AWS ECS Copilot.
Fluent Bit Version Info
7.57.2
Cluster Details
ECS Fargate with Fluent Bit deployed as a Sidecar with awsvpc networking.
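For a sidecar setup like this, it can also help to make the application container depend on the log router and to give the log router a longer stop timeout, so that ECS starts Fluent Bit first and keeps it alive after the app container exits. Below is a hedged task-definition sketch; the container names (`app`, `log_router`), the health-check command, and the Datadog output options are illustrative assumptions, not taken from this issue:

```json
{
  "containerDefinitions": [
    {
      "name": "app",
      "essential": true,
      "dependsOn": [
        { "containerName": "log_router", "condition": "HEALTHY" }
      ],
      "logConfiguration": {
        "logDriver": "awsfirelens",
        "options": {
          "Name": "datadog",
          "apikey": "<DATADOG_API_KEY>",
          "provider": "ecs"
        }
      }
    },
    {
      "name": "log_router",
      "image": "public.ecr.aws/aws-observability/aws-for-fluent-bit:stable",
      "essential": true,
      "stopTimeout": 120,
      "firelensConfiguration": { "type": "fluentbit" },
      "healthCheck": {
        "command": ["CMD-SHELL", "exit 0"],
        "interval": 5,
        "retries": 3
      }
    }
  ]
}
```

The `dependsOn` condition with `HEALTHY` requires a health check on the sidecar (replace the placeholder `exit 0` command with a real check), and `stopTimeout` on Fargate is capped at 120 seconds. With Copilot, the equivalent settings would live in the service manifest rather than a raw task definition.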