Open mic-kul opened 2 weeks ago
Created a pull request to use nextForwardToken instead of two consecutive empty event lists to determine EOF when pulling CloudWatch logs.
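To illustrate the difference between the two end-of-stream checks, here is a minimal sketch. It is not the code from the pull request. `fetch_page` is a hypothetical callable standing in for a GetLogEvents call, and the simulated pages are made up for illustration. It relies on two documented GetLogEvents behaviors: a call can return an empty page while more events remain behind the token, and at the end of the stream the returned nextForwardToken equals the token that was passed in.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical: fetch_page(token) returns a dict shaped like a GetLogEvents
# response, with "events" and "nextForwardToken" keys.
FetchPage = Callable[[Optional[str]], Dict]


def read_until_two_empty(fetch_page: FetchPage, max_pages: int = 1000) -> List[Dict]:
    """Simplified rendering of the old heuristic: stop after two empty pages in a row."""
    events: List[Dict] = []
    token: Optional[str] = None
    empty_streak = 0
    for _ in range(max_pages):
        page = fetch_page(token)
        if page["events"]:
            empty_streak = 0
            events.extend(page["events"])
        else:
            empty_streak += 1
            if empty_streak == 2:
                break
        token = page["nextForwardToken"]
    return events


def read_until_token_repeats(fetch_page: FetchPage, max_pages: int = 1000) -> List[Dict]:
    """Stop only when nextForwardToken equals the token that was just sent."""
    events: List[Dict] = []
    token: Optional[str] = None
    for _ in range(max_pages):
        page = fetch_page(token)
        events.extend(page["events"])
        next_token = page["nextForwardToken"]
        if next_token == token:
            break  # the stream did not advance: nothing more to read right now
        token = next_token
    return events


# Simulated stream (made-up data): two empty pages sit between two real events.
PAGES = {
    None: {"events": [{"message": "start"}], "nextForwardToken": "t1"},
    "t1": {"events": [], "nextForwardToken": "t2"},
    "t2": {"events": [], "nextForwardToken": "t3"},
    "t3": {"events": [{"message": "done"}], "nextForwardToken": "t4"},
    "t4": {"events": [], "nextForwardToken": "t4"},
}


def fake_fetch(token: Optional[str]) -> Dict:
    return PAGES[token]


if __name__ == "__main__":
    print([e["message"] for e in read_until_two_empty(fake_fetch)])
    print([e["message"] for e in read_until_token_repeats(fake_fetch)])
```

On the simulated stream, the two-empty-pages check stops before the "done" event, while the token check reads it. A repeated token only means that no further events are available at that moment, so events the agent flushes later would still need another poll.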
Regarding the concern about why this was only observed when using CodeBuild Compute Fleet (not on-demand mode), I think it might be related to how and when the CloudWatch agent pushes logs from the instance to the CloudWatch service. For example, the CloudWatch agent has configuration options such as force_flush_interval. With CodeBuild on-demand compute, the EC2 instance is terminated right after the build completes, and the CloudWatch agent pushes everything in memory to the CloudWatch service without waiting, as part of the termination/shutdown process. With CodeBuild compute fleet mode, however, you get reserved EC2 instance capacity that is not terminated after the build, so the CloudWatch agent honors that configuration when deciding when to push logs to the CloudWatch service next. It seems to be a timing issue in certain scenarios.
Thank you @shuohaoliu.
For context, another oddity we've noticed is that all logs in CloudWatch, when using CodeBuild Compute Fleet, have the same timestamp attached: the timestamp of the very first log message, no matter how much sleep we add in bash ;)
We've already raised this with AWS Premium Support, and it was escalated to the CodeBuild team.
Issue should have been fixed
Hi,
We're encountering an issue with the aws-codebuild-run-build action when using CodeBuild Compute Fleets: logs are missing whenever generating output takes more than 60s. I've checked the CloudWatch GetLogEvents metrics and found no errors.
We run this with the default updateInterval of 30 seconds.
First, I thought we were encountering the condition described in this section of the code:
However, that would not explain why everything works as expected on On Demand builders while the issue occurs only when the build runs on a CodeBuild Compute Fleet.
The minimal buildspec to reproduce the issue:
Example:
When running CodeBuild On Demand started by this GitHub Action, GHA outputs:
When running CodeBuild Compute Fleets started by this GitHub Action, CB & GHA output:
in progress:
finished:
Is there anything that can be done to try to pull all missing logs again once the "CODEBUILD COMPLETE" signal is received?
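One possible shape for such a final pull is sketched below. This is only an illustration of the idea, not the action's actual behavior: `fetch_page`, `extra_polls`, `delay_seconds`, and the simulated stream are all hypothetical. The sketch resumes from the last token seen during the build and polls a few more times with a pause in between, to give late-flushed events a chance to arrive.

```python
import time
from typing import Callable, Dict, List, Tuple


def drain_after_complete(
    fetch_page: Callable[[str], Dict],
    token: str,
    extra_polls: int = 3,
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[List[Dict], str]:
    """After the build completes, poll a few more times, resuming from `token`."""
    late_events: List[Dict] = []
    for attempt in range(extra_polls):
        # Follow tokens until the stream stops advancing for this poll.
        while True:
            page = fetch_page(token)
            late_events.extend(page["events"])
            next_token = page["nextForwardToken"]
            if next_token == token:
                break
            token = next_token
        if attempt < extra_polls - 1:
            sleep(delay_seconds)  # wait for the agent to flush more events
    return late_events, token


class FakeStream:
    """Simulated log stream (made-up): one more event becomes visible per sleep."""

    def __init__(self, messages: List[str], visible: int) -> None:
        self.messages = messages
        self.visible = visible

    def fetch_page(self, token: str) -> Dict:
        start = int(token)
        events = [{"message": m} for m in self.messages[start:self.visible]]
        return {"events": events, "nextForwardToken": str(start + len(events))}

    def sleep(self, _seconds: float) -> None:
        self.visible = min(self.visible + 1, len(self.messages))


if __name__ == "__main__":
    # The build finished after only "line 1" had been read (token "1").
    stream = FakeStream(["line 1", "line 2", "line 3"], visible=1)
    events, last_token = drain_after_complete(
        stream.fetch_page, "1", sleep=stream.sleep
    )
    print([e["message"] for e in events], last_token)
```

The number of extra polls and the delay trade completeness against how long the workflow step stays open after the build ends; reasonable values would depend on how late the agent actually flushes.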