Closed: alexlamsl closed this issue 1 year ago
thanks @alexlamsl for creating a separate issue. Am I right that it is enough to simply fork https://github.com/mishoo/UglifyJS/ and run the following workflow to reproduce the problem? https://github.com/mishoo/UglifyJS/blob/master/.github/workflows/ufuzz.yml
Yes, forking the repository and letting the aforementioned workflow run should reproduce the (intermittent) issue.
You may want to edit out the Linux & Windows jobs to lighten the load since they don't exhibit the same issues.
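As a hypothetical sketch (the real ufuzz.yml may be structured differently; job names and the fuzz command below are assumptions, not copied from the repository), trimming the workflow down to macOS only could look like:

```yaml
# Hypothetical, trimmed-down ufuzz workflow: Linux & Windows jobs removed
# so only the macOS job (the one exhibiting the failure) runs.
name: Fuzzing
on: [push]
jobs:
  macos:
    runs-on: macos-latest
    steps:
      - uses: actions/checkout@v2
      - run: node test/ufuzz/job.js   # placeholder for the actual fuzz command
  # linux:   (removed to lighten the load)
  # windows: (removed to lighten the load)
```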
Please be advised that the job may sometimes fail due to the fuzzer hitting a false positive, but those failures are distinctly different from this issue due to the presence of logs.
Another bunch of recent samples: https://github.com/mishoo/UglifyJS/runs/2763713274?check_suite_focus=true https://github.com/mishoo/UglifyJS/runs/2760713866?check_suite_focus=true https://github.com/mishoo/UglifyJS/runs/2760250084?check_suite_focus=true https://github.com/mishoo/UglifyJS/runs/2757930927?check_suite_focus=true https://github.com/mishoo/UglifyJS/runs/2757074905?check_suite_focus=true
@alexlamsl thank you for the provided examples; we are investigating the issue on our side to determine the exact reason for this behavior. So far I have found only one thing: the tests consume a lot of CPU resources on macOS machines, which possibly leads the pipeline to fail. We need more time to find the root cause for this particular situation. I'll keep you informed.
Not sure if related, but just now I've hit an instance of this but on Windows: https://github.com/mishoo/UglifyJS/runs/2821517386?check_suite_focus=true
@alexlamsl thanks for the update! Windows is a pretty different story, so it's not related. Speaking about mac — we've narrowed down the list of the environments with issues, but unfortunately, we are still searching for the root cause.
Not sure if this helps, but this failed job contains some information under View raw logs: https://github.com/mishoo/UglifyJS/runs/3003410939?check_suite_focus=true
And at a glance it seems to have been "cancelled".
Hi,
I have the same issue with macOS 11.
@FireFighter80 do you have access to the macOS-11 pipeline? Just to make sure it's not an access issue.
@miketimofeev Thx. You're right. That was the issue.
@miketimofeev has this issue been resolved?
I am still getting a steady stream of these job failures; in the past week they have occurred on a daily basis.
@alexlamsl we haven't heard of any new cases so far, which is why we decided to close. Could you provide some new examples of such builds so I can escalate the issue?
Ones that are immediately relevant: https://github.com/mishoo/UglifyJS/actions/runs/2306997233 https://github.com/mishoo/UglifyJS/actions/runs/2296601772 https://github.com/mishoo/UglifyJS/actions/runs/2284012926
Others that fail unexpectedly, not sure if related: https://github.com/mishoo/UglifyJS/actions/runs/2281443778 https://github.com/mishoo/UglifyJS/actions/runs/2327083565 https://github.com/mishoo/UglifyJS/actions/runs/2300065092
P.S. for the past few days I would encounter Angry Unicorn ~5% of the time when loading any Actions-related URLs
I can replicate this on my fork as well: https://github.com/alexlamsl/UglifyJS/actions/runs/2310731271 https://github.com/alexlamsl/UglifyJS/actions/runs/2310324508 https://github.com/alexlamsl/UglifyJS/actions/runs/2309311098 https://github.com/alexlamsl/UglifyJS/actions/runs/2307865027 https://github.com/alexlamsl/UglifyJS/actions/runs/2264485393
Others: https://github.com/alexlamsl/UglifyJS/actions/runs/2277862274 https://github.com/alexlamsl/UglifyJS/actions/runs/2263136893
(ran into 🦄🦄🦄 while looking for these)
@alexlamsl thanks, will engage the engineering team
We're seeing the same issue very frequently with our Windows builds. There are no detailed errors to indicate what went wrong: https://github.com/Azure/bicep/actions/runs/2633252423
@miketimofeev - any update on this?
Just to chime in here, this is plaguing one of my repos as well, but on ubuntu-latest. Sadly it's not public, so I can't share any links, but the affected workflow always fails around the 30-minute mark with either the error:
An error occurred while provisioning resources (Error Type: Disconnect).
Received request to deprovision: The request was cancelled by the remote provider.
or
The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.
Process completed with exit code 143.
The workflow is just a simple action running npm quicktype. Occasionally I will get some logs (as opposed to the typical no logs on the failing step that ran for 30 minutes), but they only ever contain Killed\n. This has been happening for the past 4 months.
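For context (this explanation is my addition, not stated in the thread): exit code 143 is 128 + 15, the conventional shell encoding for a process terminated by SIGTERM, which matches the "shutdown signal" message above. A quick check in any POSIX shell:

```shell
#!/bin/sh
# A child shell that sends SIGTERM (signal 15) to itself exits with
# status 143 (128 + 15) - the same code reported when a runner is
# shut down mid-job.
sh -c 'kill -TERM $$'
echo "exit status: $?"   # prints: exit status: 143
```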
@niehusstaab even links to private repos will help as we don't need access to your repo to get the telemetry for the run and see if there was high CPU usage or something like that. Most likely this is the root cause.
To circle back - by chance I discovered that one of our tests was eating up a LOT of system memory, and this issue stopped occurring once I fixed it. It would be super helpful if this information could have been communicated somehow through the workflow logs, and would have saved a lot of time spent debugging.
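For future cases like this, one debugging sketch (an assumption on my part, not an official runner feature; step names are made up) is to sample memory usage in a background loop so a trace survives in the step log up to the moment the runner is killed:

```yaml
# Hypothetical workflow fragment for diagnosing silent OOM kills.
- name: Start memory monitor
  run: |
    # Print a timestamped memory snapshot every 30 s in the background;
    # the output lands in this step's log as the job runs.
    while true; do date; free -m; sleep 30; done &
- name: Run tests
  run: npm test
```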
If links to repos are still useful, here's a public action that failed with this specific error, running on ubuntu-latest: https://github.com/SuffolkLITLab/ALActions/actions/runs/3913807100
It's a really lightweight action that only runs ~20 lines of Beautiful Soup Python against a small webpage and normally finishes in under 30 seconds, so I'm fairly confident it isn't eating up memory or using a lot of CPU. The latest failing job took 26 minutes, but there are no logs to show what it was doing in that time.
What is the status here? We have encountered similar problems; for more information, see #7004.
+1 (ubuntu). One thing that would likely be useful is a way to retrieve missing logs (see @anthony-c-martin's comment above).
Since virtually all of the cases mentioned here are related to resource consumption beyond what the runners can provide, I am forced to close this request.
About logs: it is not currently possible to publish provisioner logs, for a number of reasons including security.
For the curious and new arrivals: I recommend the discussion at https://github.com/actions/runner-images/discussions/7188, which has a lot of information from users who encounter the same problem for various reasons.
Description
Jobs on macOS would intermittently fail without logs, with the error message in the title only appearing some of the time.
Here is a list of failed jobs over the past three days: https://github.com/mishoo/UglifyJS/runs/2730669944?check_suite_focus=true https://github.com/mishoo/UglifyJS/runs/2719886305?check_suite_focus=true https://github.com/mishoo/UglifyJS/runs/2718446501?check_suite_focus=true https://github.com/mishoo/UglifyJS/runs/2716605609?check_suite_focus=true https://github.com/mishoo/UglifyJS/runs/2712627146?check_suite_focus=true https://github.com/mishoo/UglifyJS/runs/2711652206?check_suite_focus=true
Not sure if related, but at a lower frequency I also encountered jobs being reported as cancelled: https://github.com/mishoo/UglifyJS/runs/2706621544?check_suite_focus=true
Area for Triage: Deployment/Release
Question, Bug, or Feature?: Bug
Virtual environments affected
Image version:
Current runner version: '2.278.0'
Expected behavior
Jobs complete with viewable logs.
Actual behavior
Missing logs, even with View raw logs.
Repro steps
Fork the repository and let the .github/workflows/ufuzz.yml workflow run (see the comments above); the failure is intermittent.