korrem closed this issue 3 weeks ago
Hi @korrem, thank you for bringing this issue to us. We are looking into it and will update you after investigating.
Hi @korrem - I am unable to open the URL you provided, as it returns a 404 error. However, from your description, the problem with the Post Run actions/checkout@v4 step, which randomly takes a long time or fails due to runner disconnection, could be related to several factors such as runner resource limitations, network instability, or GitHub service issues.
Here are some recommendations to help mitigate the problem:
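A. You can add a retry mechanism to the actions/checkout@v4 step to handle random failures. GitHub Actions supports continue-on-error, which keeps the job from failing completely when a step fails. There is no built-in step-level retry key, so a retry has to be built from a second, conditional step or a wrapper action.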
- name: Checkout Code
  uses: actions/checkout@v4
  with:
    fetch-depth: 0
  continue-on-error: true
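Because GitHub Actions has no step-level retry key, one way to approximate the retry described in A is to run the checkout a second time only when the first attempt failed. The fragment below is a sketch under assumptions, not something taken from the reporter's workflow: the step names and the `checkout_first` id are made up for illustration. It retries the main checkout step only; it does not change how the automatically generated Post Run step behaves.

```yaml
steps:
  # First attempt; continue-on-error lets the job carry on if it fails.
  - name: Checkout code (first attempt)
    id: checkout_first   # hypothetical id, referenced by the condition below
    uses: actions/checkout@v4
    continue-on-error: true

  # Runs only when the first attempt actually failed.
  # `outcome` is the step result before continue-on-error is applied.
  - name: Checkout code (retry)
    if: steps.checkout_first.outcome == 'failure'
    uses: actions/checkout@v4
```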
B. You can also check whether the job or step timeout (timeout-minutes) is set too aggressively. Increasing it might prevent early termination.
timeout-minutes: 30 # Example to increase timeout if needed
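The timeout-minutes key can be placed on a whole job or on a single step. A minimal sketch of both placements follows; the job name `e2e`, the runner label, and the test command are placeholders, not values from the reporter's workflow.

```yaml
jobs:
  e2e:                      # hypothetical job name
    runs-on: ubuntu-latest  # assumed runner label
    timeout-minutes: 30     # limit for the whole job
    steps:
      - uses: actions/checkout@v4
      - name: Run tests
        timeout-minutes: 20 # limit for this step only
        run: npm test       # placeholder command
```

The job-level value is the one that bounds the job as a whole, so it also covers the post steps that run at the end.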
C. In addition, if the issue persists and is critical, consider using a self-hosted runner, which gives you more control over resource allocation and network stability. This might avoid the disconnection errors.
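For option C, a job is routed to a self-hosted runner through its runs-on labels. The fragment below is a sketch that assumes a Linux x64 machine has already been registered as a runner for the repository; the job name is hypothetical.

```yaml
jobs:
  e2e:                                  # hypothetical job name
    runs-on: [self-hosted, linux, x64]  # default labels of a registered Linux x64 runner
    steps:
      - uses: actions/checkout@v4
```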
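D. Git shallow clone: to reduce the time spent in the checkout step, make sure you are not fetching unnecessary history. Note that fetch-depth: 1 is already the default for actions/checkout, so this only changes anything if the workflow currently sets a larger value or 0.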
- uses: actions/checkout@v4
  with:
    fetch-depth: 1 # Fetch only the latest commit
E. Also, since the error mentions loss of communication with the server, add network-related logging or monitoring to see whether spikes in latency or dropped connections are affecting the workflow.
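One low-effort way to get the network logging mentioned in E is a final step that always runs and records connection timings for a single request. This is a sketch that assumes a Linux runner with curl available; the step name and the choice of https://github.com as the probe target are illustrative only.

```yaml
- name: Log network diagnostics
  if: always()   # run even when earlier steps failed
  run: |
    date -u
    # Print HTTP status plus DNS, connect, and total times for one request.
    curl -sS -o /dev/null --max-time 30 \
      -w 'status=%{http_code} dns=%{time_namelookup}s connect=%{time_connect}s total=%{time_total}s\n' \
      https://github.com || echo "probe failed"
```

Placing the same step at several points in the job would show whether connectivity degrades over the course of the run.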
Hopefully, these changes will improve the stability of the actions/checkout@v4 step.
Prabhatkumar59, thanks for your message. I'll try options A and D first, and the rest if they don't help. I will let you know whether it helped.
Hi @korrem - Sure, let me know. Hopefully the changes I suggested will improve the stability.
Hi @korrem - Since we haven't heard back, we'll assume your issue is resolved and will close this issue for now. Feel free to reach out to us for any other queries. Thanks.
Hi Prabhatkumar59, apologies for the long wait for the results. Unfortunately, none of your advice helped.
Increasing timeout-minutes only increased the running time of the entire workflow. The post-run checkout action usually takes at most 2 seconds, so a longer timeout didn't help; it only prolonged the agony.
Setting fetch-depth to 1 also didn't help.
The strange thing is that all steps passed except the last one (Post Run actions/checkout@v4), but the logs look as if the hosted runner never started the job at all. Logs and a workflow screenshot were attached.
I would be grateful for any other help.
Description
Hi,
For at least two months we have noticed a problem that occurs randomly in our nightly runs. Sometimes the last step, Post Run actions/checkout@v4, can take a very long time, up to 15 minutes, after which the workflow is either skipped or fails.
For skipped runs we get the error message:
Hosted runner encountered an error while running your job. (Error type: Disconnect).
An example can be found here: https://github.com/IMGARENA/multisport-fastpath-scoring-app/actions/runs/10765405236
For failed runs we get the error message:
Hosted runner: GitHub Actions 94 has lost communication with the server. Anything in the workflow that terminates the runner's process, deprives it of CPU/memory or blocks network access can cause this error.
Here you can see an example: https://github.com/IMGARENA/multisport-fastpath-scoring-app/actions/runs/10712726268
We have added a step in which we monitor CPU and RAM consumption. However, so far the highest CPU consumption has been 10%, and around 6 GB of RAM is still available after the tests have completed.
Here you can see our workflow file: https://github.com/IMGARENA/multisport-fastpath-scoring-app/blob/develop/.github/workflows/run-e2e-tests.yml
and the workflow for the nightly run: https://github.com/IMGARENA/multisport-fastpath-scoring-app/blob/develop/.github/workflows/nightly-e2e-tests-without-comparator.yml
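For readers who want to reproduce the CPU and RAM check, a monitoring step along these lines would be enough. It is a sketch only, not the reporter's actual step (which is not shown in this thread), and it assumes a Linux runner where nproc, uptime, and free are available.

```yaml
- name: Report CPU and memory usage
  if: always()   # run even when earlier steps failed
  run: |
    nproc     # number of available CPU cores
    uptime    # load averages over 1, 5, and 15 minutes
    free -m   # total, used, and available memory in MiB
```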
Could you be so kind and help us to resolve this issue?
Platforms affected
Runner images affected
Image version and build link
Version: 20240908.1.0
Is it regression?
https://github.com/IMGARENA/multisport-fastpath-scoring-app/actions/runs/10821818651
Expected behavior
Post Run actions/checkout@v4 step shouldn't take so much time and should finish successfully
Actual behavior
The Post Run actions/checkout@v4 step at the end of the workflow sometimes takes as long as 15 minutes, and then the whole workflow fails or is skipped.
Repro steps