Seen in #451. I had a runner image with runner version 2.310.2, and it tried to update itself to 2.311.0. The update somehow failed, which resulted in:
The self-hosted runner: CloudSnorkel-cdk-github-runners-1f43e9e0-740f-11ee-8cdb-8912db59 lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
Runner logs showed:
2023-10-27T12:41:32.708-04:00 Current runner version: '2.310.2'
2023-10-27T12:41:32.709-04:00 2023-10-27 16:41:32Z: Listening for Jobs
2023-10-27T12:41:44.940-04:00 Runner update in progress, do not shutdown runner.
2023-10-27T12:41:45.055-04:00 Downloading 2.311.0 runner
2023-10-27T12:42:32.261-04:00 Waiting for current job finish running.
2023-10-27T12:42:32.304-04:00 Generate and execute update script.
2023-10-27T12:42:32.368-04:00 Runner will exit shortly for update, should be back online within 10 seconds.
2023-10-27T12:42:32.379-04:00 Runner update process finished.
2023-10-27T12:42:33.136-04:00 Runner listener exit because of updating, re-launch runner after successful update
2023-10-27T12:43:04.027-04:00 Restarting runner...
2023-10-27T12:43:04.053-04:00 /home/runner/run-helper.sh: line 36: /home/runner/bin/Runner.Listener: No such file or directory
This consistently happened on Fargate x64, Fargate arm64, and Fargate arm64 spot, but not on Fargate x64 spot. Fargate x64 and Fargate x64 spot use the same runner image.
While updating, the runner program moves bin to bin.OLD_VERSION and then symlinks bin to bin.NEW_VERSION. It does this for both the bin and externals folders. In this case, the bin symlink was missing.
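To make the expected end state concrete, here is a minimal Python sketch of the swap described above, plus a check that reports whether the bin and externals links are in place. It is not the runner's actual update code. The helper names (`swap_version`, `check_links`, `demo`), the relative link target, and the temporary directory layout are assumptions for illustration, and it assumes a POSIX filesystem.

```python
import os
import tempfile


def swap_version(root, name, old_version, new_version):
    """Mimic the update step: move `name` aside, then link `name` to the new version."""
    current = os.path.join(root, name)
    os.rename(current, f"{current}.{old_version}")
    # Relative link target is an assumption; the real runner creates a "junction".
    os.symlink(f"{name}.{new_version}", current, target_is_directory=True)


def check_links(root, names=("bin", "externals")):
    """Classify each expected link as ok, missing, dangling, or not a symlink."""
    status = {}
    for name in names:
        path = os.path.join(root, name)
        if not os.path.lexists(path):
            status[name] = "missing"
        elif not os.path.islink(path):
            status[name] = "not a symlink"
        elif not os.path.exists(path):
            status[name] = "dangling"
        else:
            status[name] = "ok"
    return status


def demo():
    """Run the swap in a scratch directory, then reproduce the missing bin link."""
    with tempfile.TemporaryDirectory() as root:
        for name in ("bin", "externals"):
            os.mkdir(os.path.join(root, name))
            os.mkdir(os.path.join(root, f"{name}.2.311.0"))
            swap_version(root, name, "2.310.2", "2.311.0")
        healthy = check_links(root)
        # Remove the bin link to imitate the state seen on Fargate.
        os.unlink(os.path.join(root, "bin"))
        broken = check_links(root)
    return healthy, broken


if __name__ == "__main__":
    healthy, broken = demo()
    print("after swap:", healthy)
    print("bin link removed:", broken)
```

Pointing `check_links` at the runner home directory (/home/runner in the logs above) right before the restart could help show whether the link was never created or was removed afterwards.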
_diag logs showed no error creating the symlink (called a junction because it's .NET):
fargate-failed-update-no-bin.log docker-good-update.log