actions / runner

The Runner for GitHub Actions :rocket:
https://github.com/features/actions
MIT License

`Load failed: 5: Input/output error` on ./svc.sh run | Mac Big Sur Issues? #1056

Open callum-tait-pbx opened 3 years ago

callum-tait-pbx commented 3 years ago

Describe the bug

I just deployed 2 new self-hosted runners on Mac Minis. I get the error Load failed: 5: Input/output error when I install and then start the runner as a service. Interestingly, the service appears to start and the runner appears online on github.com.

Silicon: Intel
OS Version: Big Sur 11.2.3

To Reproduce

  1. Install the latest self-hosted runner (v2.278.0 at the time of writing) and set up the runner: ./config.sh --url $URL --token $TOKEN
  2. Install the runner as a service: ./svc.sh install
  3. Run the service: ./svc.sh run

Expected behavior

No error is posted.

Runner Version and Platform

Silicon: Intel
OS Version: Big Sur 11.2.3
Runner Version: v2.278.0

What's not working?

The runner appears to register with GitHub and take work. I'm currently testing to see whether there are any functional issues and will update.

Job Log Output

I've replaced our actual org and hostname values with $org and $hostname

starting actions.runner.$org.$hostname
Load failed: 5: Input/output error
status actions.runner.$org.$hostname:

/Users/runner/Library/LaunchAgents/actions.runner.$org.$hostname.plist

Started:
512 0 actions.runner.$org.$hostname

Runner and Worker's Diagnostic Logs

If you google "load failed 5 input/output error" you will find lots of posts about this error for a wide array of software installed as a service, with lots of references to Big Sur. Something has changed in how Big Sur handles services: either the plist that the service install script auto-generates, or the values within that plist, are no longer valid for some reason.

Example Search Results

https://developer.apple.com/forums/thread/665661
https://www.reddit.com/r/MacOS/comments/kbko61/launchctl_broken/
https://www.reddit.com/r/MacOS/comments/kbko61/launchctl_broken/gpv2to1/

EDIT

It looks like launchctl load | unload are now legacy commands. The svc.sh script probably needs updating to set up and run the service in a Big Sur-compliant way?

https://ss64.com/osx/launchctl.html

Subcommands from the previous implementation of launchd are generally
available, though some may be unimplemented. Unimplemented subcommands
are documented as such.

load | unload [-wF] [-S sessiontype] [-D searchpath] paths ...
        Load the specified configuration files or directories of
        configuration files.  Jobs that are not on-demand will be started
        as soon as possible. All specified jobs will be loaded before any
        of them are allowed to start. Note that per-user configuration
        files (LaunchAgents) must be owned by root (if they are located
        in /Library/LaunchAgents) or the user loading them (if they are
        located in $HOME/Library/LaunchAgents).  All system-wide daemons
        (LaunchDaemons) must be owned by root. Configuration files must
        disallow group and world writes. These restrictions are in place
        for security reasons, as allowing writability to a launchd
        configuration file allows one to specify which executable will be
        launched.
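
The ownership and permission rules quoted above can be checked before a plist is loaded. The following is a minimal sketch, assuming a POSIX shell; the function name plist_is_loadable and its arguments are hypothetical and not part of svc.sh.

```shell
#!/bin/sh
# Hypothetical helper (not part of svc.sh): check a launchd plist against the
# rules quoted from the man page: owned by the expected user, and not
# writable by group or others.
plist_is_loadable() {
    plist="$1"          # path to the .plist file
    expected_uid="$2"   # 0 for /Library/Launch*, otherwise the loading user's uid

    [ -f "$plist" ] || { echo "missing: $plist" >&2; return 1; }

    # The third column of `ls -ln` is the numeric owner id.
    owner_uid=$(ls -ln "$plist" | awk '{print $3}')
    if [ "$owner_uid" != "$expected_uid" ]; then
        echo "wrong owner: uid $owner_uid, expected $expected_uid" >&2
        return 1
    fi

    # find prints the path only if the group-write or world-write bit is set.
    if [ -n "$(find "$plist" -prune \( -perm -020 -o -perm -002 \))" ]; then
        echo "group- or world-writable: $plist" >&2
        return 1
    fi
    return 0
}
```

For a per-user LaunchAgent this could be called as plist_is_loadable "$HOME/Library/LaunchAgents/<label>.plist" "$(id -u)". Passing these checks does not by itself explain or rule out the Input/output error; it only excludes one documented cause of a refused load.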

TingluoHuang commented 3 years ago

🤦 We need to test the runner on Big Sur and verify the behavior. @hross FYI, since we can't upgrade our MacBook to Big Sur yet... 😢

liya2017 commented 3 years ago

@callum-tait-pbx Hi, have you solved the problem? I hit the same issue on an M1 Mac.

callum-tait-pbx commented 3 years ago

I haven't got around to trying to solve the issue personally. Functionally the runners seem to work, but it's definitely an issue that needs solving. Based on my research at the time, I think it relates to the deprecation of the load command. If I get a chance next week I'll experiment with the svc.sh script.

brandonbirdj commented 3 years ago

I had a similar-sounding issue that does appear to be related to the launchctl update issues. From what I understand from https://babodee.wordpress.com/2016/04/09/launchctl-2-0-syntax/, load has been replaced with bootstrap. However, that alone was not my issue.

The issue I found is that load and bootstrap both require the user to be in a GUI session to configure them. I was originally configuring only over ssh. Running ./svc.sh start over VNC in a terminal session worked for me.

I would like to see the setup script modified to use the newer launchctl subcommands and to run the service in the user/ domain instead of gui/, so that it can be configured solely over ssh.
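
The newer syntax mentioned above takes a domain target such as gui/<uid> or user/<uid> in front of the plist path. The sketch below shows how the two forms differ, assuming a POSIX shell; the helper names (domain_target, bootstrap_agent, bootout_agent) are illustrative and not taken from svc.sh. Whether bootstrapping into user/ actually works from an ssh-only session is the open question in this thread and is not verified here.

```shell
#!/bin/sh
# Build a launchd domain target such as "gui/501" or "user/501".
# All helper names here are illustrative, not taken from svc.sh.
domain_target() {
    kind="$1"                  # "gui" or "user"
    uid="${2:-$(id -u)}"       # defaults to the current user's uid
    case "$kind" in
        gui|user) printf '%s/%s\n' "$kind" "$uid" ;;
        *) echo "unsupported domain: $kind" >&2; return 1 ;;
    esac
}

# Newer equivalent of `launchctl load <plist>`.
# Arguments: domain kind, plist path, optional uid.
bootstrap_agent() {
    target=$(domain_target "$1" "$3") || return 1
    launchctl bootstrap "$target" "$2"
}

# Newer equivalent of `launchctl unload <plist>`.
bootout_agent() {
    target=$(domain_target "$1" "$3") || return 1
    launchctl bootout "$target" "$2"
}
```

For example, bootstrap_agent gui "$HOME/Library/LaunchAgents/<label>.plist" targets the logged-in GUI session of the current user, while passing user instead of gui targets the per-user domain.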

jveldboom commented 2 years ago

As a workaround, our team was able to run the following commands unattended on an AWS Mac EC2 instance. We put them in the instance's user-data. The runsvc.sh script starts the agent in the foreground, but adding the & moves the process to the background. Not ideal, but for now it seems to work for us.

sudo su -- ec2-user ./svc.sh install
sudo su -- ec2-user ./runsvc.sh start &

brandonbirdj commented 2 years ago

I've opened https://github.com/actions/runner/pull/1102 with a long term solution. Perhaps if we could get some more 👍 they will consider merging it.

NorseGaud commented 2 years ago

Hit this too.... macOS 11.6, EC2 Mac

seemethere commented 2 years ago

Can confirm that we are also hitting this on our runners for pytorch.

woshahua commented 2 years ago

hit same problem, Mac M1 on Big Sur

jordanb-afs commented 2 years ago

i've been using this to fix it:

sed -i -Ee 's;(launchctl) load -w;\1 bootstrap gui/$user_id;' -e 's;(launchctl) unload;\1 bootout gui/$user_id;' -e 's;(launchctl) list;\1 print gui/$user_id;' svc.sh

da1rren commented 2 years ago

@jordanb-afs That seems to be broken

jordanb-afs commented 2 years ago

Hmm, it's still working on my latest builds. I run ./config.sh first, then the sed command on top. Make sure $user_id isn't getting replaced by your shell; you could also just put the actual user id there. You'll also need to set up the runners to auto-login to the GUI, otherwise launchctl may not be able to bootstrap into gui/$user_id (which is usually gui/501 in my case).

aronchick commented 2 years ago

seems to be broken for me too -

m1@848070ff-3717-456a-a062-daf9e05bb634 actions-runner % sed -i -Ee 's;(launchctl) load -w;\1 bootstrap gui/501;' -e 's;(launchctl) unload;\1 bootout gui/501;' -e 's;(launchctl) list;\1 print gui/501;' svc.sh
sed: 1: "s;(launchctl) load -w;\ ...": \1 not defined in the RE
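
A likely cause of this error: on macOS (BSD sed), -i requires a backup-suffix argument, so in "-i -Ee" the "-Ee" is consumed as the suffix, -E is never enabled, the parentheses are matched literally, and \1 has no group to refer to. GNU sed treats the suffix as optional, which would explain why the same command works on some machines. The sketch below applies the same three substitutions in a form that both implementations accept; the function name patch_launchctl_calls is hypothetical, and the numeric uid is passed in explicitly rather than relying on $user_id.

```shell
#!/bin/sh
# Sketch: rewrite the legacy launchctl subcommands in svc.sh.
# `-i.bak` (suffix attached) is accepted by both BSD sed (macOS) and GNU sed,
# and `-E` is passed as its own option so the capture group is honoured.
patch_launchctl_calls() {
    script="$1"    # file to patch in place (a .bak copy is kept next to it)
    uid="$2"       # numeric user id for the gui/ domain, e.g. 501

    sed -E -i.bak \
        -e "s;(launchctl) load -w;\\1 bootstrap gui/${uid};" \
        -e "s;(launchctl) unload;\\1 bootout gui/${uid};" \
        -e "s;(launchctl) list;\\1 print gui/${uid};" \
        "$script"
}
```

Usage would be patch_launchctl_calls svc.sh "$(id -u)", run after ./config.sh as described above. The original file is kept as svc.sh.bak so the change can be reverted.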

c3-kaspesi commented 2 years ago

This is also an issue with Monterey on an EC2 Mac metal instance.

Command: ./svc.sh start

Result:

starting actions.runner.X
Load failed: 5: Input/output error
Try running `launchctl bootstrap` as root for richer errors.
status actions.runner.X:

/Users/ec2-user/Library/LaunchAgents/actions.runner.X.plist

Stopped

ec2-user@X actions-runner % launchctl list | grep action
(Didn't start)
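
The `launchctl list | grep action` check above can be made a little more explicit. `launchctl list` prints three columns per job: the PID (or "-" when the job is loaded but not running), the last exit status, and the label. A small sketch that reads that output on stdin follows; the function name runner_status is hypothetical.

```shell
#!/bin/sh
# Read `launchctl list` output on stdin and report runner jobs.
# Columns are: PID (or "-" when not running), last exit status, label.
runner_status() {
    awk '$3 ~ /^actions\.runner\./ {
        found = 1
        if ($1 == "-")
            printf "%s: loaded but not running (last exit status %s)\n", $3, $2
        else
            printf "%s: running with pid %s\n", $3, $1
    }
    END { if (!found) print "no actions.runner job is loaded" }'
}
```

It would be used as launchctl list | runner_status. An empty grep result, as in the output above, corresponds to the "no actions.runner job is loaded" case, meaning the load itself failed rather than the job starting and then exiting.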

c3-kaspesi commented 2 years ago

Regarding the sed workaround above (switching svc.sh to launchctl bootstrap/bootout/print with gui/$user_id): we're running headlessly, so we won't have access to the GUI. Has anything else worked?

c3-kaspesi commented 2 years ago

@jordanb-afs Do you have any additional information about making a headless Mac auto-login to the GUI?

sdarwin commented 1 year ago

It would be great if there were a purely ssh solution that didn't depend on the GUI at all. In the meantime, this can be helpful: go to System Preferences -> Users and Groups -> Login Options -> Automatic Login, and set the 'runner' or 'gha' user to log in automatically. Reboot. The user will log in at boot time and the services will start.

c3-kaspesi commented 1 year ago

@sdarwin Will try this, thanks. Do you know if this works for all GUI issues?

sebsto commented 1 year ago

A workaround that worked for me is to install the plist file in /Library/LaunchDaemons and start the service with:

sudo /bin/launchctl load /Library/LaunchDaemons/actions.runner.sebsto-xcodeinstall.plist

My plist is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
  <dict>
    <key>Label</key>
    <string>actions.runner.sebsto-xcodeinstall.ip-172-31-67-99</string>
    <key>ProgramArguments</key>
    <array>
      <string>/Users/ec2-user/actions-runner/runsvc.sh</string>
    </array>
    <key>UserName</key>
    <string>ec2-user</string>
    <key>GroupName</key>
    <string>staff</string>  
    <key>WorkingDirectory</key>
    <string>/Users/ec2-user/actions-runner</string>
    <key>RunAtLoad</key>
    <true/>    
    <key>StandardOutPath</key>
    <string>/Users/ec2-user/Library/Logs/actions.runner.sebsto-xcodeinstall.ip-172-31-67-99/stdout.log</string>
    <key>StandardErrorPath</key>
    <string>/Users/ec2-user/Library/Logs/actions.runner.sebsto-xcodeinstall.ip-172-31-67-99/stderr.log</string>
    <key>EnvironmentVariables</key>
    <dict> 
      <key>ACTIONS_RUNNER_SVC</key>
      <string>1</string>
    </dict>
<!--
    <key>ProcessType</key>
    <string>Interactive</string>
-->
    <key>SessionCreate</key>
    <true/>
  </dict>
</plist>

phatblat commented 1 year ago

@sebsto has the right idea. Real services start on boot, not on login, so on macOS this means the service needs to be a LaunchDaemon and not a LaunchAgent.
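
The LaunchDaemon approach can be summarized as a short sequence of commands. This is a sketch only: install_runner_daemon, run_step and the DRY_RUN variable are hypothetical names, the chown/chmod steps follow the ownership rules from the launchctl man page quoted earlier in the thread, and the final step is the same launchctl load command used in the workaround above. By default the commands are only printed, not executed.

```shell
#!/bin/sh
# Print a command when DRY_RUN is 1 (the default); otherwise execute it.
run_step() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Sketch: install an already-prepared runner plist as a LaunchDaemon.
install_runner_daemon() {
    src="$1"                                          # plist prepared elsewhere
    dest="/Library/LaunchDaemons/$(basename "$src")"

    run_step sudo cp "$src" "$dest"
    run_step sudo chown root:wheel "$dest"   # daemons must be owned by root
    run_step sudo chmod 644 "$dest"          # no group or world write access
    run_step sudo launchctl load "$dest"     # same command as in the workaround
}
```

Calling install_runner_daemon ./actions.runner.example.plist prints the four commands for review; setting DRY_RUN=0 runs them. The plist itself still needs the UserName key (as in the example above) so that the runner does not execute as root.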

freef4ll commented 9 months ago

There are more problems on Sonoma. What worked fine on Ventura to start the runner agent on boot now has issues with unlocking keychain keys for code signing.

With LimitLoadToSessionType, the service can no longer be started at all...

muojp commented 7 months ago

I also found that we cannot properly access the Keychain while running unattended (no GUI login): code signing during Xcode builds fails on our machines even on non-Sonoma systems (Ventura 13.6) in certain configurations.

Of course we tried LaunchDaemons w/ ProcessType commented out like https://github.com/actions/runner/issues/1056#issuecomment-1237426462 but still failed.

We ended up associating the GitHub Actions self-hosted runner process with a GUI session like this:

This is a last-resort hack and not a preferred solution, because it requires a manual operation in a GUI session on every reboot, but at least it works in our environment for dealing with the Keychain.

sebsto commented 7 months ago

@muojp this is unnecessary. I have the GitHub agent + LaunchDaemons configuration up and running and serving dozens of builds per day. This runs on a totally unattended Amazon EC2 Mac instance to which I have never connected with a GUI.

Here is the launch daemon plist file I use:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
  <dict>
    <key>Label</key>
    <string>actions.runner.sebsto-xcodeinstall.xcodeinstall</string>
    <key>ProgramArguments</key>
    <array>
      <string>/Users/ec2-user/actions-runner-xcodeinstall/run.sh</string>
    </array>
    <key>UserName</key>
    <string>ec2-user</string>
    <key>WorkingDirectory</key>
    <string>/Users/ec2-user/actions-runner-xcodeinstall</string>
    <key>RunAtLoad</key>
    <true/>    
    <key>StandardOutPath</key>
    <string>/Users/ec2-user/Library/Logs/actions.runner.sebsto-xcodeinstall.xcodeinstall/stdout.log</string>
    <key>StandardErrorPath</key>
    <string>/Users/ec2-user/Library/Logs/actions.runner.sebsto-xcodeinstall.xcodeinstall/stderr.log</string>
    <key>EnvironmentVariables</key>
    <dict> 
      <key>ACTIONS_RUNNER_SVC</key>
      <string>1</string>
    </dict>
<!--
    <key>ProcessType</key>
    <string>Interactive</string>
-->
    <key>SessionCreate</key>
    <true/>
  </dict>
</plist>

And this is how I prepare the keychain:

https://github.com/sebsto/amplify-ios-getting-started/blob/main/code/ci_actions/01_keychain.sh

NorseGaud commented 7 months ago

Hey @sebsto , is ProcessType meant to be commented out? Or are you just directing attention to it and we need to remove <!--?

sebsto commented 7 months ago

@NorseGaud Sorry for the lack of accuracy here. It is an actual copy of the file on my system, where I left it commented out. I should have deleted it for clarity.

nadenf commented 4 months ago

Still getting the same: "Load failed: 5: Input/output error" error on Sonoma.

Even tried the LaunchDaemon solution and it doesn't change anything.