actions / runner-container-hooks

Runner Container Hooks for GitHub Actions
MIT License

Workflow containers to run for runners in K8s mode take minutes to start up #167

Open MarcoDalco opened 1 month ago

MarcoDalco commented 1 month ago

We are running workflows on self-hosted runners in Kubernetes mode. The workflows execute successfully, but each job spends more than 3 minutes in the Initialise containers step. We traced the slowdown to the copying of files from the runner to the pod defined in the PodTemplate. The step took just 4-5 seconds once we removed the calls await copyExternalsToRoot() and await isAuthPermissionsOK() and replaced them with a bash copy command, cp -R /home/runner/externals /home/runner/_work/externals, run before /home/runner/run.sh. We do this in a Docker image based on the runner image ghcr.io/actions/actions-runner:2.317.0, in which we only add a user with ID 1000 and make the changes needed for the copy command above. Our cluster is in AWS and uses EFS as PersistentStorageType in ReadWriteMany access mode for the work volume shared by the runner and the workflow pod.
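For anyone wanting to try the same workaround, here is a minimal sketch of the copy step as a wrapper script. It is not the exact script from our image: the function name copy_externals is hypothetical, and it copies the directory contents (cp -R src/. dst/) rather than the directory itself, an assumption made so that re-running it does not nest a second externals directory inside the first. Removing the two hook calls is a separate change and is not shown here.

```shell
#!/usr/bin/env bash
# Sketch of a wrapper entrypoint for a custom runner image (hypothetical).
set -euo pipefail

# copy_externals SRC DST
# Copy the contents of the runner's externals directory into the work volume.
# Copying "SRC/." into an existing DST keeps the call idempotent: a second run
# overwrites files in place instead of creating DST/externals.
copy_externals() {
  local src="$1" dst="$2"
  mkdir -p "$dst"
  cp -R "$src"/. "$dst"/
}

# In the image, the entrypoint would then do something like:
#   copy_externals /home/runner/externals /home/runner/_work/externals
#   exec /home/runner/run.sh
```

The paths in the trailing comment are the ones from the description above; the function takes them as arguments so it can be tried against a scratch directory before being baked into an image.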

We believe that the approach in "K8s mode without PV [DRAFT]" (#160) could solve this problem even better than our workaround of replacing how those files are copied. I'm opening this ticket to keep track of the issue and to signal that it is actually affecting us.

ChristopherHX commented 4 weeks ago

In my local minikube setup, with images already pulled, the startup time was roughly the number of parallel runner jobs multiplied by 1 min. A matrix with 4 parallel jobs on a single node took about 4 min per job to start, and I thought my test hardware was just too weak.

At least I know now what I could tweak locally to make it faster, thank you :)