actions / actions-runner-controller

Kubernetes controller for GitHub Actions self-hosted runners
Apache License 2.0

Container pods fail with EACCES when using a custom user #3517

Closed kwohlfahrt closed 1 month ago

kwohlfahrt commented 1 month ago

Checks

Controller Version

0.9.0

Deployment Method

Helm

To Reproduce

1. Deploy the runner controller
2. Deploy a runner scale-set, using the `kubernetes` `containerMode`.
3. Configure `spec.securityContext.fsGroup: 1001`
  a. On the runner, using the `template` property of the Helm chart
  b. On the worker, using `ACTIONS_RUNNER_CONTAINER_HOOK_TEMPLATE`
4. Launch a workflow in a custom container that specifies a user that is neither root nor 1001

Describe the bug

The worker fails, with this error: EACCES: permission denied, open '/__w/_temp/_runner_file_commands/set_env_8e7dea0f-bec9-4fd6-9b11-824b0bb16a6c'

When running `id` in the workflow pod, the container user is correctly added to group 1001. The issue seems to be that the runner pod creates the files, but they are writable only by the runner user, not by the runner group (note the mode is -rw-r--r--, not -rw-rw-r--):

$ ls -l /home/runner/_work/_temp/_runner_file_commands
total 0
-rw-r--r-- 1 runner runner 0 May 14 17:23 add_path_3da96eb5-2ed4-41a6-b402-c9f1be15a554
-rw-r--r-- 1 runner runner 0 May 14 17:23 add_path_8e7dea0f-bec9-4fd6-9b11-824b0bb16a6c
-rw-r--r-- 1 runner runner 0 May 14 17:23 save_state_3da96eb5-2ed4-41a6-b402-c9f1be15a554
-rw-r--r-- 1 runner runner 0 May 14 17:23 save_state_8e7dea0f-bec9-4fd6-9b11-824b0bb16a6c
-rw-r--r-- 1 runner runner 0 May 14 17:23 set_env_3da96eb5-2ed4-41a6-b402-c9f1be15a554
-rw-r--r-- 1 runner runner 0 May 14 17:23 set_env_8e7dea0f-bec9-4fd6-9b11-824b0bb16a6c
-rw-r--r-- 1 runner runner 0 May 14 17:23 set_output_3da96eb5-2ed4-41a6-b402-c9f1be15a554
-rw-r--r-- 1 runner runner 0 May 14 17:23 set_output_8e7dea0f-bec9-4fd6-9b11-824b0bb16a6c
-rw-r--r-- 1 runner runner 0 May 14 17:23 step_summary_3da96eb5-2ed4-41a6-b402-c9f1be15a554
-rw-r--r-- 1 runner runner 0 May 14 17:23 step_summary_8e7dea0f-bec9-4fd6-9b11-824b0bb16a6c
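To make the reasoning above concrete, here is a minimal sketch of the standard POSIX write check applied to these files. It is a simplification (it ignores root, ACLs, and capabilities). The numeric IDs are assumptions: the runner user and group are taken to be 1001, and `JOB_UID = 2000` is a hypothetical stand-in for the custom container user.

```python
import stat

def can_write(mode: int, file_uid: int, file_gid: int, uid: int, gids: set) -> bool:
    """Simplified POSIX write check: owner bits, then group bits, then other bits."""
    if uid == file_uid:
        return bool(mode & stat.S_IWUSR)
    if file_gid in gids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)

RUNNER_UID = 1001  # assumption: UID of the "runner" user
RUNNER_GID = 1001  # assumption: GID of the "runner" group (matches fsGroup)
JOB_UID = 2000     # hypothetical UID of the custom container user

# fsGroup adds 1001 to the job user's supplementary groups.
job_groups = {JOB_UID, RUNNER_GID}

# Mode seen in the listing: -rw-r--r--
print(can_write(0o644, RUNNER_UID, RUNNER_GID, JOB_UID, job_groups))  # False
# Mode that would be needed: -rw-rw-r--
print(can_write(0o664, RUNNER_UID, RUNNER_GID, JOB_UID, job_groups))  # True
```

Under these assumptions, group membership alone is not enough: the group write bit has to be set on the files as well.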

Describe the expected behavior

I expect the container to be able to run, as long as it has the correct fsGroup applied. If I instead set `securityContext.runAsUser: 1001`, it gets further, but later fails because our workflow expects the UID to match the one the image was built with.
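For context, one common way files end up as -rw-r--r-- rather than -rw-rw-r-- is the creating process's umask (022 versus 002). Whether the runner creates these files this way is an assumption, not something confirmed in this thread; the sketch below only illustrates the general POSIX behaviour on Linux, using a hypothetical file name.

```python
import os
import stat
import tempfile

def mode_of_new_file(umask: int) -> int:
    """Create an empty file under the given umask and return its permission bits."""
    old = os.umask(umask)
    try:
        with tempfile.TemporaryDirectory() as d:
            path = os.path.join(d, "set_env_example")  # hypothetical file name
            # Request rw for everyone; the kernel masks the bits with the umask.
            fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o666)
            os.close(fd)
            return stat.S_IMODE(os.stat(path).st_mode)
    finally:
        os.umask(old)  # always restore the previous umask

print(oct(mode_of_new_file(0o022)))  # 0o644, i.e. -rw-r--r--
print(oct(mode_of_new_file(0o002)))  # 0o664, i.e. -rw-rw-r--
```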

Additional Context

Runner values.yaml:

containerMode:
  kubernetesModeWorkVolumeClaim:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 14Gi
    storageClassName: exclusive
  type: kubernetes
controllerServiceAccount:
  name: actions-runner-system-d3d990a5
  namespace: actions-runner-system-be6fdde6
githubConfigSecret: actions-runner-81cb830f
githubConfigUrl: https://github.com/CHARM-Tx
maxRunners: 3
minRunners: 1
template:
  spec:
    containers:
    - command:
      - /home/runner/run.sh
      env:
      - name: ACTIONS_RUNNER_REQUIRE_JOB_CONTAINER
        value: "false"
      - name: ACTIONS_RUNNER_CONTAINER_HOOK_TEMPLATE
        value: /home/runner/templates/worker.yaml
      image: <snip>.dkr.ecr.eu-central-1.amazonaws.com/github-runner:2.315.0
      name: runner
      resources:
        limits:
          cpu: "1"
      volumeMounts:
      - mountPath: /home/runner/templates
        name: templates
    securityContext:
      fsGroup: 1001
    volumes:
    - configMap:
        name: templates-3892142c
      name: templates

templates ConfigMap:

apiVersion: v1
data:
  worker.yaml: '{"spec":{"securityContext":{"fsGroup":1001}}}'
kind: ConfigMap
metadata:
  name: templates-3892142c
  namespace: actions-runner-66769bad

This is the same error message as #3505, except the container fsGroup is set using the hook.

Controller Logs

https://gist.github.com/kwohlfahrt/1d45d62aa963e4a4eec2ca6b04c2cc19

Runner Pod Logs

https://gist.github.com/kwohlfahrt/1d45d62aa963e4a4eec2ca6b04c2cc19 (note the runner logs are from a different run; I didn't manage to capture both at the same time).

github-actions[bot] commented 1 month ago

Hello! Thank you for filing an issue.

The maintainers will triage your issue shortly.

In the meantime, please take a look at the troubleshooting guide for bug reports.

If this is a feature request, please review our contribution guidelines.

nikola-jokic commented 1 month ago

Hey @kwohlfahrt,

This issue is not related to ARC. ARC is responsible for spinning up the runner and making sure it scales with the demand. Now, if you feel like this is a runner issue (because file permissions should be set to rw on the group as well), please submit it to the runner repository. As for the UID problem that fails after you are able to build an image, I don't have a good answer for you, unfortunately :disappointed:. Please raise this issue to the runner repository.