actions / actions-runner-controller

Kubernetes controller for GitHub Actions self-hosted runners
Apache License 2.0

Cannot attach an image to just the runner #2955

Closed: k-walsh-gmg closed this issue 11 months ago

k-walsh-gmg commented 11 months ago

Checks

Controller Version

0.6.1

Helm Chart Version

gha-runner-scale-set-controller-0.6.1

CertManager Version

v1.12.0

Deployment Method

Helm

cert-manager installation

Same installation method (Helm), just an older version

Checks

Resource Definitions

## githubConfigUrl is the GitHub url for where you want to configure runners
## ex: https://github.com/myorg/myrepo or https://github.com/myorg
githubConfigUrl: "https://github.com/Gallery-Media-Group"

## githubConfigSecret is the Kubernetes secret to use when authenticating with the GitHub API.
## You can choose to use a GitHub App or a PAT token
githubConfigSecret:
  ### GitHub Apps Configuration
  ## NOTE: IDs MUST be strings, use quotes
  #github_app_id: ""
  #github_app_installation_id: ""
  #github_app_private_key: |

  ### GitHub PAT Configuration
  github_token: ""
## If you have a pre-defined Kubernetes secret in the same namespace where the gha-runner-scale-set is going to deploy,
## you can also reference it via `githubConfigSecret: pre-defined-secret`.
## You need to make sure your predefined secret has all the required secret data set properly.
##   For a pre-defined secret using GitHub PAT, the secret needs to be created like this:
##   > kubectl create secret generic pre-defined-secret --namespace=my_namespace --from-literal=github_token='ghp_your_pat'
##   For a pre-defined secret using GitHub App, the secret needs to be created like this:
##   > kubectl create secret generic pre-defined-secret --namespace=my_namespace --from-literal=github_app_id=123456 --from-literal=github_app_installation_id=654321 --from-literal=github_app_private_key='-----BEGIN CERTIFICATE-----*******'
# githubConfigSecret: pre-defined-secret

## proxy can be used to define proxy settings that will be used by the
## controller, the listener and the runner of this scale set.
#
# proxy:
#   http:
#     url: http://proxy.com:1234
#     credentialSecretRef: proxy-auth # a secret with `username` and `password` keys
#   https:
#     url: http://proxy.com:1234
#     credentialSecretRef: proxy-auth # a secret with `username` and `password` keys
#   noProxy:
#     - example.com
#     - example.org

# maxRunners is the max number of runners the autoscaling runner set will scale up to.
maxRunners: 10

# minRunners is the min number of runners the autoscaling runner set will scale down to.
minRunners: 1

# runnerGroup: "default"

## name of the runner scale set to create.  Defaults to the helm release name
# runnerScaleSetName: ""

## A self-signed CA certificate for communication with the GitHub server can be
## provided using a config map key selector. If `runnerMountPath` is set, for
## each runner pod ARC will:
## - create a `github-server-tls-cert` volume containing the certificate
##   specified in `certificateFrom`
## - mount that volume on path `runnerMountPath`/{certificate name}
## - set NODE_EXTRA_CA_CERTS environment variable to that same path
## - set RUNNER_UPDATE_CA_CERTS environment variable to "1" (as of version
##   2.303.0 this will instruct the runner to reload certificates on the host)
##
## If any of the above had already been set by the user in the runner pod
## template, ARC will observe those and not overwrite them.
## Example configuration:
#
# githubServerTLS:
#   certificateFrom:
#     configMapKeyRef:
#       name: config-map-name
#       key: ca.crt
#   runnerMountPath: /usr/local/share/ca-certificates/

## Container mode is an object that provides out-of-box configuration
## for dind and kubernetes mode. Template will be modified as documented under the
## template object.
##
## If any customization is required for dind or kubernetes mode, containerMode should remain
## empty, and configuration should be applied to the template.
containerMode:
  type: "dind"  ## type can be set to dind or kubernetes
#   ## the following is required when containerMode.type=kubernetes
#   kubernetesModeWorkVolumeClaim:
#     accessModes: ["ReadWriteOnce"]
#     # For local testing, use https://github.com/openebs/dynamic-localpv-provisioner/blob/develop/docs/quickstart.md to provide dynamic provision volume with storageClassName: openebs-hostpath
#     storageClassName: "dynamic-blob-storage"
#     resources:
#       requests:
#         storage: 1Gi

## template is the PodSpec for each listener Pod
## For reference: https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-v1/#PodSpec
# listenerTemplate:
#   spec:
#     containers:
#     # Use this section to append additional configuration to the listener container.
#     # If you change the name of the container, the configuration will not be applied to the listener,
#     # and it will be treated as a side-car container.
#     - name: listener
#       securityContext:
#         runAsUser: 1000
#     # Use this section to add the configuration of a side-car container.
#     # Comment it out or remove it if you don't need it.
#     # Spec for this container will be applied as is without any modifications.
#     - name: side-car
#       image: example-sidecar

## template is the PodSpec for each runner Pod
## For reference: https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-v1/#PodSpec
template:
  # template.spec will be modified if you change the container mode
  # with containerMode.type=dind, we will populate the template.spec with following pod spec
  template:
    spec:
      initContainers:
      - name: init-dind-externals
        image: ghcr.io/actions/actions-runner:latest
        command: ["cp", "-r", "-v", "/home/runner/externals/.", "/home/runner/tmpDir/"]
        volumeMounts:
          - name: dind-externals
            mountPath: /home/runner/tmpDir
      containers:
      - name: runner
        image: CUSTOM_IMAGE
        command: ["/home/runner/run.sh"]
        env:
          - name: DOCKER_HOST
            value: unix:///run/docker/docker.sock
        volumeMounts:
          - name: work
            mountPath: /home/runner/_work
          - name: dind-sock
            mountPath: /run/docker
            readOnly: true
      - name: dind
        image: docker:dind
        args:
          - dockerd
          - --host=unix:///run/docker/docker.sock
          - --group=$(DOCKER_GROUP_GID)
        env:
          - name: DOCKER_GROUP_GID
            value: "123"
        securityContext:
          privileged: true
        volumeMounts:
          - name: work
            mountPath: /home/runner/_work
          - name: dind-sock
            mountPath: /run/docker
          - name: dind-externals
            mountPath: /home/runner/externals
      volumes:
      - name: work
        emptyDir: {}
      - name: dind-sock
        emptyDir: {}
      - name: dind-externals
        emptyDir: {}
  ######################################################################################################
  ## with containerMode.type=kubernetes, we will populate the template.spec with following pod spec
  # template:
  #   spec:
  #     containers:
  #     - name: runner
  #       image: 303701770803.dkr.ecr.us-east-2.amazonaws.com/github:latest
  #       command: ["/home/runner/run.sh"]
  #       env:
  #         - name: ACTIONS_RUNNER_CONTAINER_HOOKS
  #           value: /home/runner/k8s/index.js
  #         - name: ACTIONS_RUNNER_POD_NAME
  #           valueFrom:
  #             fieldRef:
  #               fieldPath: metadata.name
  #         - name: ACTIONS_RUNNER_REQUIRE_JOB_CONTAINER
  #           value: "true"
  #       volumeMounts:
  #         - name: work
  #           mountPath: /home/runner/_work
  #     volumes:
  #       - name: work
  #         ephemeral:
  #           volumeClaimTemplate:
  #             spec:
  #               accessModes: [ "ReadWriteOnce" ]
  #               storageClassName: "local-path"
  #               resources:
  #                 requests:
  #                   storage: 1Gi
  # spec:
  #   initContainers:
  #     - name: init-dind-externals
  #       image: ghcr.io/actions/actions-runner:latest
  #       command: ["cp", "-r", "-v", "/home/runner/externals/.", "/home/runner/tmpDir/"]
  #       volumeMounts:
  #         - name: dind-externals
  #           mountPath: /home/runner/tmpDir
  #   containers:
  #     - name: runner
  #       image: CUSTOM_IMAGE
  #       command: ["/home/runner/run.sh"]

## Optional controller service account that needs to have required Role and RoleBinding
## to operate this gha-runner-scale-set installation.
## The helm chart will try to find the controller deployment and its service account at installation time.
## In case the helm chart can't find the right service account, you can explicitly pass in the following value
## to help it finish RoleBinding with the right service account.
## Note: if your controller is installed to only watch a single namespace, you have to pass these values explicitly.
# controllerServiceAccount:
#   namespace: arc-system
#   name: test-arc-gha-runner-scale-set-controller

To Reproduce

I attached the custom image to the runner via the template, but it is ignored and the default image is attached instead. I attached the image to the official pod spec, which causes my image to be attached to the init container, which is not needed. I also attempted to copy the template down to the pod spec, which causes errors with the controller.

Describe the bug

Please see above

Describe the expected behavior

The custom image should not be attached to the init container; it should apply only to the runner container

Whole Controller Logs

Cannot post (work)

Whole Runner Pod Logs

Cannot post (work). Also, depending on the scenario, the pods are started and deleted because I am attempting to specify paths which have already been specified elsewhere

Additional Context

No response

github-actions[bot] commented 11 months ago

Hello! Thank you for filing an issue.

The maintainers will triage your issue shortly.

In the meantime, please take a look at the troubleshooting guide for bug reports.

If this is a feature request, please review our contribution guidelines.

k-walsh-gmg commented 11 months ago

Additionally, this happened after turning off dind and also when putting the configuration in the spec outside of the template. Also, while attempting to use a custom image with the kubernetes config (turned off as well) in the spec, it still pulls the default image.

wherka-ama commented 11 months ago

@k-walsh-gmg : do you mind attaching the output from the helm template generation please?

Note: You should redact all the sensitive info before pasting it here

If you're not sure how to do it, here is the recipe.

Assuming you followed the installation guide:

helm repo add actions-runner-controller https://actions-runner-controller.github.io/actions-runner-controller
helm upgrade --install --namespace actions-runner-system --create-namespace \
             --wait actions-runner-controller actions-runner-controller/actions-runner-controller --values=<your values file>

All you need to do is replace `upgrade --install` with `template`:

helm template --namespace actions-runner-system --create-namespace \
             --wait actions-runner-controller actions-runner-controller/actions-runner-controller --values=<your values file>

We need to be sure that the generated manifests are sane first. Then we can go further with the investigations.

k-walsh-gmg commented 11 months ago

This was installed using this guide, which deploys the scale set controller as well as the runner scale set: https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners-with-actions-runner-controller/quickstart-for-actions-runner-controller . I'm not sure this is officially on Helm for me to do it the way you wanted, but I will supplement with what I have to work with :)
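
For reference, since the quickstart installs both charts from OCI registries, the equivalent template commands would look roughly like this (a sketch; the release names and namespaces are the quickstart defaults):

helm template arc \
    oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set-controller \
    --namespace arc-systems --version 0.6.1

helm template arc-runner-set \
    oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set \
    --namespace arc-runners --version 0.6.1 --values values.yaml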

---
# Source: gha-runner-scale-set-controller/templates/serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: arc-gha-rs-controller
  namespace: arc-systems
  labels:
    helm.sh/chart: gha-rs-controller-0.6.1
    app.kubernetes.io/name: gha-rs-controller
    app.kubernetes.io/namespace: arc-systems
    app.kubernetes.io/instance: arc
    app.kubernetes.io/version: "0.6.1"
    app.kubernetes.io/part-of: gha-rs-controller
    app.kubernetes.io/managed-by: Helm
---
# Source: gha-runner-scale-set-controller/templates/manager_cluster_role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: arc-gha-rs-controller
rules:
- apiGroups:
  - actions.github.com
  resources:
  - autoscalingrunnersets
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - actions.github.com
  resources:
  - autoscalingrunnersets/finalizers
  verbs:
  - patch
  - update
- apiGroups:
  - actions.github.com
  resources:
  - autoscalingrunnersets/status
  verbs:
  - get
  - patch
  - update
- apiGroups:
  - actions.github.com
  resources:
  - autoscalinglisteners
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - actions.github.com
  resources:
  - autoscalinglisteners/status
  verbs:
  - get
  - patch
  - update
- apiGroups:
  - actions.github.com
  resources:
  - autoscalinglisteners/finalizers
  verbs:
  - patch
  - update
- apiGroups:
  - actions.github.com
  resources:
  - ephemeralrunnersets
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - actions.github.com
  resources:
  - ephemeralrunnersets/status
  verbs:
  - get
  - patch
  - update
- apiGroups:
  - actions.github.com
  resources:
  - ephemeralrunnersets/finalizers
  verbs:
  - patch
  - update
- apiGroups:
  - actions.github.com
  resources:
  - ephemeralrunners
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - actions.github.com
  resources:
  - ephemeralrunners/finalizers
  verbs:
  - patch
  - update
- apiGroups:
  - actions.github.com
  resources:
  - ephemeralrunners/status
  verbs:
  - get
  - patch
  - update
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - serviceaccounts
  verbs:
  - list
  - watch
- apiGroups:
  - rbac.authorization.k8s.io
  resources:
  - rolebindings
  verbs:
  - list
  - watch
- apiGroups:
  - rbac.authorization.k8s.io
  resources:
  - roles
  verbs:
  - list
  - watch
  - patch
---
# Source: gha-runner-scale-set-controller/templates/manager_cluster_role_binding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: arc-gha-rs-controller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: arc-gha-rs-controller
subjects:
- kind: ServiceAccount
  name: arc-gha-rs-controller
  namespace: arc-systems
---
# Source: gha-runner-scale-set-controller/templates/manager_listener_role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: arc-gha-rs-controller-listener
  namespace: arc-systems
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - create
  - delete
  - get
- apiGroups:
  - ""
  resources:
  - pods/status
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - secrets
  verbs:
  - create
  - delete
  - get
  - patch
  - update
- apiGroups:
  - ""
  resources:
  - serviceaccounts
  verbs:
  - create
  - delete
  - get
  - patch
  - update
---
# Source: gha-runner-scale-set-controller/templates/manager_listener_role_binding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: arc-gha-rs-controller-listener
  namespace: arc-systems
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: arc-gha-rs-controller-listener
subjects:
- kind: ServiceAccount
  name: arc-gha-rs-controller
  namespace: arc-systems
---
# Source: gha-runner-scale-set-controller/templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: arc-gha-rs-controller
  namespace: arc-systems
  labels:
    helm.sh/chart: gha-rs-controller-0.6.1
    app.kubernetes.io/name: gha-rs-controller
    app.kubernetes.io/namespace: arc-systems
    app.kubernetes.io/instance: arc
    app.kubernetes.io/version: "0.6.1"
    app.kubernetes.io/part-of: gha-rs-controller
    app.kubernetes.io/managed-by: Helm
    actions.github.com/controller-service-account-namespace: arc-systems
    actions.github.com/controller-service-account-name: arc-gha-rs-controller
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: gha-rs-controller
      app.kubernetes.io/namespace: arc-systems
      app.kubernetes.io/instance: arc
  template:
    metadata:
      annotations:
        kubectl.kubernetes.io/default-container: "manager"
      labels:
        app.kubernetes.io/part-of: gha-rs-controller
        app.kubernetes.io/component: controller-manager
        app.kubernetes.io/version: 0.6.1
        app.kubernetes.io/name: gha-rs-controller
        app.kubernetes.io/namespace: arc-systems
        app.kubernetes.io/instance: arc
    spec:
      serviceAccountName: arc-gha-rs-controller
      containers:
      - name: manager
        image: "ghcr.io/actions/gha-runner-scale-set-controller:0.6.1"
        imagePullPolicy: IfNotPresent
        args:
        - "--auto-scaling-runner-set-only"
        - "--log-level=debug"
        - "--log-format=text"
        - "--update-strategy=immediate"
        - "--listener-metrics-addr=0"
        - "--listener-metrics-endpoint="
        - "--metrics-addr=0"
        command:
        - "/manager"
        env:
        - name: CONTROLLER_MANAGER_CONTAINER_IMAGE
          value: "ghcr.io/actions/gha-runner-scale-set-controller:0.6.1"
        - name: CONTROLLER_MANAGER_POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: CONTROLLER_MANAGER_LISTENER_IMAGE_PULL_POLICY
          value: "IfNotPresent"
        volumeMounts:
        - mountPath: /tmp
          name: tmp
      terminationGracePeriodSeconds: 10
      volumes:
      - name: tmp
        emptyDir: {}

And here is the same for my runner scale set:
---
# Source: gha-runner-scale-set/templates/no_permission_serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: arc-runner-set-gha-rs-no-permission
  namespace: arc-runners
  labels:
    helm.sh/chart: gha-rs-0.6.1
    app.kubernetes.io/name: arc-runner-set
    app.kubernetes.io/instance: arc-runner-set
    app.kubernetes.io/version: "0.6.1"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/part-of: gha-rs
    actions.github.com/scale-set-name: arc-runner-set
    actions.github.com/scale-set-namespace: arc-runners
  finalizers:
    - actions.github.com/cleanup-protection
---
# Source: gha-runner-scale-set/templates/githubsecret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: arc-runner-set-gha-rs-github-secret
  namespace: arc-runners
  labels:
    helm.sh/chart: gha-rs-0.6.1
    app.kubernetes.io/name: arc-runner-set
    app.kubernetes.io/instance: arc-runner-set
    app.kubernetes.io/version: "0.6.1"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/part-of: gha-rs
    actions.github.com/scale-set-name: arc-runner-set
    actions.github.com/scale-set-namespace: arc-runners
  finalizers:
    - actions.github.com/cleanup-protection
data:
  github_token: <redacted base64 PAT>
---
# Source: gha-runner-scale-set/templates/manager_role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: arc-runner-set-gha-rs-manager
  namespace: arc-runners
  labels:
    helm.sh/chart: gha-rs-0.6.1
    app.kubernetes.io/name: arc-runner-set
    app.kubernetes.io/instance: arc-runner-set
    app.kubernetes.io/version: "0.6.1"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/part-of: gha-rs
    actions.github.com/scale-set-name: arc-runner-set
    actions.github.com/scale-set-namespace: arc-runners
    app.kubernetes.io/component: manager-role
  finalizers:
    - actions.github.com/cleanup-protection
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - create
  - delete
  - get
- apiGroups:
  - ""
  resources:
  - pods/status
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - secrets
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
- apiGroups:
  - ""
  resources:
  - serviceaccounts
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
- apiGroups:
  - rbac.authorization.k8s.io
  resources:
  - rolebindings
  verbs:
  - create
  - delete
  - get
  - patch
  - update
- apiGroups:
  - rbac.authorization.k8s.io
  resources:
  - roles
  verbs:
  - create
  - delete
  - get
  - patch
  - update
---
# Source: gha-runner-scale-set/templates/manager_role_binding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: arc-runner-set-gha-rs-manager
  namespace: arc-runners
  labels:
    helm.sh/chart: gha-rs-0.6.1
    app.kubernetes.io/name: arc-runner-set
    app.kubernetes.io/instance: arc-runner-set
    app.kubernetes.io/version: "0.6.1"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/part-of: gha-rs
    actions.github.com/scale-set-name: arc-runner-set
    actions.github.com/scale-set-namespace: arc-runners
    app.kubernetes.io/component: manager-role-binding
  finalizers:
    - actions.github.com/cleanup-protection
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: arc-runner-set-gha-rs-manager
subjects:
- kind: ServiceAccount
  name: 
    arc-gha-rs-controller
  namespace: 
    arc-systems
---
# Source: gha-runner-scale-set/templates/autoscalingrunnerset.yaml
apiVersion: actions.github.com/v1alpha1
kind: AutoscalingRunnerSet
metadata:
  name: arc-runner-set
  namespace: arc-runners
  labels:
    app.kubernetes.io/component: "autoscaling-runner-set"
    helm.sh/chart: gha-rs-0.6.1
    app.kubernetes.io/name: arc-runner-set
    app.kubernetes.io/instance: arc-runner-set
    app.kubernetes.io/version: "0.6.1"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/part-of: gha-rs
    actions.github.com/scale-set-name: arc-runner-set
    actions.github.com/scale-set-namespace: arc-runners
  annotations:
    actions.github.com/cleanup-github-secret-name: arc-runner-set-gha-rs-github-secret
    actions.github.com/cleanup-manager-role-binding: arc-runner-set-gha-rs-manager
    actions.github.com/cleanup-manager-role-name: arc-runner-set-gha-rs-manager
    actions.github.com/cleanup-no-permission-service-account-name: arc-runner-set-gha-rs-no-permission
spec:
  githubConfigUrl: https://github.com/Gallery-Media-Group
  githubConfigSecret: arc-runner-set-gha-rs-github-secret
  maxRunners: 10
  minRunners: 2

  template:
    spec:
      restartPolicy: Never
      serviceAccountName: arc-runner-set-gha-rs-no-permission
      initContainers:
      - command:
        - cp
        - -r
        - -v
        - /home/runner/externals/.
        - /home/runner/tmpDir/
        image: ghcr.io/actions/actions-runner:latest
        name: init-dind-externals
        volumeMounts:
        - mountPath: /home/runner/tmpDir
          name: dind-externals
      containers:

      - name: runner
        command: 
          - /home/runner/run.sh
        image: 
          CUSTOM_IMAGE
        env:
          - 
            name: DOCKER_HOST
            value: unix:///run/docker/docker.sock
        volumeMounts:
          - 
            mountPath: /home/runner/_work
            name: work
          - 
            mountPath: /run/docker
            name: dind-sock
            readOnly: true
      - 
        args:
        - dockerd
        - --host=unix:///run/docker/docker.sock
        - --group=$(DOCKER_GROUP_GID)
        env:
        - name: DOCKER_GROUP_GID
          value: "123"
        image: docker:dind
        name: dind
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /home/runner/_work
          name: work
        - mountPath: /run/docker
          name: dind-sock
        - mountPath: /home/runner/externals
          name: dind-externals
      volumes:
      - emptyDir: {}
        name: work
      - emptyDir: {}
        name: dind-sock
      - emptyDir: {}
        name: dind-externals

k-walsh-gmg commented 11 months ago

Sorry for the giant font! It seems to have just come through that way!

k-walsh-gmg commented 11 months ago

And just for continuity purposes, this is the values file the runner scale set was built from (which should work for custom images according to the above link):

## githubConfigUrl is the GitHub url for where you want to configure runners
## ex: https://github.com/myorg/myrepo or https://github.com/myorg
githubConfigUrl: "https://github.com/Gallery-Media-Group"

## githubConfigSecret is the Kubernetes secret to use when authenticating with the GitHub API.
## You can choose to use a GitHub App or a PAT token
githubConfigSecret:
  ### GitHub Apps Configuration
  ## NOTE: IDs MUST be strings, use quotes
  #github_app_id: ""
  #github_app_installation_id: ""
  #github_app_private_key: |

  ### GitHub PAT Configuration
  github_token: ""
## If you have a pre-defined Kubernetes secret in the same namespace where the gha-runner-scale-set is going to deploy,
## you can also reference it via `githubConfigSecret: pre-defined-secret`.
## You need to make sure your predefined secret has all the required secret data set properly.
##   For a pre-defined secret using GitHub PAT, the secret needs to be created like this:
##   > kubectl create secret generic pre-defined-secret --namespace=my_namespace --from-literal=github_token='ghp_your_pat'
##   For a pre-defined secret using GitHub App, the secret needs to be created like this:
##   > kubectl create secret generic pre-defined-secret --namespace=my_namespace --from-literal=github_app_id=123456 --from-literal=github_app_installation_id=654321 --from-literal=github_app_private_key='-----BEGIN CERTIFICATE-----*******'
# githubConfigSecret: pre-defined-secret

## proxy can be used to define proxy settings that will be used by the
## controller, the listener and the runner of this scale set.
#
# proxy:
#   http:
#     url: http://proxy.com:1234
#     credentialSecretRef: proxy-auth # a secret with `username` and `password` keys
#   https:
#     url: http://proxy.com:1234
#     credentialSecretRef: proxy-auth # a secret with `username` and `password` keys
#   noProxy:
#     - example.com
#     - example.org

# maxRunners is the max number of runners the autoscaling runner set will scale up to.
maxRunners: 10

# minRunners is the min number of runners the autoscaling runner set will scale down to.
minRunners: 2

# runnerGroup: "default"

## name of the runner scale set to create.  Defaults to the helm release name
# runnerScaleSetName: ""

## A self-signed CA certificate for communication with the GitHub server can be
## provided using a config map key selector. If `runnerMountPath` is set, for
## each runner pod ARC will:
## - create a `github-server-tls-cert` volume containing the certificate
##   specified in `certificateFrom`
## - mount that volume on path `runnerMountPath`/{certificate name}
## - set NODE_EXTRA_CA_CERTS environment variable to that same path
## - set RUNNER_UPDATE_CA_CERTS environment variable to "1" (as of version
##   2.303.0 this will instruct the runner to reload certificates on the host)
##
## If any of the above had already been set by the user in the runner pod
## template, ARC will observe those and not overwrite them.
## Example configuration:
#
# githubServerTLS:
#   certificateFrom:
#     configMapKeyRef:
#       name: config-map-name
#       key: ca.crt
#   runnerMountPath: /usr/local/share/ca-certificates/

## Container mode is an object that provides out-of-box configuration
## for dind and kubernetes mode. Template will be modified as documented under the
## template object.
##
## If any customization is required for dind or kubernetes mode, containerMode should remain
## empty, and configuration should be applied to the template.
containerMode:
  # type: "dind"  ## type can be set to dind or kubernetes
#   ## the following is required when containerMode.type=kubernetes
#   kubernetesModeWorkVolumeClaim:
#     accessModes: ["ReadWriteOnce"]
#     # For local testing, use https://github.com/openebs/dynamic-localpv-provisioner/blob/develop/docs/quickstart.md to provide dynamic provision volume with storageClassName: openebs-hostpath
#     storageClassName: "dynamic-blob-storage"
#     resources:
#       requests:
#         storage: 1Gi

## template is the PodSpec for each listener Pod
## For reference: https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-v1/#PodSpec
# listenerTemplate:
#   spec:
#     containers:
#     # Use this section to append additional configuration to the listener container.
#     # If you change the name of the container, the configuration will not be applied to the listener,
#     # and it will be treated as a side-car container.
#     - name: listener
#       securityContext:
#         runAsUser: 1000
#     # Use this section to add the configuration of a side-car container.
#     # Comment it out or remove it if you don't need it.
#     # Spec for this container will be applied as is without any modifications.
#     - name: side-car
#       image: example-sidecar

## template is the PodSpec for each runner Pod
## For reference: https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-v1/#PodSpec
template:
  # template.spec will be modified if you change the container mode
  # with containerMode.type=dind, we will populate the template.spec with following pod spec
  template:
    spec:
      initContainers:
      - name: init-dind-externals
        image: ghcr.io/actions/actions-runner:latest
        command: ["cp", "-r", "-v", "/home/runner/externals/.", "/home/runner/tmpDir/"]
        volumeMounts:
          - name: dind-externals
            mountPath: /home/runner/tmpDir
      containers:
      - name: runner
        image: CUSTOM_IMAGE
        command: ["/home/runner/run.sh"]
        env:
          - name: DOCKER_HOST
            value: unix:///run/docker/docker.sock
        volumeMounts:
          - name: work
            mountPath: /home/runner/_work
          - name: dind-sock
            mountPath: /run/docker
            readOnly: true
      - name: dind
        image: docker:dind
        args:
          - dockerd
          - --host=unix:///run/docker/docker.sock
          - --group=$(DOCKER_GROUP_GID)
        env:
          - name: DOCKER_GROUP_GID
            value: "123"
        securityContext:
          privileged: true
        volumeMounts:
          - name: work
            mountPath: /home/runner/_work
          - name: dind-sock
            mountPath: /run/docker
          - name: dind-externals
            mountPath: /home/runner/externals
      volumes:
      - name: work
        emptyDir: {}
      - name: dind-sock
        emptyDir: {}
      - name: dind-externals
        emptyDir: {}
  ######################################################################################################
  ## with containerMode.type=kubernetes, we will populate the template.spec with following pod spec
  ## template:
  ##   spec:
  ##     containers:
  ##     - name: runner
  ##       image: ghcr.io/actions/actions-runner:latest
  ##       command: ["/home/runner/run.sh"]
  ##       env:
  ##         - name: ACTIONS_RUNNER_CONTAINER_HOOKS
  ##           value: /home/runner/k8s/index.js
  ##         - name: ACTIONS_RUNNER_POD_NAME
  ##           valueFrom:
  ##             fieldRef:
  ##               fieldPath: metadata.name
  ##         - name: ACTIONS_RUNNER_REQUIRE_JOB_CONTAINER
  ##           value: "true"
  ##       volumeMounts:
  ##         - name: work
  ##           mountPath: /home/runner/_work
  ##     volumes:
  ##       - name: work
  ##         ephemeral:
  ##           volumeClaimTemplate:
  ##             spec:
  ##               accessModes: [ "ReadWriteOnce" ]
  ##               storageClassName: "local-path"
  ##               resources:
  ##                 requests:
  ##                   storage: 1Gi
  spec:
    initContainers:
    - name: init-dind-externals
      image: ghcr.io/actions/actions-runner:latest
      command: ["cp", "-r", "-v", "/home/runner/externals/.", "/home/runner/tmpDir/"]
      volumeMounts:
        - name: dind-externals
          mountPath: /home/runner/tmpDir
    containers:
    - name: runner
      image: CUSTOM_IMAGE
      command: ["/home/runner/run.sh"]
      env:
        - name: DOCKER_HOST
          value: unix:///run/docker/docker.sock
      volumeMounts:
        - name: work
          mountPath: /home/runner/_work
        - name: dind-sock
          mountPath: /run/docker
          readOnly: true
    - name: dind
      image: docker:dind
      args:
        - dockerd
        - --host=unix:///run/docker/docker.sock
        - --group=$(DOCKER_GROUP_GID)
      env:
        - name: DOCKER_GROUP_GID
          value: "123"
      securityContext:
        privileged: true
      volumeMounts:
        - name: work
          mountPath: /home/runner/_work
        - name: dind-sock
          mountPath: /run/docker
        - name: dind-externals
          mountPath: /home/runner/externals
    volumes:
    - name: work
      emptyDir: {}
    - name: dind-sock
      emptyDir: {}
    - name: dind-externals
      emptyDir: {}

## Optional controller service account that needs to have required Role and RoleBinding
## to operate this gha-runner-scale-set installation.
## The helm chart will try to find the controller deployment and its service account at installation time.
## In case the helm chart can't find the right service account, you can explicitly pass in the following value
## to help it finish RoleBinding with the right service account.
## Note: if your controller is installed to only watch a single namespace, you have to pass these values explicitly.
controllerServiceAccount:
  namespace: arc-systems
  name: arc-gha-rs-controller

k-walsh-gmg commented 11 months ago

Upon inspecting the AutoscalingRunnerSet, I am seeing some significant YAML formatting issues.

k-walsh-gmg commented 11 months ago

I also attempted manually "fixing" the YAML, which produced the same error of pods starting and stopping. I investigated the events (since I cannot get any cluster logs) and they show that the DinD container is already attached. I removed the init and DinD containers from the spec, and the CNI plugin said there aren't any IP addresses available (there are a significant number available after checking). I am able to install anything else; it is only this, with a custom image (sourced from ghcr.io/actions/actions-runner:latest), that fails. Also noting that the stock Helm deploy in DinD mode seems to work just fine, just not with a custom container.
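
For anyone following along, the events mentioned above were pulled with standard kubectl commands, nothing exotic (the arc-runners namespace is an assumption):

# Inspect recent events, then describe one of the failing runner pods
kubectl get events -n arc-runners --sort-by=.lastTimestamp
kubectl describe pod <runner-pod-name> -n arc-runners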

wherka-ama commented 11 months ago

@k-walsh-gmg : it would be easier to attach the output as a log file, but that's fine. I'm a bit confused, as I don't fully understand your intention.

If it's about influencing the controller, then you can set the image by passing the values.yaml as follows:

image:
  repository: "myrepo/my_image_for_gha-runner-scale-set-controller"
  pullPolicy: IfNotPresent
  # Overrides the image tag whose default is the chart appVersion.
  tag: "your_tag"

However, you need to be careful, as your custom image of the controller needs to be a drop-in replacement with the entrypoint (/manager) preserved and the right version of the software inside. It is also worth mentioning that this image is derived from gcr.io/distroless/static:nonroot. The implication of such a strategy is that it doesn't have a shell, so it is not possible to attach a TTY, i.e. run kubectl exec on it and debug stuff. It's light and secure, but not necessarily flexible and DEV-friendly.
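
If you ever do need to poke inside such a distroless pod, one workaround is an ephemeral debug container instead of kubectl exec. A sketch, assuming the cluster supports ephemeral containers (Kubernetes 1.23+) and the quickstart's namespace:

# Attach a throwaway busybox container targeting the manager's process namespace
kubectl debug -it <controller-pod-name> -n arc-systems \
    --image=busybox --target=manager -- sh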

Now, you also mentioned the AutoscalingRunnerSet => this is a CRD, and it doesn't really come into the picture as an actual Pod when we're talking about the controller chart. The controller chart only ships the CRD definitions, which are installed on the cluster.

If you are talking about the init container and the runner pod spec, that's all about the second helm chart. It's documented here: https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners-with-actions-runner-controller/quickstart-for-actions-runner-controller#configuring-a-runner-scale-set

I think we are mixing several things here.

For customizing the runner pod template, you need to use the values.yaml as follows (it's a bit simplified and not necessarily generic):

## template is the PodSpec for each runner Pod
template:
  spec:
    containers:
    - name: runner
      image: myrepo/my_image_for_runner
      securityContext:
        privileged: true
        runAsUser: 1001
        runAsGroup: 0
      env:
        - name: DOCKER_CLIENT_ADDRESS
          value: unix:///run/user/1001/docker.sock
      volumeMounts:
        - name: work
          mountPath: /home/runner/_work
      resources:
        requests:
          cpu: 100m
          memory: 100Mi
        limits:
          cpu: 1
          memory: 4Gi
    volumes:
    - name: work
      emptyDir: {}

This time round the image is more forgiving, but still, there are some conventions, and if you don't mimic the official images in that regard, the chances of failure are high.
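
A quick sanity check when things fail is to confirm which images the runner pods actually ended up with. A sketch (the arc-runners namespace is an assumption):

# Print each runner pod together with the images of all its containers
kubectl get pods -n arc-runners \
    -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.spec.containers[*].image}{"\n"}{end}'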

Another context is DinD. This is a mechanism which allows you to pretty much forget about the customization of your runner images. What happens is you have a generic image with a docker daemon, and you can run actions in docker, i.e. you can specify the image ref or the Dockerfile in this mode, and a generic runner talks to that docker daemon to start the container from your spec. That's it. There is hardly ever a need to customise it, unless you have a problem with reaching the official repositories.
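
As a hypothetical illustration of that point, with DinD in place the workflow itself can declare whatever tooling image it needs; the runs-on label and the image below are assumptions, not taken from this thread:

jobs:
  build:
    runs-on: arc-runner-set   # name of the runner scale set
    container:
      image: node:20          # any image with the tools the job needs
    steps:
      - run: node --version   # runs inside the node:20 container, not the runner image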

Here is an example of such a case - the values.yaml for the runner scale set again:

## template is the PodSpec for each runner Pod
template:
  spec:
    containers:
    - name: runner
      # image: ghcr.io/actions/actions-runner:latest
      image: myrepo/my_image_for_runner
      command: ["/home/runner/run.sh"]
      securityContext:
        privileged: true
        runAsUser: 1001
        runAsGroup: 0
      env:
        - name: DOCKER_HOST
          value: tcp://localhost:2376
        - name: DOCKER_TLS_VERIFY
          value: "1"
        - name: DOCKER_CERT_PATH
          value: /certs/client
      volumeMounts:
        - name: work
          mountPath: /home/runner/_work
        - name: dind-cert
          mountPath: /certs/client
          readOnly: true
      resources:
        requests:
          cpu: 100m
          memory: 100Mi
        limits:
          cpu: 1
          memory: 2Gi
    - name: dind
      # image: docker:dind
      image: myrepo/my_image_for_dind
      volumeMounts:
        - name: work
          mountPath: /home/runner/_work
        - name: dind-cert
          mountPath: /certs/client
        - name: dind-externals
          mountPath: /home/runner/externals
      resources:
        requests:
          cpu: 100m
          memory: 100Mi
        limits:
          cpu: 1
          memory: 8Gi
    volumes:
    - name: work
      emptyDir: {}
    - name: dind-cert
      emptyDir: {}
    - name: dind-externals
      emptyDir: {}

Nevertheless, I'm yet to understand your problem in full.

I admit that navigating ARC is not super simple if you are not very experienced in this area - I'm not suggesting you are in this position, and I'm not judging you. I spent some time figuring these things out myself ;-)

Hope it helps.

Anyway, I think there is nothing wrong with ARC per se, and this issue is a non-issue really, unless someone from the ARC team says otherwise once we get a fuller picture from your side. Right now it's a bit blurry, I'm afraid.

k-walsh-gmg commented 11 months ago

Hello @wherka-ama! Thanks so much for taking the time here. I have built a custom docker image with the CLI tools that we usually use in our workflows (so they will not have to be installed each time we run a workflow), and have attempted attaching it to the DinD container as well as the runner, with no success. Just to clarify, I am not modifying the controller's image, just the image for the runners/dind which would actually run the workflow (which is the values file I presented, from here: https://github.com/actions/actions-runner-controller/tree/master/charts/gha-runner-scale-set). So a breakdown of the issues I have encountered thus far:

  1. In the runner scale set helm values file, I attempted to add a custom docker image after commenting out the selection of DinD mode in the values and populating the stock DinD template (located in the values file for the runner scale set, not the controller) spec, with the runner container's image being a custom one (the base image is sourced from ghcr.io/actions/actions-runner:latest, which is what I thought the stock containers mostly use, but I could be wrong). The result is that the pods spin up and down (with events showing the CNI plugin reporting a lack of IP addresses).

  2. With a similar setup, I removed all of the other containers except the runner and set the image for the runner to my custom one. This caused the DinD init container to attempt to pull in my image as well, which causes pod failure.

  3. Using the template generated by helm, I corrected the spacing errors that were generated and deployed via kubectl, but this replicated the exhausted IP addresses issue (there is most definitely a surplus of IP addresses in my cluster, and other installs continue to work fine).

I am wondering if the image should be attached to the dind container for this use case, but I have attempted that with a custom dind container (built from the one in the values YAML) to no avail. It would seem that any custom image I attempt to attach fails or is overridden via templates.
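
For completeness, the kind of custom runner image I'm describing is built roughly like this (a minimal sketch; the specific packages are just examples):

# Base on the stock runner so the entrypoint and runner layout stay intact
FROM ghcr.io/actions/actions-runner:latest

# Install extra CLI tooling as root, then drop back to the runner user
USER root
RUN apt-get update \
    && apt-get install -y --no-install-recommends jq zip \
    && rm -rf /var/lib/apt/lists/*
USER runner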

wherka-ama commented 11 months ago

So, ghcr.io/actions/actions-runner:latest is a stock image, but definitely not for the dind. It's fine for a runner as well as for the initContainer, but not for the dind. The sole purpose of the dind container in this setup is to act as a sidecar which facilitates the docker daemon. As for the initContainer, it's just there to prepare the context for the dind, so it has the bits from the runner SDK necessary to mount on each container spun up by the action which is supposed to run in a docker context.

I'm not sure if this part of the architecture of ARC is well explained, to be honest. ARC started as a community effort where many people were very familiar with the internals, as they grew together with the project. It's not so straightforward to grasp the full picture for someone who is just starting with ARC - even for people very familiar with k8s and its landscape.

k-walsh-gmg commented 11 months ago

Ah, got ya. Yeah, I am testing some things here now, but I will let you know if I find a workaround.

nikola-jokic commented 11 months ago

Hey,

I agree with @wherka-ama, this is not really an issue, but it can be difficult to understand how to configure ARC properly. Thank you so much for answering this issue @wherka-ama!

I will close this issue now but @k-walsh-gmg, feel free to comment on it :relaxed:

k-walsh-gmg commented 11 months ago

Hello all! So my issue arose from the second "template" field, which is commented out for the dind configuration. I left the original one commented out to correct it, but I think having it twice (currently the default values file has one commented out and one not, right above) can be a bit confusing.

k-walsh-gmg commented 11 months ago

In particular, right here:

template:
  # template.spec will be modified if you change the container mode
  # with containerMode.type=dind, we will populate the template.spec with following pod spec
  template: