actions / actions-runner-controller

Kubernetes controller for GitHub Actions self-hosted runners
Apache License 2.0
4.56k stars 1.07k forks source link

pod in NotReady state for several minutes after job completes (v0.21.0 -> v0.22.0 upgrade) #1291

Closed nicholasgibson2 closed 2 years ago

nicholasgibson2 commented 2 years ago

After a job completes the runner container exits cleanly, but the pod stays in READY: 1/2 STATUS: NotReady for 10-20 minutes. During this time the runner does not show up in GitHub runners list. This issue does not happen with version v0.21.0

Checks

To Reproduce Steps to reproduce the behavior:

  1. run a GitHub actions job
  2. kubectl get pods -n {runner namespace}

Expected behavior Pods should come back up immediately after job completes and container exits.

Environment (please complete the following information):

Additional context

RunnerDeployment configuration:

apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: aks-runner
  namespace: github
spec:
  replicas: 2
  template:
    spec:
      organization: XXX
      image: my-custom-runner-image:latest
      imagePullPolicy: Always
      imagePullSecrets:
      - name: image-secret

controller logs (while NotReady):

2022-03-30T19:25:09Z    INFO    controller-runtime.metrics  Metrics server is starting to listen    {"addr": "127.0.0.1:8080"}
2022-03-30T19:25:09Z    INFO    actions-runner-controller   Initializing actions-runner-controller  {"github-api-cache-duration": "9m50s", "sync-period": "10m0s", "runner-image": "summerwind/actions-runner:latest", "docker-image": "docker:dind", "common-runnner-labels": null, "watch-namespace": ""}
2022-03-30T19:25:09Z    INFO    controller-runtime.builder  Registering a mutating webhook  {"GVK": "actions.summerwind.dev/v1alpha1, Kind=Runner", "path": "/mutate-actions-summerwind-dev-v1alpha1-runner"}
2022-03-30T19:25:09Z    INFO    controller-runtime.webhook  Registering webhook {"path": "/mutate-actions-summerwind-dev-v1alpha1-runner"}
2022-03-30T19:25:09Z    INFO    controller-runtime.builder  Registering a validating webhook    {"GVK": "actions.summerwind.dev/v1alpha1, Kind=Runner", "path": "/validate-actions-summerwind-dev-v1alpha1-runner"}
2022-03-30T19:25:09Z    INFO    controller-runtime.webhook  Registering webhook {"path": "/validate-actions-summerwind-dev-v1alpha1-runner"}
2022-03-30T19:25:09Z    INFO    controller-runtime.builder  Registering a mutating webhook  {"GVK": "actions.summerwind.dev/v1alpha1, Kind=RunnerDeployment", "path": "/mutate-actions-summerwind-dev-v1alpha1-runnerdeployment"}
2022-03-30T19:25:09Z    INFO    controller-runtime.webhook  Registering webhook {"path": "/mutate-actions-summerwind-dev-v1alpha1-runnerdeployment"}
2022-03-30T19:25:09Z    INFO    controller-runtime.builder  Registering a validating webhook    {"GVK": "actions.summerwind.dev/v1alpha1, Kind=RunnerDeployment", "path": "/validate-actions-summerwind-dev-v1alpha1-runnerdeployment"}
2022-03-30T19:25:09Z    INFO    controller-runtime.webhook  Registering webhook {"path": "/validate-actions-summerwind-dev-v1alpha1-runnerdeployment"}
2022-03-30T19:25:09Z    INFO    controller-runtime.builder  Registering a mutating webhook  {"GVK": "actions.summerwind.dev/v1alpha1, Kind=RunnerReplicaSet", "path": "/mutate-actions-summerwind-dev-v1alpha1-runnerreplicaset"}
2022-03-30T19:25:09Z    INFO    controller-runtime.webhook  Registering webhook {"path": "/mutate-actions-summerwind-dev-v1alpha1-runnerreplicaset"}
2022-03-30T19:25:09Z    INFO    controller-runtime.builder  Registering a validating webhook    {"GVK": "actions.summerwind.dev/v1alpha1, Kind=RunnerReplicaSet", "path": "/validate-actions-summerwind-dev-v1alpha1-runnerreplicaset"}
2022-03-30T19:25:09Z    INFO    controller-runtime.webhook  Registering webhook {"path": "/validate-actions-summerwind-dev-v1alpha1-runnerreplicaset"}
2022-03-30T19:25:09Z    INFO    controller-runtime.webhook  Registering webhook {"path": "/mutate-runner-set-pod"}
2022-03-30T19:25:09Z    INFO    actions-runner-controller   starting manager
2022-03-30T19:25:09Z    INFO    controller-runtime.webhook.webhooks Starting webhook server
2022-03-30T19:25:09Z    INFO    controller-runtime.certwatcher  Updated current TLS certificate
2022-03-30T19:25:09Z    INFO    Starting server {"path": "/metrics", "kind": "metrics", "addr": "127.0.0.1:8080"}
2022-03-30T19:25:09Z    INFO    controller-runtime.webhook  Serving webhook server  {"host": "", "port": 9443}
2022-03-30T19:25:09Z    INFO    controller-runtime.certwatcher  Starting certificate watcher
I0330 19:25:09.946480       1 leaderelection.go:248] attempting to acquire leader lease actions-runner-system/actions-runner-controller...
I0330 19:25:27.209232       1 leaderelection.go:258] successfully acquired lease actions-runner-system/actions-runner-controller
2022-03-30T19:25:27Z    DEBUG   events  Normal  {"object": {"kind":"ConfigMap","namespace":"actions-runner-system","name":"actions-runner-controller","uid":"19f41ab2-cd55-4dc4-adda-ffef05653cd2","apiVersion":"v1","resourceVersion":"3087774"}, "reason": "LeaderElection", "message": "actions-runner-controller-7f66786b86-dtx95_7abc07f1-8ab3-49c5-a625-47481d8f0c45 became leader"}
2022-03-30T19:25:27Z    DEBUG   events  Normal  {"object": {"kind":"Lease","namespace":"actions-runner-system","name":"actions-runner-controller","uid":"d52a5fcf-40da-4ce8-8bbf-639160009bf6","apiVersion":"coordination.k8s.io/v1","resourceVersion":"3087775"}, "reason": "LeaderElection", "message": "actions-runner-controller-7f66786b86-dtx95_7abc07f1-8ab3-49c5-a625-47481d8f0c45 became leader"}
2022-03-30T19:25:27Z    INFO    controller.runner-controller    Starting EventSource    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "Runner", "source": "kind source: *v1alpha1.Runner"}
2022-03-30T19:25:27Z    INFO    controller.runner-controller    Starting EventSource    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "Runner", "source": "kind source: *v1.Pod"}
2022-03-30T19:25:27Z    INFO    controller.runner-controller    Starting Controller {"reconciler group": "actions.summerwind.dev", "reconciler kind": "Runner"}
2022-03-30T19:25:27Z    INFO    controller.runnerdeployment-controller  Starting EventSource    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerDeployment", "source": "kind source: *v1alpha1.RunnerDeployment"}
2022-03-30T19:25:27Z    INFO    controller.runnerpod-controller Starting EventSource    {"reconciler group": "", "reconciler kind": "Pod", "source": "kind source: *v1.Pod"}
2022-03-30T19:25:27Z    INFO    controller.runnerpod-controller Starting Controller {"reconciler group": "", "reconciler kind": "Pod"}
2022-03-30T19:25:27Z    INFO    controller.runnerreplicaset-controller  Starting EventSource    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerReplicaSet", "source": "kind source: *v1alpha1.RunnerReplicaSet"}
2022-03-30T19:25:27Z    INFO    controller.runnerreplicaset-controller  Starting EventSource    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerReplicaSet", "source": "kind source: *v1alpha1.Runner"}
2022-03-30T19:25:27Z    INFO    controller.runnerreplicaset-controller  Starting Controller {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerReplicaSet"}
2022-03-30T19:25:27Z    INFO    controller.runnerdeployment-controller  Starting EventSource    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerDeployment", "source": "kind source: *v1alpha1.RunnerReplicaSet"}
2022-03-30T19:25:27Z    INFO    controller.runnerdeployment-controller  Starting Controller {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerDeployment"}
2022-03-30T19:25:27Z    INFO    controller.runnerset-controller Starting EventSource    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerSet", "source": "kind source: *v1alpha1.RunnerSet"}
2022-03-30T19:25:27Z    INFO    controller.runnerset-controller Starting EventSource    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerSet", "source": "kind source: *v1.StatefulSet"}
2022-03-30T19:25:27Z    INFO    controller.runnerset-controller Starting Controller {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerSet"}
2022-03-30T19:25:27Z    INFO    controller.horizontalrunnerautoscaler-controller    Starting EventSource    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "HorizontalRunnerAutoscaler", "source": "kind source: *v1alpha1.HorizontalRunnerAutoscaler"}
2022-03-30T19:25:27Z    INFO    controller.horizontalrunnerautoscaler-controller    Starting Controller {"reconciler group": "actions.summerwind.dev", "reconciler kind": "HorizontalRunnerAutoscaler"}
2022-03-30T19:25:27Z    INFO    controller.runnerreplicaset-controller  Starting workers    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerReplicaSet", "worker count": 1}
2022-03-30T19:25:27Z    INFO    controller.runnerset-controller Starting workers    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerSet", "worker count": 1}
2022-03-30T19:25:27Z    INFO    controller.runner-controller    Starting workers    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "Runner", "worker count": 1}
2022-03-30T19:25:27Z    INFO    controller.runnerdeployment-controller  Starting workers    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerDeployment", "worker count": 1}
2022-03-30T19:25:27Z    INFO    controller.horizontalrunnerautoscaler-controller    Starting workers    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "HorizontalRunnerAutoscaler", "worker count": 1}
2022-03-30T19:25:27Z    INFO    controller.runnerpod-controller Starting workers    {"reconciler group": "", "reconciler kind": "Pod", "worker count": 1}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks received request    {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runnerdeployment", "UID": "f87fae1d-f823-433c-b506-37cf7e4fd293", "kind": "actions.summerwind.dev/v1alpha1, Kind=RunnerDeployment", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runnerdeployments"}}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks wrote response  {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runnerdeployment", "code": 200, "reason": "", "UID": "f87fae1d-f823-433c-b506-37cf7e4fd293", "allowed": true}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks received request    {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runnerdeployment", "UID": "be4023bb-ec9b-41de-a1d6-f885726438cd", "kind": "actions.summerwind.dev/v1alpha1, Kind=RunnerDeployment", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runnerdeployments"}}
2022-03-30T19:25:39Z    INFO    runnerdeployment-resource   validate resource to be created {"name": "aks-runner"}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks wrote response  {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runnerdeployment", "code": 200, "reason": "", "UID": "be4023bb-ec9b-41de-a1d6-f885726438cd", "allowed": true}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks received request    {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runnerreplicaset", "UID": "430c24ff-2d36-4fc3-9e9a-76a892deb37d", "kind": "actions.summerwind.dev/v1alpha1, Kind=RunnerReplicaSet", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runnerreplicasets"}}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks wrote response  {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runnerreplicaset", "code": 200, "reason": "", "UID": "430c24ff-2d36-4fc3-9e9a-76a892deb37d", "allowed": true}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks received request    {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runnerreplicaset", "UID": "e068eb4e-b5e1-49a3-a4ea-012e9c045fab", "kind": "actions.summerwind.dev/v1alpha1, Kind=RunnerReplicaSet", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runnerreplicasets"}}
2022-03-30T19:25:39Z    INFO    runnerreplicaset-resource   validate resource to be created {"name": "aks-runner-dv722"}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks wrote response  {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runnerreplicaset", "code": 200, "reason": "", "UID": "e068eb4e-b5e1-49a3-a4ea-012e9c045fab", "allowed": true}
2022-03-30T19:25:39Z    INFO    actions-runner-controller.runnerdeployment  Created runnerreplicaset    {"runnerdeployment": "github/aks-runner", "runnerreplicaset": "aks-runner-dv722"}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks received request    {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "UID": "35bd9927-37ef-4838-888b-8dec6c8b8c41", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks wrote response  {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "35bd9927-37ef-4838-888b-8dec6c8b8c41", "allowed": true}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks received request    {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "UID": "aa695ed8-fb2a-41e9-9746-08fc48ca2689", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
2022-03-30T19:25:39Z    INFO    runner-resource validate resource to be created {"name": "aks-runner-dv722-tjvrq"}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks wrote response  {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "aa695ed8-fb2a-41e9-9746-08fc48ca2689", "allowed": true}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks received request    {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "UID": "8af470e9-0f47-4522-b85d-fccb7e485ab8", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks wrote response  {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "8af470e9-0f47-4522-b85d-fccb7e485ab8", "allowed": true}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks received request    {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "UID": "e11fcba2-762b-4e6a-b52d-c5afac98b3b0", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks wrote response  {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "e11fcba2-762b-4e6a-b52d-c5afac98b3b0", "allowed": true}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks received request    {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "UID": "81f9d82b-575d-4429-bac3-d03b1d96753c", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
2022-03-30T19:25:39Z    INFO    runner-resource validate resource to be updated {"name": "aks-runner-dv722-tjvrq"}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks wrote response  {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "81f9d82b-575d-4429-bac3-d03b1d96753c", "allowed": true}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks received request    {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "UID": "3bda37af-f717-42ef-84e7-ff610206a915", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
2022-03-30T19:25:39Z    INFO    runner-resource validate resource to be created {"name": "aks-runner-dv722-8sbzn"}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks wrote response  {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "3bda37af-f717-42ef-84e7-ff610206a915", "allowed": true}
2022-03-30T19:25:39Z    DEBUG   actions-runner-controller.runnerreplicaset  Created replica(s)  {"runnerreplicaset": "github/aks-runner-dv722", "lastSyncTime": null, "effectiveTime": "<nil>", "templateHashDesired": "595777c875", "replicasDesired": 2, "replicasPending": 0, "replicasRunning": 0, "replicasMaybeRunning": 0, "templateHashObserved": [], "created": 2}
2022-03-30T19:25:39Z    DEBUG   actions-runner-controller.runnerreplicaset  Skipped reconcilation because owner is not synced yet   {"runnerreplicaset": "github/aks-runner-dv722", "owner": "github/aks-runner-dv722-tjvrq", "pods": null}
2022-03-30T19:25:39Z    INFO    actions-runner-controller.runner    Updated registration token  {"runner": "aks-runner-dv722-tjvrq", "repository": ""}
2022-03-30T19:25:39Z    DEBUG   events  Normal  {"object": {"kind":"Runner","namespace":"github","name":"aks-runner-dv722-tjvrq","uid":"a770bb7c-6fc7-430d-9f7f-ece337f3eaef","apiVersion":"actions.summerwind.dev/v1alpha1","resourceVersion":"3087860"}, "reason": "RegistrationTokenUpdated", "message": "Successfully update registration token"}
2022-03-30T19:25:39Z    DEBUG   actions-runner-controller.runnerreplicaset  Skipped reconcilation because owner is not synced yet   {"runnerreplicaset": "github/aks-runner-dv722", "owner": "github/aks-runner-dv722-tjvrq", "pods": null}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks received request    {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "UID": "7dd4eb05-9452-41e1-9f2a-b7a111baf4ef", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks wrote response  {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "7dd4eb05-9452-41e1-9f2a-b7a111baf4ef", "allowed": true}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks received request    {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "UID": "da8adf83-9a57-406b-a76b-3a7d274a594c", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
2022-03-30T19:25:39Z    INFO    runner-resource validate resource to be updated {"name": "aks-runner-dv722-8sbzn"}
2022-03-30T19:25:39Z    DEBUG   controller-runtime.webhook.webhooks wrote response  {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "da8adf83-9a57-406b-a76b-3a7d274a594c", "allowed": true}
2022-03-30T19:25:39Z    DEBUG   actions-runner-controller.runnerreplicaset  Skipped reconcilation because owner is not synced yet   {"runnerreplicaset": "github/aks-runner-dv722", "owner": "github/aks-runner-dv722-tjvrq", "pods": null}
2022-03-30T19:25:40Z    DEBUG   controller-runtime.webhook.webhooks received request    {"webhook": "/mutate-runner-set-pod", "UID": "043dc320-32d3-4da8-864d-904a0af2307d", "kind": "/v1, Kind=Pod", "resource": {"group":"","version":"v1","resource":"pods"}}
2022-03-30T19:25:40Z    DEBUG   controller-runtime.webhook.webhooks wrote response  {"webhook": "/mutate-runner-set-pod", "code": 200, "reason": "", "UID": "043dc320-32d3-4da8-864d-904a0af2307d", "allowed": true}
2022-03-30T19:25:40Z    INFO    actions-runner-controller.runner    Created runner pod  {"runner": "github/aks-runner-dv722-tjvrq", "repository": ""}
2022-03-30T19:25:40Z    DEBUG   events  Normal  {"object": {"kind":"Runner","namespace":"github","name":"aks-runner-dv722-tjvrq","uid":"a770bb7c-6fc7-430d-9f7f-ece337f3eaef","apiVersion":"actions.summerwind.dev/v1alpha1","resourceVersion":"3087862"}, "reason": "PodCreated", "message": "Created pod 'aks-runner-dv722-tjvrq'"}
2022-03-30T19:25:40Z    INFO    actions-runner-controller.runner    Updated registration token  {"runner": "aks-runner-dv722-8sbzn", "repository": ""}
2022-03-30T19:25:40Z    DEBUG   events  Normal  {"object": {"kind":"Runner","namespace":"github","name":"aks-runner-dv722-8sbzn","uid":"784be9dd-cb3b-4126-940e-609ec5c6dba9","apiVersion":"actions.summerwind.dev/v1alpha1","resourceVersion":"3087865"}, "reason": "RegistrationTokenUpdated", "message": "Successfully update registration token"}
2022-03-30T19:25:40Z    DEBUG   actions-runner-controller.runnerreplicaset  Skipped reconcilation because owner is not synced yet   {"runnerreplicaset": "github/aks-runner-dv722", "owner": "github/aks-runner-dv722-tjvrq", "pods": [{"kind":"Pod","apiVersion":"v1","metadata":{"name":"aks-runner-dv722-tjvrq","namespace":"github","uid":"76cc58b0-07a7-4810-866c-401a998bf76a","resourceVersion":"3087870","creationTimestamp":"2022-03-30T19:25:40Z","labels":{"actions-runner-controller/inject-registration-token":"true","pod-template-hash":"78c748848","runner-deployment-name":"aks-runner","runner-template-hash":"595777c875","runnerset-name":"aks-runner-dv722-tjvrq"},"annotations":{"actions-runner-controller/token-expires-at":"2022-03-30T13:25:39-07:00","sync-time":"2022-03-30T19:25:39Z"},"ownerReferences":[{"apiVersion":"actions.summerwind.dev/v1alpha1","kind":"Runner","name":"aks-runner-dv722-tjvrq","uid":"a770bb7c-6fc7-430d-9f7f-ece337f3eaef","controller":true,"blockOwnerDeletion":true}],"managedFields":[{"manager":"manager","operation":"Update","apiVersion":"v1","time":"2022-03-30T19:25:40Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:annotations":{".":{},"f:sync-time":{}},"f:labels":{".":{},"f:actions-runner-controller/inject-registration-token":{},"f:pod-template-hash":{},"f:runner-deployment-name":{},"f:runner-template-hash":{},"f:runnerset-name":{}},"f:ownerReferences":{".":{},"k:{\"uid\":\"a770bb7c-6fc7-430d-9f7f-ece337f3eaef\"}":{}}},"f:spec":{"f:containers":{"k:{\"name\":\"docker\"}":{".":{},"f:env":{".":{},"k:{\"name\":\"DOCKER_TLS_CERTDIR\"}":{".":{},"f:name":{},"f:value":{}}},"f:image":{},"f:imagePullPolicy":{},"f:name":{},"f:resources":{},"f:securityContext":{".":{},"f:privileged":{}},"f:terminationMessagePath":{},"f:terminationMessagePolicy":{},"f:volumeMounts":{".":{},"k:{\"mountPath\":\"/certs/client\"}":{".":{},"f:mountPath":{},"f:name":{}},"k:{\"mountPath\":\"/runner\"}":{".":{},"f:mountPath":{},"f:name":{}},"k:{\"mountPath\":\"/runner/_work\"}":{".":{},"f:mountPath":{},"f:name":{}}}},"k:{\"name\":\"runner\"}":{".":{},"f:env":{".":{},"k:{\"name\":\"DOCKERD_IN_RUNNER\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"DOCKER_CERT_PATH\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"DOCKER_ENABLED\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"DOCKER_HOST\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"DOCKER_TLS_VERIFY\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"GITHUB_URL\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"RUNNER_ENTERPRISE\"}":{".":{},"f:name":{}},"k:{\"name\":\"RUNNER_EPHEMERAL\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"RUNNER_FEATURE_FLAG_EPHEMERAL\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"RUNNER_GROUP\"}":{".":{},"f:name":{}},"k:{\"name\":\"RUNNER_LABELS\"}":{".":{},"f:name":{}},"k:{\"name\":\"RUNNER_NAME\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"RUNNER_ORG\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"RUNNER_REPO\"}":{".":{},"f:name":{}},"k:{\"name\":\"RUNNER_TOKEN\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"RUNNER_WORKDIR\"}":{".":{},"f:name":{},"f:value":{}}},"f:image":{},"f:imagePullPolicy":{},"f:name":{},"f:resources":{},"f:securityContext":{".":{},"f:privileged":{}},"f:terminationMessagePath":{},"f:terminationMessagePolicy":{},"f:volumeMounts":{".":{},"k:{\"mountPath\":\"/certs/client\"}":{".":{},"f:mountPath":{},"f:name":{},"f:readOnly":{}},"k:{\"mountPath\":\"/runner\"}":{".":{},"f:mountPath":{},"f:name":{}},"k:{\"mountPath\":\"/runner/_work\"}":{".":{},"f:mountPath":{},"f:name":{}}}}},"f:dnsPolicy":{},"f:enableServiceLinks":{},"f:imagePullSecrets":{".":{},"k:{\"name\":\"image-secret\"}":{}},"f:restartPolicy":{},"f:schedulerName":{},"f:securityContext":{},"f:terminationGracePeriodSeconds":{},"f:volumes":{".":{},"k:{\"name\":\"certs-client\"}":{".":{},"f:emptyDir":{},"f:name":{}},"k:{\"name\":\"runner\"}":{".":{},"f:emptyDir":{},"f:name":{}},"k:{\"name\":\"work\"}":{".":{},"f:emptyDir":{},"f:name":{}}}}}}]},"spec":{"volumes":[{"name":"runner","emptyDir":{}},{"name":"work","emptyDir":{}},{"name":"certs-client","emptyDir":{}},{"name":"kube-api-access-2l8tz","projected":{"sources":[{"serviceAccountToken":{"expirationSeconds":3607,"path":"token"}},{"configMap":{"name":"kube-root-ca.crt","items":[{"key":"ca.crt","path":"ca.crt"}]}},{"downwardAPI":{"items":[{"path":"namespace","fieldRef":{"apiVersion":"v1","fieldPath":"metadata.namespace"}}]}}],"defaultMode":420}}],"containers":[{"name":"runner","image":"XXX","env":[{"name":"RUNNER_ORG","value":"XXX"},{"name":"RUNNER_REPO"},{"name":"RUNNER_ENTERPRISE"},{"name":"RUNNER_LABELS"},{"name":"RUNNER_GROUP"},{"name":"DOCKER_ENABLED","value":"true"},{"name":"DOCKERD_IN_RUNNER","value":"false"},{"name":"GITHUB_URL","value":"https://github.com/"},{"name":"RUNNER_WORKDIR","value":"/runner/_work"},{"name":"RUNNER_EPHEMERAL","value":"true"},{"name":"DOCKER_HOST","value":"tcp://localhost:2376"},{"name":"DOCKER_TLS_VERIFY","value":"1"},{"name":"DOCKER_CERT_PATH","value":"/certs/client"},{"name":"RUNNER_FEATURE_FLAG_EPHEMERAL","value":"true"},{"name":"RUNNER_NAME","value":"aks-runner-dv722-tjvrq"},{"name":"RUNNER_TOKEN","value":"XXX"}],"resources":{},"volumeMounts":[{"name":"runner","mountPath":"/runner"},{"name":"work","mountPath":"/runner/_work"},{"name":"certs-client","readOnly":true,"mountPath":"/certs/client"},{"name":"kube-api-access-2l8tz","readOnly":true,"mountPath":"/var/run/secrets/kubernetes.io/serviceaccount"}],"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"Always","securityContext":{"privileged":false}},{"name":"docker","image":"docker:dind","env":[{"name":"DOCKER_TLS_CERTDIR","value":"/certs"}],"resources":{},"volumeMounts":[{"name":"runner","mountPath":"/runner"},{"name":"certs-client","mountPath":"/certs/client"},{"name":"work","mountPath":"/runner/_work"},{"name":"kube-api-access-2l8tz","readOnly":true,"mountPath":"/var/run/secrets/kubernetes.io/serviceaccount"}],"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"IfNotPresent","securityContext":{"privileged":true}}],"restartPolicy":"OnFailure","terminationGracePeriodSeconds":30,"dnsPolicy":"ClusterFirst","serviceAccountName":"default","serviceAccount":"default","nodeName":"XXX","securityContext":{},"imagePullSecrets":[{"name":"image-secret"}],"schedulerName":"default-scheduler","tolerations":[{"key":"node.kubernetes.io/not-ready","operator":"Exists","effect":"NoExecute","tolerationSeconds":300},{"key":"node.kubernetes.io/unreachable","operator":"Exists","effect":"NoExecute","tolerationSeconds":300}],"priority":0,"enableServiceLinks":true,"preemptionPolicy":"PreemptLowerPriority"},"status":{"phase":"Pending","conditions":[{"type":"PodScheduled","status":"True","lastProbeTime":null,"lastTransitionTime":"2022-03-30T19:25:40Z"}],"qosClass":"BestEffort"}}]}
2022-03-30T19:25:40Z    DEBUG   actions-runner-controller.runnerreplicaset  Skipped reconcilation because owner is not synced yet   {"runnerreplicaset": "github/aks-runner-dv722", "owner": "github/aks-runner-dv722-8sbzn", "pods": null}
2022-03-30T19:25:40Z    DEBUG   controller-runtime.webhook.webhooks received request    {"webhook": "/mutate-runner-set-pod", "UID": "ee4e4dee-7984-4e90-b9e3-4df66c50f5f0", "kind": "/v1, Kind=Pod", "resource": {"group":"","version":"v1","resource":"pods"}}
2022-03-30T19:25:40Z    DEBUG   controller-runtime.webhook.webhooks wrote response  {"webhook": "/mutate-runner-set-pod", "code": 200, "reason": "", "UID": "ee4e4dee-7984-4e90-b9e3-4df66c50f5f0", "allowed": true}
2022-03-30T19:25:40Z    INFO    actions-runner-controller.runner    Created runner pod  {"runner": "github/aks-runner-dv722-8sbzn", "repository": ""}
2022-03-30T19:25:40Z    DEBUG   events  Normal  {"object": {"kind":"Runner","namespace":"github","name":"aks-runner-dv722-8sbzn","uid":"784be9dd-cb3b-4126-940e-609ec5c6dba9","apiVersion":"actions.summerwind.dev/v1alpha1","resourceVersion":"3087872"}, "reason": "PodCreated", "message": "Created pod 'aks-runner-dv722-8sbzn'"}
2022-03-30T19:25:41Z    DEBUG   actions-runner-controller.runner    Runner appears to have been registered and running. {"runner": "github/aks-runner-dv722-tjvrq", "podCreationTimestamp": "2022-03-30 19:25:40 +0000 UTC"}
2022-03-30T19:25:42Z    DEBUG   actions-runner-controller.runner    Runner appears to have been registered and running. {"runner": "github/aks-runner-dv722-8sbzn", "podCreationTimestamp": "2022-03-30 19:25:40 +0000 UTC"}

controller logs (once pod back in Ready state):

2022-03-30T19:43:32Z    DEBUG   controller-runtime.webhook.webhooks received request    {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "UID": "51e0fe70-c47d-4d06-823f-7380e5de7cc1", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
2022-03-30T19:43:32Z    DEBUG   controller-runtime.webhook.webhooks received request    {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "UID": "f012aafc-b03c-46bc-b4be-db4372be4d24", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
2022-03-30T19:43:32Z    DEBUG   controller-runtime.webhook.webhooks wrote response  {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "f012aafc-b03c-46bc-b4be-db4372be4d24", "allowed": true}
2022-03-30T19:43:32Z    DEBUG   controller-runtime.webhook.webhooks wrote response  {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "51e0fe70-c47d-4d06-823f-7380e5de7cc1", "allowed": true}
2022-03-30T19:43:32Z    DEBUG   controller-runtime.webhook.webhooks received request    {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "UID": "1b21a26a-19b4-4700-b48b-8a0bf314d543", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
2022-03-30T19:43:32Z    INFO    runner-resource validate resource to be created {"name": "aks-runner-dv722-7zssb"}
2022-03-30T19:43:32Z    DEBUG   controller-runtime.webhook.webhooks wrote response  {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "1b21a26a-19b4-4700-b48b-8a0bf314d543", "allowed": true}
2022-03-30T19:43:32Z    DEBUG   controller-runtime.webhook.webhooks received request    {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "UID": "f211c572-5dc9-4762-953e-47ef290308cd", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
2022-03-30T19:43:32Z    INFO    runner-resource validate resource to be updated {"name": "aks-runner-dv722-8sbzn"}
2022-03-30T19:43:32Z    DEBUG   controller-runtime.webhook.webhooks wrote response  {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "f211c572-5dc9-4762-953e-47ef290308cd", "allowed": true}
2022-03-30T19:43:32Z    DEBUG   actions-runner-controller.runnerreplicaset  Created replica(s)  {"runnerreplicaset": "github/aks-runner-dv722", "lastSyncTime": "2022-03-30T19:25:39Z", "effectiveTime": "<nil>", "templateHashDesired": "595777c875", "replicasDesired": 2, "replicasPending": 0, "replicasRunning": 1, "replicasMaybeRunning": 1, "templateHashObserved": ["595777c875"], "created": 1}
2022-03-30T19:43:32Z    INFO    actions-runner-controller.runner    Removed finalizer   {"runner": "github/aks-runner-dv722-8sbzn"}
2022-03-30T19:43:32Z    DEBUG   actions-runner-controller.runnerreplicaset  Skipped reconcilation because owner is not synced yet   {"runnerreplicaset": "github/aks-runner-dv722", "owner": "github/aks-runner-dv722-7zssb", "pods": null}
2022-03-30T19:43:32Z    DEBUG   actions-runner-controller.runnerreplicaset  Skipped reconcilation because owner is not synced yet   {"runnerreplicaset": "github/aks-runner-dv722", "owner": "github/aks-runner-dv722-7zssb", "pods": null}
2022-03-30T19:43:33Z    DEBUG   controller-runtime.webhook.webhooks received request    {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "UID": "881a2fab-7a76-43f9-bd3c-2508119eb9dd", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
2022-03-30T19:43:33Z    DEBUG   controller-runtime.webhook.webhooks wrote response  {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "881a2fab-7a76-43f9-bd3c-2508119eb9dd", "allowed": true}
2022-03-30T19:43:33Z    DEBUG   controller-runtime.webhook.webhooks received request    {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "UID": "f38bccf4-f102-4649-8f99-43ebcde4a828", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
2022-03-30T19:43:33Z    INFO    runner-resource validate resource to be updated {"name": "aks-runner-dv722-7zssb"}
2022-03-30T19:43:33Z    DEBUG   controller-runtime.webhook.webhooks wrote response  {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "f38bccf4-f102-4649-8f99-43ebcde4a828", "allowed": true}
2022-03-30T19:43:33Z    DEBUG   actions-runner-controller.runnerreplicaset  Skipped reconcilation because owner is not synced yet   {"runnerreplicaset": "github/aks-runner-dv722", "owner": "github/aks-runner-dv722-7zssb", "pods": null}
2022-03-30T19:43:33Z    INFO    actions-runner-controller.runnerpod Runner pod has been stopped with a successful status.   {"runnerpod": "github/aks-runner-dv722-8sbzn"}
2022-03-30T19:43:33Z    DEBUG   actions-runner-controller.runnerreplicaset  Skipped reconcilation because owner is not synced yet   {"runnerreplicaset": "github/aks-runner-dv722", "owner": "github/aks-runner-dv722-7zssb", "pods": null}
2022-03-30T19:43:33Z    INFO    actions-runner-controller.runner    Updated registration token  {"runner": "aks-runner-dv722-7zssb", "repository": ""}
2022-03-30T19:43:33Z    DEBUG   events  Normal  {"object": {"kind":"Runner","namespace":"github","name":"aks-runner-dv722-7zssb","uid":"6f85f2b4-2c28-4a0b-8415-eb42b7fc3ea3","apiVersion":"actions.summerwind.dev/v1alpha1","resourceVersion":"3094558"}, "reason": "RegistrationTokenUpdated", "message": "Successfully update registration token"}
2022-03-30T19:43:33Z    DEBUG   controller-runtime.webhook.webhooks received request    {"webhook": "/mutate-runner-set-pod", "UID": "bdc015c8-23d2-48f7-aac2-de7c41239dfb", "kind": "/v1, Kind=Pod", "resource": {"group":"","version":"v1","resource":"pods"}}
2022-03-30T19:43:33Z    DEBUG   controller-runtime.webhook.webhooks wrote response  {"webhook": "/mutate-runner-set-pod", "code": 200, "reason": "", "UID": "bdc015c8-23d2-48f7-aac2-de7c41239dfb", "allowed": true}
2022-03-30T19:43:33Z    INFO    actions-runner-controller.runner    Created runner pod  {"runner": "github/aks-runner-dv722-7zssb", "repository": ""}
2022-03-30T19:43:33Z    DEBUG   events  Normal  {"object": {"kind":"Runner","namespace":"github","name":"aks-runner-dv722-7zssb","uid":"6f85f2b4-2c28-4a0b-8415-eb42b7fc3ea3","apiVersion":"actions.summerwind.dev/v1alpha1","resourceVersion":"3094561"}, "reason": "PodCreated", "message": "Created pod 'aks-runner-dv722-7zssb'"}
2022-03-30T19:43:34Z    DEBUG   actions-runner-controller.runner    Runner appears to have been registered and running. {"runner": "github/aks-runner-dv722-7zssb", "podCreationTimestamp": "2022-03-30 19:43:33 +0000 UTC"}

runner logs:

Docker enabled runner detected and Docker daemon wait is enabled
Waiting until Docker is avaliable or the timeout is reached
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
Github endpoint URL https://github.com/
Passing --ephemeral to config.sh to enable the ephemeral runner.
Configuring the runner.

--------------------------------------------------------------------------------
|        ____ _ _   _   _       _          _        _   _                      |
|       / ___(_) |_| | | |_   _| |__      / \   ___| |_(_) ___  _ __  ___      |
|      | |  _| | __| |_| | | | | '_ \    / _ \ / __| __| |/ _ \| '_ \/ __|     |
|      | |_| | | |_|  _  | |_| | |_) |  / ___ \ (__| |_| | (_) | | | \__ \     |
|       \____|_|\__|_| |_|\__,_|_.__/  /_/   \_\___|\__|_|\___/|_| |_|___/     |
|                                                                              |
|                       Self-hosted runner registration                        |
|                                                                              |
--------------------------------------------------------------------------------

# Authentication

√ Connected to GitHub

# Runner Registration

√ Runner successfully added
√ Runner connection is good

# Runner settings

√ Settings Saved.

Runner successfully configured.
{
  "agentId": 132,
  "agentName": "aks-runner-dv722-8sbzn",
  "poolId": 1,
  "poolName": "Default",
  "ephemeral": true,
  "serverUrl": "https://pipelines.actions.githubusercontent.com/XXX",
  "gitHubUrl": "https://github.com/XXX",
  "workFolder": "/runner/_work"
}13,14d12
< nodever=${GITHUB_ACTIONS_RUNNER_FORCED_NODE_VERSION:-node16}
<
15a14
>
17c16
< ./externals/$nodever/bin/node ./bin/RunnerService.js &
---
> ./externals/node12/bin/node ./bin/RunnerService.js $* &
21c20
< wait $PID
---
> wait $PID
\ No newline at end of file
19c19
< var runService = function () {
---
> var runService = function() {
23c23
<     if (!stopping) {
---
>     if(!stopping) {
27c27
<                 listener = childProcess.spawn(listenerExePath, ['run'], { env: process.env });
---
>                 listener = childProcess.spawn(listenerExePath, ['run'].concat(process.argv.slice(3)), { env: process.env });
30c30
<                 listener = childProcess.spawn(listenerExePath, ['run', '--startuptype', 'service'], { env: process.env });
---
>                 listener = childProcess.spawn(listenerExePath, ['run', '--startuptype', 'service'].concat(process.argv.slice(2)), { env: process.env });
33c33
<             console.log(`Started listener process, pid: ${listener.pid}`);
---
>             console.log('Started listener process');
43,46d42
<             listener.on("error", (err) => {
<                 console.log(`Runner listener fail to start with error ${err.message}`);
<             });
<
64c60
<                 if (!stopping) {
---
>                 if(!stopping) {
69c65
<         } catch (ex) {
---
>         } catch(ex) {
78c74
< var gracefulShutdown = function (code) {
---
> var gracefulShutdown = function(code) {
85,86c81
<         console.log('Sending SIGKILL to runner listener');
<         setTimeout(() => listener.kill('SIGKILL'), 30000).unref();
---
>         // TODO wait for 30 seconds and send a SIGKILL
96c91
< });
---
> });
\ No newline at end of file
.path=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/runner/.local/bin
Starting Runner listener with startup type: service
Started listener process
Started running service

√ Connected to GitHub

Current runner version: '2.288.1'
2022-03-30 19:25:52Z: Listening for Jobs
2022-03-30 19:27:25Z: Running job: greet
2022-03-30 19:27:31Z: Job greet completed with result: Succeeded
√ Removed .credentials
√ Removed .runner
Runner listener exited with error code 0
Runner listener exit with 0 return code, stop the service, no retry needed.

kubectl describe pod aks-runner-dv722-8sbzn -n github while pod in NotReady state

Name:         aks-runner-dv722-8sbzn
Namespace:    github
Priority:     0
Node:         XXX
Start Time:   Wed, 30 Mar 2022 12:25:40 -0700
Labels:       actions-runner-controller/inject-registration-token=true
              pod-template-hash=78c748848
              runner-deployment-name=aks-runner
              runner-template-hash=595777c875
              runnerset-name=aks-runner-dv722-8sbzn
Annotations:  actions-runner-controller/token-expires-at: 2022-03-30T13:25:39-07:00
              actions-runner/id: 132
              cni.projectcalico.org/containerID: 6d87831d6d808db95ec7e275c7a8ed70000c5d3d2ccd5d0ca985389b9cb2861c
              cni.projectcalico.org/podIP: XXX
              cni.projectcalico.org/podIPs: XXX
              sync-time: 2022-03-30T19:25:39Z
Status:       Running
IP:           XXX
IPs:
  IP:           XXX
Controlled By:  Runner/aks-runner-dv722-8sbzn
Containers:
  runner:
    Container ID:   containerd://e4ae87f6c3ed6ddd7bc0d032295461920b2707ac41c94f336e2d09f4a9c71c52
    Image:          XXX
    Image ID:       XXX
    Port:           <none>
    Host Port:      <none>
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 30 Mar 2022 12:25:41 -0700
      Finished:     Wed, 30 Mar 2022 12:27:31 -0700
    Ready:          False
    Restart Count:  0
    Environment:
      RUNNER_ORG:                     XXX
      RUNNER_REPO:
      RUNNER_ENTERPRISE:
      RUNNER_LABELS:
      RUNNER_GROUP:
      DOCKER_ENABLED:                 true
      DOCKERD_IN_RUNNER:              false
      GITHUB_URL:                     https://github.com/
      RUNNER_WORKDIR:                 /runner/_work
      RUNNER_EPHEMERAL:               true
      DOCKER_HOST:                    tcp://localhost:2376
      DOCKER_TLS_VERIFY:              1
      DOCKER_CERT_PATH:               /certs/client
      RUNNER_FEATURE_FLAG_EPHEMERAL:  true
      RUNNER_NAME:                    aks-runner-dv722-8sbzn
      RUNNER_TOKEN:                   XXX
    Mounts:
      /certs/client from certs-client (ro)
      /runner from runner (rw)
      /runner/_work from work (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ptc99 (ro)
  docker:
    Container ID:   containerd://3b8cc553f154df451cba8cafed557f38fdd3d122ca7adbbf68c2905b67ef2adf
    Image:          docker:dind
    Image ID:       docker.io/library/docker@sha256:1452760def210c3cc517fb4d4cb7c7b978badc5568b150914e0b5956950c67b2
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Wed, 30 Mar 2022 12:25:41 -0700
    Ready:          True
    Restart Count:  0
    Environment:
      DOCKER_TLS_CERTDIR:  /certs
    Mounts:
      /certs/client from certs-client (rw)
      /runner from runner (rw)
      /runner/_work from work (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ptc99 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  runner:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  work:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  certs-client:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  kube-api-access-ptc99:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  2m20s  default-scheduler  Successfully assigned github/aks-runner-dv722-8sbzn to XXX
  Normal  Pulling    2m20s  kubelet            Pulling image "XXX"
  Normal  Pulled     2m19s  kubelet            Successfully pulled image "XXX" in 133.189239ms
  Normal  Created    2m19s  kubelet            Created container runner
  Normal  Started    2m19s  kubelet            Started container runner
  Normal  Pulled     2m19s  kubelet            Container image "docker:dind" already present on machine
  Normal  Created    2m19s  kubelet            Created container docker
  Normal  Started    2m19s  kubelet            Started container docker
mumoshu commented 2 years ago

the pod stays in READY: 1/2 STATUS: NotReady

This is normal, as you use ephemeral runner and an ephemeral runner exists with 0 once it completed running the first job(you can see that from the pod status for the first container in the runner pod)

for 10-20 minutes.

This seems a bit unusual, because it should clean up pods of completed runners earlier.

During this time the runner does not show up in GitHub runners list.

It's an expected behavior because the runner appears to GitHub only while it's registered to GitHub. An ephemeral runner unregisters itself after the first job run so if it already stopped with code 0 it's natural it doesn't show up in the GitHub runner list.

mumoshu commented 2 years ago

@nicholasgibson2 Can you share more logs the controller? I'm especially interested in what happened to the controller while reconciling the runnerreplicaset and runner resources that are the owners of the runner pod aks-runner-dv722-8sbzn. The runner resource should be named aks-runner-dv722-8sbzn and runnerreplicaset resource should be named aks-runner-dv722. Could you share all the logs related to those resources too?

cspargo commented 2 years ago

I had the same issue, but I had to rollback to the previous version before I could collect a lot of information.

I wonder if the controller sync-period option could have anything to do with the delay? It appears to default to 10 minutes.

I recall that between the time the runner container ended, and the time that the pod got deleted, nothing was logged for the runner. I did record the below snippet of logs for one of the runners that had the issue. In this case the runner ran a short job, (probably) for less than 1 minute, then the runner container ended and it went into NotReady status until it was removed at 06:23

2022-03-29T06:13:16Z    INFO    actions-runner-controller.runner        Created runner pod      {"runner": "actions-runner/org-xxx-eng-22s76-prbxh", "repository": ""}
2022-03-29T06:13:22Z    DEBUG   actions-runner-controller.runner        Runner appears to have been registered and running.     {"runner": "actions-runner/org-xxx-eng-22s76-prbxh", "podCreationTimestamp": "2022-03-29 06:13:16 +0000 UTC"}
2022-03-29T06:23:46Z    INFO    runner-resource validate resource to be updated {"name": "org-xxx-eng-22s76-prbxh"}
2022-03-29T06:23:46Z    INFO    actions-runner-controller.runner        Removed finalizer       {"runner": "actions-runner/org-xxx-eng-22s76-prbxh"}
2022-03-29T06:23:47Z    INFO    actions-runner-controller.runnerpod     Runner pod has been stopped with a successful status.   {"runnerpod": "actions-runner/org-xxx-eng-22s76-prbxh"}
mumoshu commented 2 years ago

@cspargo Ah good catch! The default sync-period can be the source of the issue. I've been using very short sync-period in my local test envs these days(like 10 seconds or so) because it doesn't result in spamming GitHub API calls anymore.

@nicholasgibson2 Can you confirm that NotReady runner pods eventually disappear and they disappear earlier if you shorten sync-period?

nicholasgibson2 commented 2 years ago

@mumoshu I confirmed the delay is directly affected by sync-period. After testing with several different durations the delay is consistently taking 2x whatever the duration is set as, as if it misses the first sync interval for whatever reason.

It's interesting that there is no delay at all in v0.21.0 regardless of sync-period. Any ideas on what change between v0.21.0 -> v0.22.0 would have caused this?

toast-gear commented 2 years ago

@nicholasgibson2 this isn't the fix but for debug purposes could perform the same experiments with the single container runner setup (docker within the runner container itself rather than as a sidecar) and report back the results please? https://github.com/actions-runner-controller/actions-runner-controller#runner-with-dind

Also what is your scaleDownDelaySecondsAfterScaleOut configuration for these tests?

nicholasgibson2 commented 2 years ago

@toast-gear pods with dockerdWithinRunnerContainer: true don't have the delay! I tested a couple times enabling and disabling this flag and the results are consistent. scaleDownDelaySecondsAfterScaleOut is the default value (10 minutes I believe).

toast-gear commented 2 years ago

Yeh it's as I thought would happen, it will "fix" the problem for you but it's more of a workaround than a fix. The new behaviour is more deterministic and less susceptible to race conditions however at the cost of performance, we missed testing with defaults to be honest as they're not really appropriate for live configurations.

We have some changes to do as our defaults are no longer appropriate with the new behaviour as well as tie the lifecycle of the pod together and open up the option for lower sync periods with appropriate warnings about the risk of being rate limited.

mumoshu commented 2 years ago

Thanks, everyone!

I think this is basically a regression in runner replicaset and runner controllers.

In 0.21.x runner controller had been notified for every update in the runner pod spec and status changes so regardless of the sync period it was able to tear down the completed pod as notified.

Since 0.22.x, it’s runner replicarset controller’s responsibility to sync runner pods behind the runner resources. runner replicaset controller is not configured to be notified for any updates in the runner pods that are transitive deps of runner replicasets, hence the runner pod tear down freq is dependent on the sync period. The sync period is shared across all the controllers included in the ARC’s controller-manager.

That said, I believe setting sync-period to very low value (like 10s as I’ve using successfully in my E2E test runs) would be a good workaround that works today.

A fix for this should result in code changes in maybe the runner controller and the runner status so that any important changes(like the runner container in the runner pod stopped due to completion) are notified to the runner replicaset controller via the runner controller.

mumoshu commented 2 years ago

@nicholasgibson2 I can't thank you enough for your detailed report.

Status: Running in your describe runner pod output is especially helpful, as I can see that status.phase is still Running and that validates my theory. Our current runner status only propagates phase from the runner pod so it usually looks like this:

  status:
    phase: Running
    registration:
      expiresAt: "2022-03-30T12:22:16Z"
      labels:
      - your_runner_label
      organization: your_github_org_here
      token: your_reg_token_here

The runner replicaset is notified about changes in the runner status but as it doesn't reflect the runner pod's change in status.conditions(which corresponds to NotReady if the condition whose type is Ready is set to False), the runner replicaset controller should be unable to react on NotReady runner pods until next sync period.

I'd add status.conditions (or subset of it) to the runner status and see if it resolves the issue.

stefangluszek commented 2 years ago

I have upgraded to v0.22.2 and it's still not working for me.

Status:
  Phase:  Running
  Registration:
    Expires At:  2022-04-06T07:22:01Z

and the pod status it not ready:

stefang-runner-vfqpb-6qrs5   1/2     NotReady   0          6m54s

Shouldn't I see Ready field under Status?

mumoshu commented 2 years ago

@stefangluszek Hey. Could you double-check if you did upgrade CRDs? helm upgrade doesn't upgrade CRDs, see the upgrade docs.

Harleyzheng commented 2 years ago

+1. Experiencing the same symptom. Please see screenshot:

image
Harleyzheng commented 2 years ago

Sorry. I just saw in the comments. Will try updating sync-period. Thanks

mumoshu commented 2 years ago

@Harleyzheng Hey. What's your ARC version?

Harleyzheng commented 2 years ago

it was 0.22.0

mumoshu commented 2 years ago

@Harleyzheng Could you please upgrade to 0.22.2? It's the first version that incorporates https://github.com/actions-runner-controller/actions-runner-controller/pull/1299 which has potential to fix your issue without very short sync-period.

Harleyzheng commented 2 years ago

Just did it. The issue is gone now in 0.22.2.

toast-gear commented 2 years ago

We'll close this, v0.22.2 should have resolved this problem.

stefangluszek commented 2 years ago

@stefangluszek Hey. Could you double-check if you did upgrade CRDs? helm upgrade doesn't upgrade CRDs, see the upgrade docs.

I wasn't aware of that. It works as expected after upgrading CDRs.

stanxing commented 2 years ago

I have same problem in 0.22.3

mumoshu commented 2 years ago

@stanxing Probably you missed upgrading CRDs. Otherwise try reinstalling everything from scratch and see if that fixes the issue. All the reported issues have been already resolved so we have nowhere to investigate without additional information!

stanxing commented 2 years ago

@mumoshu I installed runner controller with helm, Is the CRD resource in helm the latest?

NAME                            NAMESPACE       REVISION        UPDATED                                 STATUS          CHART                                   APP VERSION
actions-runner-controller       infra           2               2022-04-25 14:59:13.720517 +0800 CST    deployed        actions-runner-controller-0.17.3        0.22.3   
toast-gear commented 2 years ago

@stanxing see the upgrade docs for Helm

stanxing commented 2 years ago

Thanks for your reply, I reinstalled helm chart and upgraded the CRD Resource for helm and, the problem is still existed.

helm version:

actions-runner-controller       infra           1               2022-04-25 17:28:53.87602 +0800 CST     deployed        actions-runner-controller-0.17.3        0.22.3     

then I run kubectl replace -f crds/. which version is 0.17.3

runner yaml :

# runner.yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: Runner
metadata:
  name: dbt-runner
  namespace: github-runners
spec:
  repository: xxx/yyy
  env: []

After I trigged a workflow, job completed but the pod status as below:

github-runners        dbt-runner                                                            1/2     NotReady    0          5m55s   172.31.75.99    ip-172-31-77-48.ap-northeast-1.compute.internal    <none>           <none>

Controller logs:

I0425 09:29:01.088158       1 request.go:665] Waited for 1.024631337s due to client-side throttling, not priority and fairness, request: GET:https://10.100.0.1:443/apis/kibana.k8s.elastic.co/v1beta1?timeout=32s
2022-04-25T09:29:02Z    INFO    controller-runtime.metrics      Metrics server is starting to listen    {"addr": "127.0.0.1:8080"}
2022-04-25T09:29:02Z    INFO    actions-runner-controller       Initializing actions-runner-controller  {"github-api-cache-duration": "9m50s", "sync-period": "10m0s", "runner-image": "summerwind/actions-runner:latest", "docker-image": "docker:dind", "common-runnner-labels": null, "watch-namespace": ""}
2022-04-25T09:29:02Z    INFO    controller-runtime.builder      Registering a mutating webhook  {"GVK": "actions.summerwind.dev/v1alpha1, Kind=Runner", "path": "/mutate-actions-summerwind-dev-v1alpha1-runner"}
2022-04-25T09:29:02Z    INFO    controller-runtime.webhook      Registering webhook     {"path": "/mutate-actions-summerwind-dev-v1alpha1-runner"}
2022-04-25T09:29:02Z    INFO    controller-runtime.builder      Registering a validating webhook        {"GVK": "actions.summerwind.dev/v1alpha1, Kind=Runner", "path": "/validate-actions-summerwind-dev-v1alpha1-runner"}
2022-04-25T09:29:02Z    INFO    controller-runtime.webhook      Registering webhook     {"path": "/validate-actions-summerwind-dev-v1alpha1-runner"}
2022-04-25T09:29:02Z    INFO    controller-runtime.builder      Registering a mutating webhook  {"GVK": "actions.summerwind.dev/v1alpha1, Kind=RunnerDeployment", "path": "/mutate-actions-summerwind-dev-v1alpha1-runnerdeployment"}
2022-04-25T09:29:02Z    INFO    controller-runtime.webhook      Registering webhook     {"path": "/mutate-actions-summerwind-dev-v1alpha1-runnerdeployment"}
2022-04-25T09:29:02Z    INFO    controller-runtime.builder      Registering a validating webhook        {"GVK": "actions.summerwind.dev/v1alpha1, Kind=RunnerDeployment", "path": "/validate-actions-summerwind-dev-v1alpha1-runnerdeployment"}
2022-04-25T09:29:02Z    INFO    controller-runtime.webhook      Registering webhook     {"path": "/validate-actions-summerwind-dev-v1alpha1-runnerdeployment"}
2022-04-25T09:29:02Z    INFO    controller-runtime.builder      Registering a mutating webhook  {"GVK": "actions.summerwind.dev/v1alpha1, Kind=RunnerReplicaSet", "path": "/mutate-actions-summerwind-dev-v1alpha1-runnerreplicaset"}
2022-04-25T09:29:02Z    INFO    controller-runtime.webhook      Registering webhook     {"path": "/mutate-actions-summerwind-dev-v1alpha1-runnerreplicaset"}
2022-04-25T09:29:02Z    INFO    controller-runtime.builder      Registering a validating webhook        {"GVK": "actions.summerwind.dev/v1alpha1, Kind=RunnerReplicaSet", "path": "/validate-actions-summerwind-dev-v1alpha1-runnerreplicaset"}
2022-04-25T09:29:02Z    INFO    controller-runtime.webhook      Registering webhook     {"path": "/validate-actions-summerwind-dev-v1alpha1-runnerreplicaset"}
2022-04-25T09:29:02Z    INFO    controller-runtime.webhook      Registering webhook     {"path": "/mutate-runner-set-pod"}
2022-04-25T09:29:02Z    INFO    actions-runner-controller       starting manager
2022-04-25T09:29:02Z    INFO    controller-runtime.webhook.webhooks     Starting webhook server
2022-04-25T09:29:02Z    INFO    Starting server {"path": "/metrics", "kind": "metrics", "addr": "127.0.0.1:8080"}
2022-04-25T09:29:02Z    INFO    controller-runtime.certwatcher  Updated current TLS certificate
2022-04-25T09:29:02Z    INFO    controller-runtime.webhook      Serving webhook server  {"host": "", "port": 9443}
2022-04-25T09:29:02Z    INFO    controller-runtime.certwatcher  Starting certificate watcher
I0425 09:29:02.247091       1 leaderelection.go:248] attempting to acquire leader lease infra/actions-runner-controller...
I0425 09:29:20.519628       1 leaderelection.go:258] successfully acquired lease infra/actions-runner-controller
2022-04-25T09:29:20Z    DEBUG   events  Normal  {"object": {"kind":"ConfigMap","namespace":"infra","name":"actions-runner-controller","uid":"f98eefae-c262-4454-8c09-3395255724bc","apiVersion":"v1","resourceVersion":"68987636"}, "reason": "LeaderElection", "message": "actions-runner-controller-5d8f9f695c-tzpxr_364e0751-1f53-46d0-848f-e5a323e93cb4 became leader"}
2022-04-25T09:29:20Z    DEBUG   events  Normal  {"object": {"kind":"Lease","namespace":"infra","name":"actions-runner-controller","uid":"de6e0924-8b3c-4611-bf2b-1f97f1e5f9eb","apiVersion":"coordination.k8s.io/v1","resourceVersion":"68987637"}, "reason": "LeaderElection", "message": "actions-runner-controller-5d8f9f695c-tzpxr_364e0751-1f53-46d0-848f-e5a323e93cb4 became leader"}
2022-04-25T09:29:20Z    INFO    controller.runner-controller    Starting EventSource    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "Runner", "source": "kind source: *v1alpha1.Runner"}
2022-04-25T09:29:20Z    INFO    controller.runner-controller    Starting EventSource    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "Runner", "source": "kind source: *v1.Pod"}
2022-04-25T09:29:20Z    INFO    controller.runner-controller    Starting Controller     {"reconciler group": "actions.summerwind.dev", "reconciler kind": "Runner"}
2022-04-25T09:29:20Z    INFO    controller.runnerdeployment-controller  Starting EventSource    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerDeployment", "source": "kind source: *v1alpha1.RunnerDeployment"}
2022-04-25T09:29:20Z    INFO    controller.runnerreplicaset-controller  Starting EventSource    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerReplicaSet", "source": "kind source: *v1alpha1.RunnerReplicaSet"}
2022-04-25T09:29:20Z    INFO    controller.runnerdeployment-controller  Starting EventSource    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerDeployment", "source": "kind source: *v1alpha1.RunnerReplicaSet"}
2022-04-25T09:29:20Z    INFO    controller.runnerdeployment-controller  Starting Controller     {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerDeployment"}
2022-04-25T09:29:20Z    INFO    controller.runnerreplicaset-controller  Starting EventSource    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerReplicaSet", "source": "kind source: *v1alpha1.Runner"}
2022-04-25T09:29:20Z    INFO    controller.runnerreplicaset-controller  Starting Controller     {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerReplicaSet"}
2022-04-25T09:29:20Z    INFO    controller.runnerpod-controller Starting EventSource    {"reconciler group": "", "reconciler kind": "Pod", "source": "kind source: *v1.Pod"}
2022-04-25T09:29:20Z    INFO    controller.runnerpod-controller Starting Controller     {"reconciler group": "", "reconciler kind": "Pod"}
2022-04-25T09:29:20Z    INFO    controller.horizontalrunnerautoscaler-controller        Starting EventSource    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "HorizontalRunnerAutoscaler", "source": "kind source: *v1alpha1.HorizontalRunnerAutoscaler"}
2022-04-25T09:29:20Z    INFO    controller.horizontalrunnerautoscaler-controller        Starting Controller     {"reconciler group": "actions.summerwind.dev", "reconciler kind": "HorizontalRunnerAutoscaler"}
2022-04-25T09:29:20Z    INFO    controller.runnerset-controller Starting EventSource    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerSet", "source": "kind source: *v1alpha1.RunnerSet"}
2022-04-25T09:29:20Z    INFO    controller.runnerset-controller Starting EventSource    {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerSet", "source": "kind source: *v1.StatefulSet"}
2022-04-25T09:29:20Z    INFO    controller.runnerset-controller Starting Controller     {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerSet"}
2022-04-25T09:29:21Z    INFO    controller.runnerpod-controller Starting workers        {"reconciler group": "", "reconciler kind": "Pod", "worker count": 1}
2022-04-25T09:29:21Z    INFO    controller.runnerdeployment-controller  Starting workers        {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerDeployment", "worker count": 1}
2022-04-25T09:29:21Z    INFO    controller.horizontalrunnerautoscaler-controller        Starting workers        {"reconciler group": "actions.summerwind.dev", "reconciler kind": "HorizontalRunnerAutoscaler", "worker count": 1}
2022-04-25T09:29:21Z    INFO    controller.runnerset-controller Starting workers        {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerSet", "worker count": 1}
2022-04-25T09:29:21Z    INFO    controller.runnerreplicaset-controller  Starting workers        {"reconciler group": "actions.summerwind.dev", "reconciler kind": "RunnerReplicaSet", "worker count": 1}
2022-04-25T09:29:21Z    INFO    controller.runner-controller    Starting workers        {"reconciler group": "actions.summerwind.dev", "reconciler kind": "Runner", "worker count": 1}
2022-04-25T09:29:36Z    INFO    actions-runner-controller.runnerpod     Runner pod has been stopped with a successful status.   {"runnerpod": "github-runners/dbt-runner"}
2022-04-25T09:29:47Z    INFO    actions-runner-controller.runner        Updated registration token      {"runner": "dbt-runner", "repository": "datawaves-xyz/dbt_ethereum"}
2022-04-25T09:29:47Z    DEBUG   events  Normal  {"object": {"kind":"Runner","namespace":"github-runners","name":"dbt-runner","uid":"c69b97f1-ed65-497a-892e-b503f78d4fc5","apiVersion":"actions.summerwind.dev/v1alpha1","resourceVersion":"68987831"}, "reason": "RegistrationTokenUpdated", "message": "Successfully update registration token"}
2022-04-25T09:29:47Z    DEBUG   controller-runtime.webhook.webhooks     received request        {"webhook": "/mutate-runner-set-pod", "UID": "91931d99-331a-4f17-857e-c70224c4b236", "kind": "/v1, Kind=Pod", "resource": {"group":"","version":"v1","resource":"pods"}}
2022-04-25T09:29:47Z    DEBUG   controller-runtime.webhook.webhooks     wrote response  {"webhook": "/mutate-runner-set-pod", "code": 200, "reason": "", "UID": "91931d99-331a-4f17-857e-c70224c4b236", "allowed": true}
2022-04-25T09:29:47Z    INFO    actions-runner-controller.runner        Created runner pod      {"runner": "github-runners/dbt-runner", "repository": "datawaves-xyz/dbt_ethereum"}
2022-04-25T09:29:47Z    DEBUG   events  Normal  {"object": {"kind":"Runner","namespace":"github-runners","name":"dbt-runner","uid":"c69b97f1-ed65-497a-892e-b503f78d4fc5","apiVersion":"actions.summerwind.dev/v1alpha1","resourceVersion":"68987932"}, "reason": "PodCreated", "message": "Created pod 'dbt-runner'"}
2022-04-25T09:29:51Z    DEBUG   actions-runner-controller.runner        Runner appears to have been registered and running.     {"runner": "github-runners/dbt-runner", "podCreationTimestamp": "2022-04-25 09:29:47 +0000 UTC"}
2022-04-25T09:30:25Z    DEBUG   actions-runner-controller.runner        Runner appears to have been registered and running.     {"runner": "github-runners/dbt-runner", "podCreationTimestamp": "2022-04-25 09:29:47 +0000 UTC"}
2022-04-25T09:41:55Z    INFO    actions-runner-controller.runnerpod     Unregistration started before runner ID is assigned. Perhaps the runner pod was terminated by anyone other than ARC? Was it OOM killed? Marking unregistration as completed anyway because there's nothing ARC can do.   {"runnerpod": "github-runners/dbt-runner"}
2022-04-25T09:41:55Z    INFO    actions-runner-controller.runnerpod     Runner pod is marked as already unregistered.   {"runnerpod": "github-runners/dbt-runner"}
2022-04-25T09:42:06Z    DEBUG   controller-runtime.webhook.webhooks     received request        {"webhook": "/mutate-runner-set-pod", "UID": "003a3510-cb02-40c0-992f-bebb8d822f3a", "kind": "/v1, Kind=Pod", "resource": {"group":"","version":"v1","resource":"pods"}}
2022-04-25T09:42:06Z    DEBUG   controller-runtime.webhook.webhooks     wrote response  {"webhook": "/mutate-runner-set-pod", "code": 200, "reason": "", "UID": "003a3510-cb02-40c0-992f-bebb8d822f3a", "allowed": true}
2022-04-25T09:42:06Z    INFO    actions-runner-controller.runner        Created runner pod      {"runner": "github-runners/dbt-runner", "repository": "datawaves-xyz/dbt_ethereum"}
2022-04-25T09:42:06Z    DEBUG   events  Normal  {"object": {"kind":"Runner","namespace":"github-runners","name":"dbt-runner","uid":"c69b97f1-ed65-497a-892e-b503f78d4fc5","apiVersion":"actions.summerwind.dev/v1alpha1","resourceVersion":"68996126"}, "reason": "PodCreated", "message": "Created pod 'dbt-runner'"}
2022-04-25T09:42:11Z    DEBUG   actions-runner-controller.runner        Runner appears to have been registered and running.     {"runner": "github-runners/dbt-runner", "podCreationTimestamp": "2022-04-25 09:42:06 +0000 UTC"}
2022-04-25T09:46:26Z    DEBUG   actions-runner-controller.runner        Runner appears to have been registered and running.     {"runner": "github-runners/dbt-runner", "podCreationTimestamp": "2022-04-25 09:42:06 +0000 UTC"}
2022-04-25T09:50:57Z    DEBUG   controller-runtime.webhook.webhooks     received request        {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "UID": "4b961c3b-8d87-4d3a-a056-484f11869164", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
2022-04-25T09:50:57Z    DEBUG   controller-runtime.webhook.webhooks     wrote response  {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "4b961c3b-8d87-4d3a-a056-484f11869164", "allowed": true}
2022-04-25T09:50:57Z    DEBUG   controller-runtime.webhook.webhooks     received request        {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "UID": "2622bfa2-963d-4e45-8f01-d99da172c7ca", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
2022-04-25T09:50:57Z    INFO    runner-resource validate resource to be updated {"name": "dbt-runner"}
2022-04-25T09:50:57Z    DEBUG   controller-runtime.webhook.webhooks     wrote response  {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "2622bfa2-963d-4e45-8f01-d99da172c7ca", "allowed": true}
2022-04-25T09:50:57Z    INFO    actions-runner-controller.runner        Removed finalizer       {"runner": "github-runners/dbt-runner"}
2022-04-25T09:50:57Z    INFO    actions-runner-controller.runnerpod     Runner pod has been stopped with a successful status.   {"runnerpod": "github-runners/dbt-runner"}
2022-04-25T09:50:57Z    INFO    actions-runner-controller.runnerpod     Runner pod is marked as already unregistered.   {"runnerpod": "github-runners/dbt-runner"}
2022-04-25T09:53:29Z    DEBUG   controller-runtime.webhook.webhooks     received request        {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "UID": "455aaac1-f7ab-40b1-8156-73ddfa4d21e0", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
2022-04-25T09:53:29Z    DEBUG   controller-runtime.webhook.webhooks     wrote response  {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "455aaac1-f7ab-40b1-8156-73ddfa4d21e0", "allowed": true}
2022-04-25T09:53:29Z    DEBUG   controller-runtime.webhook.webhooks     received request        {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "UID": "0e2718a8-7bc7-47ad-9e7d-ff77e5029de9", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
2022-04-25T09:53:29Z    INFO    runner-resource validate resource to be created {"name": "dbt-runner"}
2022-04-25T09:53:29Z    DEBUG   controller-runtime.webhook.webhooks     wrote response  {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "0e2718a8-7bc7-47ad-9e7d-ff77e5029de9", "allowed": true}
2022-04-25T09:53:29Z    DEBUG   controller-runtime.webhook.webhooks     received request        {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "UID": "7c44988c-6e84-4752-a3b8-cd44e24324f7", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
2022-04-25T09:53:29Z    DEBUG   controller-runtime.webhook.webhooks     wrote response  {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "7c44988c-6e84-4752-a3b8-cd44e24324f7", "allowed": true}
2022-04-25T09:53:29Z    DEBUG   controller-runtime.webhook.webhooks     received request        {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "UID": "99d53b58-de7e-4394-8445-a6951efac291", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
2022-04-25T09:53:29Z    INFO    runner-resource validate resource to be updated {"name": "dbt-runner"}
2022-04-25T09:53:29Z    DEBUG   controller-runtime.webhook.webhooks     wrote response  {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "99d53b58-de7e-4394-8445-a6951efac291", "allowed": true}
2022-04-25T09:53:29Z    INFO    actions-runner-controller.runner        Updated registration token      {"runner": "dbt-runner", "repository": "datawaves-xyz/dbt_ethereum"}
2022-04-25T09:53:29Z    DEBUG   events  Normal  {"object": {"kind":"Runner","namespace":"github-runners","name":"dbt-runner","uid":"17fabcc9-da11-48c5-95f3-9d79e267d58b","apiVersion":"actions.summerwind.dev/v1alpha1","resourceVersion":"69003245"}, "reason": "RegistrationTokenUpdated", "message": "Successfully update registration token"}
2022-04-25T09:53:29Z    DEBUG   controller-runtime.webhook.webhooks     received request        {"webhook": "/mutate-runner-set-pod", "UID": "ce8bb351-abbe-4138-89fc-959b39f42155", "kind": "/v1, Kind=Pod", "resource": {"group":"","version":"v1","resource":"pods"}}
2022-04-25T09:53:29Z    DEBUG   controller-runtime.webhook.webhooks     wrote response  {"webhook": "/mutate-runner-set-pod", "code": 200, "reason": "", "UID": "ce8bb351-abbe-4138-89fc-959b39f42155", "allowed": true}
2022-04-25T09:53:29Z    INFO    actions-runner-controller.runner        Created runner pod      {"runner": "github-runners/dbt-runner", "repository": "datawaves-xyz/dbt_ethereum"}
2022-04-25T09:53:29Z    DEBUG   events  Normal  {"object": {"kind":"Runner","namespace":"github-runners","name":"dbt-runner","uid":"17fabcc9-da11-48c5-95f3-9d79e267d58b","apiVersion":"actions.summerwind.dev/v1alpha1","resourceVersion":"69003246"}, "reason": "PodCreated", "message": "Created pod 'dbt-runner'"}
2022-04-25T09:53:33Z    DEBUG   actions-runner-controller.runner        Runner appears to have been registered and running.     {"runner": "github-runners/dbt-runner", "podCreationTimestamp": "2022-04-25 09:53:29 +0000 UTC"}
2022-04-25T09:56:09Z    DEBUG   actions-runner-controller.runner        Runner appears to have been registered and running.     {"runner": "github-runners/dbt-runner", "podCreationTimestamp": "2022-04-25 09:53:29 +0000 UTC"}

Any ideas for my problem?

toast-gear commented 2 years ago

Delete the controller pod (kubectl delete pod $controller_pod_name) so it is redeployed, I don't know if the pod coming up before the CRDs were replaced will matter or not

stanxing commented 2 years ago

Sorry to bother you again. I can not resolve the problem. I found When first container exited with code 0. The second contaienr docker:dind is still running.

github-runners          dbt-runner                                                            1/2     NotReady      0          19h     172.31.75.99    ip-172-31-77-48.ap-northeast-1.compute.internal    <none>           <none>

This is some docker:dind logs:

...
time="2022-04-25T11:03:29.411671688Z" level=info msg="Loading containers: done."
time="2022-04-25T11:03:29.421987862Z" level=info msg="Docker daemon" commit=87a90dc graphdriver(s)=overlay2 version=20.10.14
time="2022-04-25T11:03:29.422181722Z" level=info msg="Daemon has completed initialization"
time="2022-04-25T11:03:29.460090246Z" level=info msg="API listen on /var/run/docker.sock"
time="2022-04-25T11:03:29.467074163Z" level=info msg="API listen on [::]:2376"

k describe pod dbt-runner -n github-runners

Name:         dbt-runner
Namespace:    github-runners
Priority:     0
Node:         ip-172-31-77-48.ap-northeast-1.compute.internal/172.31.77.48
Start Time:   Mon, 25 Apr 2022 19:03:25 +0800
Labels:       actions-runner-controller/inject-registration-token=true
              pod-template-hash=577d8bcf4
              runnerset-name=dbt-runner
Annotations:  actions-runner-controller/token-expires-at: 2022-04-25T20:03:25+08:00
              actions-runner/id: 27
              kubernetes.io/psp: eks.privileged
Status:       Running
IP:           172.31.75.99
IPs:
  IP:           172.31.75.99
Controlled By:  Runner/dbt-runner
Containers:
  runner:
    Container ID:   docker://43c33abdf2f3668744da47a1e4055e662f4fc6a71bae7ac4630eb9fc1bcbfa44
    Image:          summerwind/actions-runner:latest
    Image ID:       docker-pullable://summerwind/actions-runner@sha256:4a04348fc52ecd3401a3c6bb150d176a87640e8ecbe34459bc55db317b587410
    Port:           <none>
    Host Port:      <none>
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 25 Apr 2022 19:03:46 +0800
      Finished:     Mon, 25 Apr 2022 19:04:40 +0800
    Ready:          False
    Restart Count:  0
    Environment:
      RUNNER_ORG:                     
      RUNNER_REPO:                    datawaves-xyz/dbt_ethereum
      RUNNER_ENTERPRISE:              
      RUNNER_LABELS:                  
      RUNNER_GROUP:                   
      DOCKER_ENABLED:                 true
      DOCKERD_IN_RUNNER:              false
      GITHUB_URL:                     https://github.com/
      RUNNER_WORKDIR:                 /runner/_work
      RUNNER_EPHEMERAL:               true
      DOCKER_HOST:                    tcp://localhost:2376
      DOCKER_TLS_VERIFY:              1
      DOCKER_CERT_PATH:               /certs/client
      RUNNER_FEATURE_FLAG_EPHEMERAL:  true
      RUNNER_NAME:                    dbt-runner
      RUNNER_TOKEN:                   AFRVWRXYI5XSQEPBQVLWZ2DCM2HA2
    Mounts:
      /certs/client from certs-client (ro)
      /runner from runner (rw)
      /runner/_work from work (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-t4vm8 (ro)
  docker:
    Container ID:   docker://54947f2830bc627256ee5f6a3a42d4607da2e5a3c7fd5ac4140192a638692e10
    Image:          docker:dind
    Image ID:       docker-pullable://docker@sha256:210076c7772f47831afaf7ff200cf431c6cd191f0d0cb0805b1d9a996e99fb5e
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Mon, 25 Apr 2022 19:03:28 +0800
    Ready:          True
    Restart Count:  0
    Environment:
      DOCKER_TLS_CERTDIR:  /certs
    Mounts:
      /certs/client from certs-client (rw)
      /runner from runner (rw)
      /runner/_work from work (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-t4vm8 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  runner:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  work:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  certs-client:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  kube-api-access-t4vm8:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              eks.amazonaws.com/capacityType=ON_DEMAND
Tolerations:                 node.kubernetes.io/not-ready:NoExecute for 300s
                             node.kubernetes.io/unreachable:NoExecute for 300s
Events:                      <none>

Can anyone tell me how to debug it?

mumoshu commented 2 years ago

@stanxing Are you using Runner resource only, right? Then it seems like working as expected. It creates a runner pod that runs an ephemeral runner inside, and a Runner resource never recreate the ephemeral runner pod, because doing so result in a race condition that is intended to be fixed by ARC 0.22.x.

Other folks are using RunnerDeployment which is capable of replacing completed Runner and its corresponding runner pod with another Runner and the runner pod.

stanxing commented 2 years ago

Are you using Runner resource only, right?

Yes ! Thank you , I will test RunnerDeployment

lukens commented 2 years ago

@stanxing Are you using Runner resource only, right? Then it seems like working as expected. It creates a runner pod that runs an ephemeral runner inside, and a Runner resource never recreate the ephemeral runner pod, because doing so result in a race condition that is intended to be fixed by ARC 0.22.x.

Other folks are using RunnerDeployment which is capable of replacing completed Runner and its corresponding runner pod with another Runner and the runner pod.

This sorted it for me too, but I had spent quite a while trying to debug this. Is it maybe worth putting a warning in the README that it'll only run one job then hang in NotReady if using a Runner rather than a RunnerDeployment?