actions / actions-runner-controller

Kubernetes controller for GitHub Actions self-hosted runners
Apache License 2.0

`containerJob: kubernetes` runs into `ECONNREFUSED 127.0.0.1:8080` #2547

Open Ravio1i opened 1 year ago

Ravio1i commented 1 year ago

Checks

Controller Version

v0.27.3

Helm Chart Version

0.23.2

CertManager Version

1.11

Deployment Method

Helm

cert-manager installation

I followed the installation guide and installed cert-manager from https://cert-manager.io/docs/installation/

Resource Definitions

apiVersion: v1
kind: ServiceAccount
metadata:
  name: ${prefix}${stage}-${cloud}-${APP_NAME}-lcc
  namespace: default
automountServiceAccountToken: false
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: ${prefix}${stage}-${cloud}-${APP_NAME}-lcc
  namespace: default
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "create", "delete"]
- apiGroups: [""]
  resources: ["pods/exec"]
  verbs: ["get", "create"]
- apiGroups: [""]
  resources: ["pods/log"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["batch"]
  resources: ["jobs"]
  verbs: ["get", "list", "create", "delete"]
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get", "list", "create", "delete"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: runner-role-binding
  namespace: default
subjects:
  - kind: ServiceAccount
    name: ${prefix}${stage}-${cloud}-${APP_NAME}-lcc
roleRef:
  kind: Role
  name: runner-role
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: ${prefix}${stage}-${cloud}-${APP_NAME}-lcc
  namespace: default
spec:
  template:
    spec:
      dockerEnabled: false
      dockerdWithinRunnerContainer: false
      ephemeral: false
      containerMode: kubernetes
      organization: ${APP_NAME}
      group: ${prefix}${stage}-${cloud}-${APP_NAME}-lcc-1
      image: summerwind/actions-runner:ubuntu-22.04
      labels:
        - container
        - arc
        - container-controller
      containers:
      - name: runner
        resources:
          requests:
            cpu: "1"
            memory: "1Gi"
          limits:
            cpu: "2"
            memory: "2Gi"
        env:
        - name: http_proxy
          value: ${HTTP_PROXY}
        - name: https_proxy
          value: ${HTTP_PROXY}
        - name: no_proxy
          value: ${NO_PROXY}
        - name: ACTIONS_RUNNER_REQUIRE_JOB_CONTAINER
          value: "true"
        - name: ACTIONS_RUNNER_KUBERNETES_NAMESPACE
          value: default
      imagePullSecrets:
        - name: artifactory
      serviceAccountName: ${prefix}${stage}-${cloud}-${APP_NAME}-lcc
      workVolumeClaimTemplate:
        storageClassName: "encrypted-standard"
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi

To Reproduce

1. Run any workflow that uses the `container` syntax on the deployed RunnerDeployment
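
For illustration, a minimal workflow of the kind described in the step above might look like the sketch below. It is not the workflow from the original run: the workflow name, job name, trigger, and `ubuntu:22.04` image are assumptions, and only the runner labels are taken from the RunnerDeployment in this report. Behind a corporate proxy the image would probably have to come from an internal registry instead.

```yaml
# Hypothetical reproduction workflow (names, trigger, and image are assumptions)
name: container-job-repro
on: workflow_dispatch

jobs:
  repro:
    # Labels copied from the RunnerDeployment's `labels` list
    runs-on: [self-hosted, container, arc, container-controller]
    # Any job-level `container` should send the job through the Kubernetes container hooks
    container:
      image: ubuntu:22.04
    steps:
      - run: echo "running inside the job container"
```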

Describe the bug

`containerMode: kubernetes` does not work. Every job fails in the container hook script with `connect ECONNREFUSED 127.0.0.1:8080`:

##[debug]Evaluating condition for step: 'Stop containers'
##[debug]Evaluating: always()
##[debug]Evaluating always:
##[debug]=> true
##[debug]Result: true
##[debug]Starting: Stop containers
Run '/runner/k8s/index.js'
  shell: /runner/externals/node16/bin/node {0}
##[debug]/runner/externals/node16/bin/node /runner/k8s/index.js
Error: Error: connect ECONNREFUSED 127.0.0.1:8080
Error: Process completed with exit code 1.
Error: Executing the custom container implementation failed. Please contact your self hosted runner administrator.
##[debug]System.Exception: Executing the custom container implementation failed. Please contact your self hosted runner administrator.
##[debug] ---> System.Exception: The hook script at '/runner/k8s/index.js' running command 'CleanupJob' did not execute successfully
##[debug]   at GitHub.Runner.Worker.Container.ContainerHooks.ContainerHookManager.ExecuteHookScript[T](IExecutionContext context, HookInput input, ActionRunStage stage, String prependPath)
##[debug]   --- End of inner exception stack trace ---
##[debug]   at GitHub.Runner.Worker.Container.ContainerHooks.ContainerHookManager.ExecuteHookScript[T](IExecutionContext context, HookInput input, ActionRunStage stage, String prependPath)
##[debug]   at GitHub.Runner.Worker.Container.ContainerHooks.ContainerHookManager.CleanupJobAsync(IExecutionContext context, List`1 containers)
##[debug]   at GitHub.Runner.Worker.ContainerOperationProvider.StopContainersAsync(IExecutionContext executionContext, Object data)
##[debug]   at GitHub.Runner.Worker.JobExtensionRunner.RunAsync()
##[debug]   at GitHub.Runner.Worker.StepsRunner.RunStepAsync(IStep step, CancellationToken jobCancellationToken)
##[debug]Finishing: Stop containers

I run the actions-runner-controller behind a corporate proxy. The goal is to get `containerMode: kubernetes` running. However, every approach I have tried runs into the error above.

Describe the expected behavior

The runner should create a separate pod to run the job in.

Whole Controller Logs

https://gist.github.com/Ravio1i/7f95077937ec15a9a4f3dc10ec64f789

Whole Runner Pod Logs

https://gist.github.com/Ravio1i/de9b92bded91aa0f61aa5cd58d19da1e

Additional Context

A "normal" runner without `containerMode: kubernetes` works fine.

helm values.yml

githubEnterpriseServerURL: ***

resources:
  limits:
    cpu: 100m
    memory: 128Mi
  requests:
    cpu: 100m
    memory: 128Mi

env:
  http_proxy: ***
  https_proxy: ***
  no_proxy: "localhost,127.0.0.1,***"

github-actions[bot] commented 1 year ago

Hello! Thank you for filing an issue.

The maintainers will triage your issue shortly.

In the meantime, please take a look at the troubleshooting guide for bug reports.

If this is a feature request, please review our contribution guidelines.