actions / actions-runner-controller

Kubernetes controller for GitHub Actions self-hosted runners
Apache License 2.0

Runner Scale Set gets stuck, crash loops every 3 seconds: Getting next message, request failed #3813

Open nieldejonghe opened 20 hours ago

nieldejonghe commented 20 hours ago

Checks

Controller Version

0.9.3

Deployment Method

Helm

To Reproduce

1. Install and configure both the controller and scale set Helm charts in an EKS cluster.
2. Observe the listener pods crash looping.

Describe the bug

The scale set is visible in the UI but appears offline. The listener pod gives the following errors:

2024-11-19T11:51:37Z ERROR listener-app Retryable client error {"error": "Get \"https://pipelinesghubeus6.actions.githubusercontent.com/Uu6gCbJC0pkBVG8hJQRVfdroi3Ll2t2r28Ye4iejANsDcEiM8m/_apis/runtime/runnerscalesets/3/messages?sessionId=62a49aab-c887-4f96-a9ec-4da8e088ab84&api-version=6.0-preview\": context canceled", "method": "GET", "url": "https://pipelinesghubeus6.actions.githubusercontent.com/Uu6gCbJC0pkBVG8hJQRVfdroi3Ll2t2r28Ye4iejANsDcEiM8m/_apis/runtime/runnerscalesets/3/messages?sessionId=62a49aab-c887-4f96-a9ec-4da8e088ab84&api-version=6.0-preview", "error": "request failed"}
github.com/actions/actions-runner-controller/github/actions.(*clientLogger).Error
    github.com/actions/actions-runner-controller/github/actions/client.go:76
github.com/hashicorp/go-retryablehttp.(*Client).Do
    github.com/hashicorp/go-retryablehttp@v0.7.7/client.go:718
github.com/hashicorp/go-retryablehttp.(*RoundTripper).RoundTrip
    github.com/hashicorp/go-retryablehttp@v0.7.7/roundtripper.go:47
net/http.send
    net/http/client.go:259
net/http.(*Client).send
    net/http/client.go:180
net/http.(*Client).do
    net/http/client.go:724
net/http.(*Client).Do
    net/http/client.go:590
github.com/actions/actions-runner-controller/github/actions.(*Client).Do
    github.com/actions/actions-runner-controller/github/actions/client.go:273
github.com/actions/actions-runner-controller/github/actions.(*Client).GetMessage
    github.com/actions/actions-runner-controller/github/actions/client.go:577
github.com/actions/actions-runner-controller/cmd/ghalistener/listener.(*Listener).getMessage
    github.com/actions/actions-runner-controller/cmd/ghalistener/listener/listener.go:272
github.com/actions/actions-runner-controller/cmd/ghalistener/listener.(*Listener).Listen
    github.com/actions/actions-runner-controller/cmd/ghalistener/listener/listener.go:163
github.com/actions/actions-runner-controller/cmd/ghalistener/app.(*App).Run.func1
    github.com/actions/actions-runner-controller/cmd/ghalistener/app/app.go:124
golang.org/x/sync/errgroup.(*Group).Go.func1
    golang.org/x/sync@v0.7.0/errgroup/errgroup.go:78
2024-11-19T11:51:37Z INFO listener-app.listener Deleting message session

Describe the expected behavior

The listener should start correctly.

Additional Context

The values.yaml used is as close to the upstream documentation as possible.

I am using a GitHub App for authentication and passing the GitHub App configuration via a pre-defined secret.

Controller Logs

https://gist.github.com/nieldejonghe/352cda282c90be9f7ea2b7e817a09a65

Runner Pod Logs

https://gist.github.com/nieldejonghe/fd51ac20366878f1f0765aa8edb92b5b

github-actions[bot] commented 20 hours ago

Hello! Thank you for filing an issue.

The maintainers will triage your issue shortly.

In the meantime, please take a look at the troubleshooting guide for bug reports.

If this is a feature request, please review our contribution guidelines.

nieldejonghe commented 17 hours ago

It seems to work when using version 0.9.1. Both 0.9.2 and 0.9.3 have the same issue as described above. FYI, I am using EKS version v1.31.0-eks-a737599.