Open extravio opened 2 years ago
@extravio Hey! Thanks for reporting. AFAIK workflow job events had no "runner_group" info in the payload when we implemented this feature, so that we can make the autoscaling decision more correct?
Just to be extra sure, what's the status
of the workflow job event you're looking into? Is it queued
, or is it in_progress
?
Hi @mumoshu , Thanks for the super quick reply.
I guess the runner_group_name would have to be in the "workflow_job.queued" for ARC to process it correctly.
However, I thought that by setting "useRunnerGroupsVisibility = true", ARC would make the necessary API calls to check what runner groups are available to the repo and scale the appropriate runners.
@extravio Thanks! Makes sense.
BTW, have you already configured ARC's github webhook server to use the secret containing the GitHub API credentials to call the necessary API for taking the runner group visibility into account?
I thought it was a helm value githubWebhookServer.authSecret.name
or something like that. If you didn't give it a secret, it would never try to call the API to determine the runner group visibility.
Hi @mumoshu
That might be the problem. The "githubWebhookServer.authSecret.name" is set to a secret defined in the webhook configuration in GitHub. I've set the "githubWebhookServer.authSecret.github_token" to the same secret as the controller manager. This PAT has admin:enterprise - manage_runners:enterprise permissions. Does "github_token" expects the actual PAT instead of a secret reference?
authSecret:
enabled: true
create: false
name: "controller-manager"
annotations: {}
[...]
githubWebhookServer:
enabled: ${githubWebhookServer_enabled}
replicaCount: 1
useRunnerGroupsVisibility: true
secret:
enabled: true
create: false
name: "github-webhook-server"
### GitHub Webhook Configuration
# github_webhook_secret_token: ""
### GitHub Apps Configuration
## NOTE: IDs MUST be strings, use quotes
#github_app_id: ""
#github_app_installation_id: ""
#github_app_private_key: |
### GitHub PAT Configuration
github_token: "controller-manager"
I can see in the webhook pod logs that the groups are identified correctly:
Found some runner groups are managed by ARC {"event": "workflow_job", [...], "groups": "RunnerGroup{Scope:Enterprise, Kind:Custom, Name:general-dev}, RunnerGroup{Scope:Enterprise, Kind:Custom, Name:test-dev}"}
[...]
Found 1 HRAs by key {"key": "enterprises/<myenterprise>/group/general-dev"}
Found 1 HRAs by key {"key": "enterprises/<myenterprise>/group/test-dev"}
Does "github_token" expects the actual PAT instead of a secret reference?
@extravio Yes!
Hi @mumoshu , I've created a new token and hard-coded the value in the Helm chart:
githubWebhookServer:
enabled: ${githubWebhookServer_enabled}
replicaCount: 1
useRunnerGroupsVisibility: true
secret:
enabled: true
create: false
name: "github-webhook-server"
github_token: "ghp_[...]"
The token does not seem to be used and pods in both runner groups are scaled randomly:
@extravio I think you need to set githubWebhookServer.secret.create to true to let it the chart create and update the secret for you. Check the existing github-webhook-server
secret- you'll see an incorrect github_token value in it.
Sorry @mumoshu , I'm a bit confused here.
In the secret github-webhook-server
, there's a copy of the secret defined in the webhook configuration in Github
(to let github authenticate with the webhook server)
I think it's correct as if I change this token, the webhook fails.
I thought the githubWebhookServer.secret.github_*
properties were for GitHub authentication (so that the webhook server could call the GitHub API and get info about the Runner Groups).
That's what I've understood from the Runner Groups section of the documentation.
@extravio I may be still missing something, that I think we're both correct.
You do need to give github_webhook_secret_token via https://github.com/actions-runner-controller/actions-runner-controller/blob/master/charts/actions-runner-controller/values.yaml#L200 if you want to verity the webhook sender with the token generated by GitHub and visible in the Manage-webhook page.
However, you also need to provide a valid GitHub API authentication info, either PAT or GitHub App credentials, via https://github.com/actions-runner-controller/actions-runner-controller/blob/master/charts/actions-runner-controller/values.yaml#L201-L207, to enable the runner group visibility feature.
Hi @mumoshu , thanks for your help. I was not able to make it work though. I've got to move on so I've decided to use labels as a work-around for now. I'll revisit it when I have a bit more time.
@extravio I may be still missing something, that I think we're both correct.
You do need to give github_webhook_secret_token via https://github.com/actions-runner-controller/actions-runner-controller/blob/master/charts/actions-runner-controller/values.yaml#L200 if you want to verity the webhook sender with the token generated by GitHub and visible in the Manage-webhook page.
However, you also need to provide a valid GitHub API authentication info, either PAT or GitHub App credentials, via https://github.com/actions-runner-controller/actions-runner-controller/blob/master/charts/actions-runner-controller/values.yaml#L201-L207, to enable the runner group visibility feature.
what permission should I set here for this PAT?
Checks
Controller Version
v0.25.2
Helm Chart Version
0.20.2
CertManager Version
1.8
Deployment Method
Helm
cert-manager installation
correct
Checks
Resource Definitions
To Reproduce
Describe the bug
As the test organization only has access to the "test-dev" runner group, the test dev runners shoud be scaled up. Instead, the general-dev runners are scaled up: (ignore the general-win-dev runners)
Workflows are processed 1 by 1 by the single test-dev pod.
The payload of the webhook show the correct runner group
Describe the expected behavior
"test-dev" pods would scale up and workflows would be processed in parallel
Controller Logs
Runner Pod Logs
Additional Context
No response