Closed Srikanth1992 closed 2 years ago
Hi @Srikanth1992,
Issue 1: Is that a secret generated by the terraform module? I can see how it is unfortunate, but right now the easiest fix would be to generate/use another secret that doesn't contain the sequence '\u'.
Issue 2: You can see in the scale down logs that the runner is marked as orphan and therefore terminated. This would indicate the runner is not registered in GitHub. Can you check the log group with the runner logs to see if something went wrong registering the runner? By default the Cloudwatch log group would be named /github-self-hosted-runners/<environment>/runners
(otherwise check the terraform variable runner_log_files).
Let me know where to adjust the scale-down timeout interval in the Lambda function, so that the runner can wait at least 20 minutes before scale-down happens.
This can be done via the minimum_running_time_in_minutes terraform variable.
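For illustration, a minimal sketch of how that variable could be set in a module configuration (the module source and the other variable names here are placeholders/assumptions, not taken from this thread):

```hcl
module "runners" {
  source = "philips-labs/github-runner/aws"

  # Keep runners alive for at least 20 minutes before the
  # scale-down lambda is allowed to terminate them.
  minimum_running_time_in_minutes = 20

  # ... your other inputs (vpc_id, subnet_ids, github_app, etc.) ...
}
```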
Hope this helps!
Hi @gertjanmaas @npalm
Good news, I got Issue 2 fixed. I'm still having trouble with Issue 1.
Whenever I create the resources in AWS, the webhook secret comes out containing the sequence \u003c (e.g. sr7829\u003chdwiorn) and gets saved in the AWS SSM Parameter Store without the \u003c characters (replaced by > or <), which is causing a 401 error in the GitHub webhook payload console. Can you please let me know how to create a webhook secret without those letters/characters while building the resources in AWS?
When I manually update the full secret in the AWS SSM parameter for the webhook, it works.
Can you please let me know how to overcome this issue?
Thanks, Srikanth
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs. Thank you for your contributions.
Hello @npalm
I'm using the latest version, v0.18.0, and I used your terraform scripts to deploy the GitHub runners from the ubuntu example.
In our environment we need to use our own VPC and subnets, so I removed vpc.tf from the ubuntu example and hardcoded the VPC ID, subnet ID, security group ID, and GHE URLs in the modules (runners, scale up & scale down).
I configured the GitHub App to look for the check_run event.
Now when we do terraform apply, all the resources are getting created and secrets are getting stored in AWS SSM parameter store.
Issue 1:
When AWS SSM stores the webhook secret, it replaces \u003c with the special character ">".
For example, if my password is sr7829\u003chdwiorn, the secret is stored in the AWS SSM Parameter Store as sr7829>chdwiorn, and this causes a 401 error for webhook payloads.
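To see the mismatch concretely, here is a small illustration (my own sketch, not part of the module): when one side keeps \u003c as the literal six characters and the other side decodes it as a unicode escape, the two secrets no longer match, so GitHub's webhook signature check fails with a 401:

```python
import codecs

# The secret containing the literal six-character sequence "\u003c".
literal = r"sr7829\u003chdwiorn"

# The same string after unicode-escape decoding: "\u003c" becomes "<".
decoded = codecs.decode(literal, "unicode_escape")

print(literal)             # sr7829\u003chdwiorn
print(decoded)             # sr7829<hdwiorn
print(literal == decoded)  # False -> HMAC signatures computed from the
                           # two forms will differ, hence the 401
```

This is why the maintainer's suggestion of regenerating the secret without a '\u' sequence sidesteps the problem entirely.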
Issue 2:
Note: our instance needs to run our own user data first and then the user data from your scripts. The runner instance runs our user data, but it doesn't go on to run the user data specified in the launch template.
### USERDATA in my launch template

```bash
#!/bin/bash -e
exec > >(tee /var/log/user-data.log | logger -t user-data -s 2>/dev/console) 2>&1
/root/.deploy.sh
yum update -y
yum install amazon-cloudwatch-agent -y
amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c ssm:github-self-hosted-runner-cloudwatch_agent_config_runner

# Install docker
amazon-linux-extras install docker
service docker start
usermod -a -G docker ec2-user
yum install -y curl jq git

USER_NAME=ec2-user
cd /home/$USER_NAME
mkdir actions-runner && cd actions-runner
# etc etc...
```
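One common way to get two user-data scripts to both run on the same instance (rather than one replacing the other) is cloud-init's MIME multipart format, which EC2 processes part by part. A minimal sketch, with placeholder script contents (in practice these would be our bootstrap script and the user data rendered by the module's launch template):

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Placeholder scripts, for illustration only.
our_script = "#!/bin/bash -e\nyum update -y\n"
module_script = "#!/bin/bash -e\necho 'module runner setup'\n"

combined = MIMEMultipart()
for script in (our_script, module_script):
    # Each part is tagged text/x-shellscript so cloud-init executes it.
    combined.attach(MIMEText(script, "x-shellscript"))

# This string would be passed as the instance's user data.
user_data = combined.as_string()
print(user_data.count("text/x-shellscript"))  # 2
```

cloud-init runs each text/x-shellscript part in order, so both scripts execute without either being discarded.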
Whenever github action is triggered scaleup lambda is creating an instance:
Cloudwatch logs from scaleup lambda
```
2021-08-27T17:07:37.940Z 913f0f9c-2600-5ee7-b8e9-563f3803d0c6 DEBUG https://GITURL/api/v3
2021-08-27T17:07:39.678Z 913f0f9c-2600-5ee7-b8e9-563f3803d0c6 INFO Org GITHUB Repo name has 0/3 runners
2021-08-27T17:07:39.910Z 913f0f9c-2600-5ee7-b8e9-563f3803d0c6 INFO Attempting to launch instance using github-self-hosted-runner-action-runner-m5.large.
2021-08-27T17:07:39.915Z 913f0f9c-2600-5ee7-b8e9-563f3803d0c6 DEBUG Runner configuration: { "environment": "github-self-hosted-runner", "runnerServiceConfig": "--url https://GTHUBURL --token djsjshrkjhfwedwqedlehdlb --labels ubuntu,example,self-hosted --runnergroup Default", "runnerOwner": "xxxxxxx", "runnerType": "Org" }
2021-08-27T17:07:42.136Z 913f0f9c-2600-5ee7-b8e9-563f3803d0c6 INFO Created instance(s): .i-070cd607594b802cb
END RequestId: 913f0f9c-2600-5ee7-b8e9-563f3803d0c6
```
After the Instance is launched it is getting terminated immediately and cloudwatch logs are not giving much information.
Cloudwatch logs from scale down lambda
```
START RequestId: 3f809400-9144-4eea-803e-4665301e603e Version: $LATEST
2021-08-27T17:15:11.954Z 3f809400-9144-4eea-803e-4665301e603e DEBUG [createGitHubClientForRunner] Cache miss for GITHUBORGNAME
2021-08-27T17:15:12.876Z 3f809400-9144-4eea-803e-4665301e603e DEBUG https://GITURL/api/v3
2021-08-27T17:15:14.008Z 3f809400-9144-4eea-803e-4665301e603e DEBUG https://GITURL/api/v3
2021-08-27T17:15:14.354Z 3f809400-9144-4eea-803e-4665301e603e DEBUG [listGithubRunners] Cache miss for GITHUBORG
2021-08-27T17:15:14.603Z 3f809400-9144-4eea-803e-4665301e603e INFO Runner 'i-070cd607594b802cb' is orphan, and will be removed.
2021-08-27T17:15:14.894Z 3f809400-9144-4eea-803e-4665301e603e DEBUG Runner terminated.i-070cd607594b802cb
```
Please let me know if you need more information.
Let me know where to adjust the scale-down timeout interval in the Lambda function, so that the runner can wait at least 20 minutes before scale-down happens.