Open Fleshgrinder opened 2 years ago
@Fleshgrinder Hey!
> The `.path` file

The whole `runsvc.sh` is mostly a copy of the one from upstream, https://github.com/actions/runner/blob/main/src/Misc/layoutbin/runsvc.sh, so:
> The line that states that this is the place to customize the environment is for sure well intended
Maybe, but to be clear, that's designed by GitHub, not by us, so we'd better think about our own use-case independently.
Sure thing, I was only running down the rabbit hole. My expectation was that `/entrypoint.sh` provides the customization functionality, but it does not provide anything either.
If all you're trying to do is have some environment variables baked into your runner and loaded by the service, you just need a `.env` file at the root of the service path in the format `key=value`. The runner doesn't load other env files, specifically so you have a deterministic env.
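For illustration, such a `.env` file might look like this (the variable names are made up; only the `key=value` format matters):

```shell
# Hypothetical .env at the root of the service path.
# One key=value pair per line, no `export`, no shell expansion.
HTTP_PROXY=http://proxy.internal:3128
NO_PROXY=localhost,127.0.0.1,.internal
MY_ORG_ARTIFACT_MIRROR=https://mirror.internal/artifacts
```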
> /etc/environment

I'm not entirely sure that's the single right way for us. Does every operating system support it, and does it make sense within the context of containers? 🤔
Also, regardless of whether /etc/environment is right or not, I'd like to keep this simple so that I can maintain it with our limited time and resources. While I'd like to help as many folks as possible as an open-source enthusiast, adding anything that complicates maintenance and user support in exchange for little benefit should be avoided.
One idea: can you build a custom runner image, renaming the standard entrypoint.sh to another name, include a custom entrypoint.sh at /entrypoint.sh, and do whatever you want there? The custom entrypoint.sh could `exec` the original entrypoint.sh so that it works like the original in the end.
Or, what if we enhanced our entrypoint.sh to automatically source some bash scripts under a predefined path, so that you could add your own script that reads /etc/environment and exports every variable defined in it, or something like that?
Would it work for your use-case?
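A rough sketch of what that drop-in mechanism could look like (the directory `/etc/arc/env.d` and the function name are assumptions for illustration, not an existing ARC convention):

```shell
#!/usr/bin/env bash
# Sketch of the drop-in idea: source every *.sh under a predefined
# directory so users can extend the environment without editing
# entrypoint.sh itself. /etc/arc/env.d is an assumed path.
load_env_dir() {
  local dir="$1"
  [ -d "$dir" ] || return 0
  local f
  for f in "$dir"/*.sh; do
    [ -e "$f" ] || continue   # glob may not match anything
    # shellcheck disable=SC1090
    . "$f"
  done
}

load_env_dir "${ENV_DIR:-/etc/arc/env.d}"
```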
Ah, never mind if `.env` works. It's just that I wasn't aware of it!
@toast-gear that is the goal, yes. What is the “root of the service path”? Hopefully not /runner, since .env would not be copied over from /runnertmp. 😐
@mumoshu anything works; /etc/environment is just a standard way of doing this in some operating systems, and I picked it as a possible solution, but it was not meant to be the only one.
Changing the CMD from /entrypoint.sh to something else would be my workaround, but it would be great to have built-in functionality that does not require such drastic measures. ☺️
@Fleshgrinder Thanks!
Just confirming, but will my second idea work?
> Or, what if we enhanced our entrypoint.sh to automatically source some bash scripts under a predefined path, so that you could add your own script that reads /etc/environment and export every environment defined in it, or something like that? Would it work for your use-case?
Imo adjusting https://github.com/actions-runner-controller/actions-runner-controller/blob/1b911749a65ed6b767c2320c7eba9f34b53408e2/runner/entrypoint.sh#L83 to also copy a `.env` file, and not error if there isn't one, is probably the least confusing, least hacky fix. We aren't changing expected actions/runner behaviour then.
To generally copy dot files from /runnertmp we could simply tell Bash to do so:

```shell
shopt -s dotglob
cp -r /runnertmp/* "$RUNNER_HOME/"
shopt -u dotglob
```
Additionally, it is probably a good idea to change it further to actually use the environment variable that defines the /runnertmp path. Otherwise, having that variable is confusing, since it does not actually affect anything.

```shell
shopt -s dotglob
cp -r "$RUNNER_ASSETS_DIR"/* "$RUNNER_HOME"/
shopt -u dotglob
```
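As a sanity check that `dotglob` behaves as described, here is a small self-contained demonstration (using throwaway temp directories, nothing from the runner image):

```shell
#!/usr/bin/env bash
# Demonstration that dotglob makes * match dotfiles, which is why the
# snippet above would start copying .env and .path along with the rest.
src=$(mktemp -d); dst=$(mktemp -d)
touch "$src/regular" "$src/.env"

cp -r "$src"/* "$dst"/          # default: * skips dotfiles
[ ! -e "$dst/.env" ] && echo "without dotglob: .env not copied"

shopt -s dotglob
cp -r "$src"/* "$dst"/          # now * also matches .env
shopt -u dotglob
[ -e "$dst/.env" ] && echo "with dotglob: .env copied"
```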
Sounds reasonable to me, give it a try and raise a PR if it works. Either way let us know your results.
What I can already report back is that the content of /etc/environment is available whenever sudo is used, but not for the normal runner user. This is because sudo honors the standard loading of environment variables in accordance with the operating system's conventions. I get the feeling that the real problem is with LoadAndSetEnv of https://github.com/actions/runner, since it should honor that as well. We should still add the dotfile copying, since that is how it is currently supposed to work, but I am going to create a ticket with GitHub to clarify this.
Another thing to report is that the PATH is not set correctly for the root user, as it contains the .local/bin directory from the runner user:

```shell
$ env | grep -F PATH
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/runner/.local/bin
$ sudo env | grep -F PATH
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/runner/.local/bin
```

The :/home/runner/.local/bin should not be part of root's PATH, for obvious reasons. The reason for this seems to be here:
Although I am not entirely sure, because ${HOME} for root should not point to /home/runner but rather to /root. Plus, the DEBIAN_FRONTEND=noninteractive environment variable is also set in the Dockerfile but is only visible in the runner environment (not in the Dockerfile RUNs themselves, because the user is switched later, but in the final image):

I assume it has something to do with the .path file that is created somewhere by something; I could not pinpoint its source yet. Maybe you have an idea?
There definitely is this line that loads the .path variable and exports it, but how this can become the PATH for root is beyond me, because sudo should not inherit it.
I extended https://github.com/actions-runner-controller/actions-runner-controller/pull/1136 with loading /etc/environment the same way PAM does in Ubuntu, for all users, since that is the behavior we have in GitHub's virtual-environments; see https://github.com/actions/runner/issues/1703#issuecomment-1050292326 and the following comments for more context.
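For reference, the PAM-style loading boils down to parsing `KEY=value` lines rather than sourcing the file; a rough, simplified sketch (the function name is mine, and the quoting rules are reduced compared to what pam_env actually implements):

```shell
#!/usr/bin/env bash
# Simplified sketch of PAM-style loading of /etc/environment: the file
# is a list of KEY=value lines with no shell expansion, so it is parsed
# line by line instead of being sourced.
load_etc_environment() {
  local file="${1:-/etc/environment}" line key value
  [ -r "$file" ] || return 0
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in ''|\#*) continue ;; esac   # skip blanks and comments
    key=${line%%=*}
    value=${line#*=}
    # pam_env strips one level of surrounding double quotes
    value=${value%\"}; value=${value#\"}
    export "$key=$value"
  done < "$file"
}
```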
I could not resolve the PATH issue yet; I will dig deeper. That said, we could merge the pull request as-is and follow up with additional pull requests later to fix the other environment issues.
Another issue with the environment is that all the ARC specific environment variables are leaking into the environment of the users:
```shell
$ printenv
ACTIONS_RUNNER_CONTROLLER_GITHUB_WEBHOOK_SERVER_PORT=tcp://1.1.1.1:33080
ACTIONS_RUNNER_CONTROLLER_GITHUB_WEBHOOK_SERVER_PORT_33080_TCP=tcp://1.1.1.1:33080
ACTIONS_RUNNER_CONTROLLER_GITHUB_WEBHOOK_SERVER_PORT_33080_TCP_ADDR=1.1.1.1
ACTIONS_RUNNER_CONTROLLER_GITHUB_WEBHOOK_SERVER_PORT_33080_TCP_PORT=33080
ACTIONS_RUNNER_CONTROLLER_GITHUB_WEBHOOK_SERVER_PORT_33080_TCP_PROTO=tcp
ACTIONS_RUNNER_CONTROLLER_GITHUB_WEBHOOK_SERVER_PORT_8443_TCP=tcp://1.1.1.1:8443
ACTIONS_RUNNER_CONTROLLER_GITHUB_WEBHOOK_SERVER_PORT_8443_TCP_ADDR=1.1.1.1
ACTIONS_RUNNER_CONTROLLER_GITHUB_WEBHOOK_SERVER_PORT_8443_TCP_PORT=8443
ACTIONS_RUNNER_CONTROLLER_GITHUB_WEBHOOK_SERVER_PORT_8443_TCP_PROTO=tcp
ACTIONS_RUNNER_CONTROLLER_GITHUB_WEBHOOK_SERVER_SERVICE_HOST=1.1.1.1
ACTIONS_RUNNER_CONTROLLER_GITHUB_WEBHOOK_SERVER_SERVICE_PORT=33080
ACTIONS_RUNNER_CONTROLLER_GITHUB_WEBHOOK_SERVER_SERVICE_PORT_HTTP=33080
ACTIONS_RUNNER_CONTROLLER_GITHUB_WEBHOOK_SERVER_SERVICE_PORT_METRICS_PORT=8443
ACTIONS_RUNNER_CONTROLLER_METRICS_SERVICE_PORT=tcp://1.1.1.1:8443
ACTIONS_RUNNER_CONTROLLER_METRICS_SERVICE_PORT_8443_TCP=tcp://1.1.1.1:8443
ACTIONS_RUNNER_CONTROLLER_METRICS_SERVICE_PORT_8443_TCP_ADDR=1.1.1.1
ACTIONS_RUNNER_CONTROLLER_METRICS_SERVICE_PORT_8443_TCP_PORT=8443
ACTIONS_RUNNER_CONTROLLER_METRICS_SERVICE_PORT_8443_TCP_PROTO=tcp
ACTIONS_RUNNER_CONTROLLER_METRICS_SERVICE_SERVICE_HOST=1.1.1.1
ACTIONS_RUNNER_CONTROLLER_METRICS_SERVICE_SERVICE_PORT=8443
ACTIONS_RUNNER_CONTROLLER_METRICS_SERVICE_SERVICE_PORT_METRICS_PORT=8443
ACTIONS_RUNNER_CONTROLLER_WEBHOOK_PORT=tcp://1.1.1.1:443
ACTIONS_RUNNER_CONTROLLER_WEBHOOK_PORT_443_TCP=tcp://1.1.1.1:443
ACTIONS_RUNNER_CONTROLLER_WEBHOOK_PORT_443_TCP_ADDR=1.1.1.1
ACTIONS_RUNNER_CONTROLLER_WEBHOOK_PORT_443_TCP_PORT=443
ACTIONS_RUNNER_CONTROLLER_WEBHOOK_PORT_443_TCP_PROTO=tcp
ACTIONS_RUNNER_CONTROLLER_WEBHOOK_SERVICE_HOST=1.1.1.1
ACTIONS_RUNNER_CONTROLLER_WEBHOOK_SERVICE_PORT=443
ACTIONS_RUNNER_CONTROLLER_WEBHOOK_SERVICE_PORT_HTTPS=443
AWS_DEFAULT_REGION=redacted
AWS_REGION=redacted
AWS_ROLE_ARN=arn:aws:iam::redacted:role/redacted
AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token
CI=true
CONTROLLER_MANAGER_METRICS_SERVICE_PORT=tcp://1.1.1.1:8443
CONTROLLER_MANAGER_METRICS_SERVICE_PORT_8443_TCP=tcp://1.1.1.1:8443
CONTROLLER_MANAGER_METRICS_SERVICE_PORT_8443_TCP_ADDR=1.1.1.1
CONTROLLER_MANAGER_METRICS_SERVICE_PORT_8443_TCP_PORT=8443
CONTROLLER_MANAGER_METRICS_SERVICE_PORT_8443_TCP_PROTO=tcp
CONTROLLER_MANAGER_METRICS_SERVICE_SERVICE_HOST=1.1.1.1
CONTROLLER_MANAGER_METRICS_SERVICE_SERVICE_PORT=8443
CONTROLLER_MANAGER_METRICS_SERVICE_SERVICE_PORT_HTTPS=8443
DEBIAN_FRONTEND=noninteractive
DOCKERD_IN_RUNNER=false
DOCKER_CERT_PATH=/certs/client
DOCKER_CONFIG=/etc/docker
DOCKER_HOST=tcp://localhost:2376
DOCKER_TLS_VERIFY=1
GH_NO_UPDATE_NOTIFIER=true
GITHUB_ACTION=__run_2
GITHUB_ACTIONS=true
GITHUB_ACTION_REF=
GITHUB_ACTION_REPOSITORY=
GITHUB_ACTOR=redacted
GITHUB_API_URL=https://api.github.com/
GITHUB_BASE_REF=redacted
GITHUB_ENV=/runner/_work/_temp/_runner_file_commands/set_env_00000000-0000-0000-0000-000000000000
GITHUB_EVENT_NAME=pull_request
GITHUB_EVENT_PATH=/runner/_work/_temp/_github_workflow/event.json
GITHUB_GRAPHQL_URL=https://api.github.com/graphql
GITHUB_HEAD_REF=redacted
GITHUB_JOB=test
GITHUB_PATH=/runner/_work/_temp/_runner_file_commands/add_path_00000000-0000-0000-0000-000000000000
GITHUB_REF=refs/pull/0/merge
GITHUB_REF_NAME=0/merge
GITHUB_REF_PROTECTED=false
GITHUB_REF_TYPE=branch
GITHUB_REPOSITORY=redacted/redacted
GITHUB_REPOSITORY_OWNER=redacted
GITHUB_RETENTION_DAYS=90
GITHUB_RUN_ATTEMPT=0
GITHUB_RUN_ID=0
GITHUB_RUN_NUMBER=0
GITHUB_SERVER_URL=https://github.com/
GITHUB_SHA=0000000000000000000000000000000000000000
GITHUB_URL=https://github.com/
GITHUB_WORKFLOW=redacted
GITHUB_WORKSPACE=/runner/_work/redacted/redacted
HOME=/home/runner
HOSTNAME=redacted
ImageOS=ubuntu20
JAVA_HOME_11_X64=/usr/lib/jvm/temurin-11-jdk-amd64
JAVA_HOME_17_X64=/usr/lib/jvm/temurin-17-jdk-amd64
JAVA_HOME_8_X64=/usr/lib/jvm/temurin-8-jdk-amd64
KUBERNETES_PORT=tcp://1.1.1.1:443
KUBERNETES_PORT_443_TCP=tcp://1.1.1.1:443
KUBERNETES_PORT_443_TCP_ADDR=1.1.1.1
KUBERNETES_PORT_443_TCP_PORT=443
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_SERVICE_HOST=1.1.1.1
KUBERNETES_SERVICE_PORT=443
KUBERNETES_SERVICE_PORT_HTTPS=443
OLDPWD=/
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/runner/.local/bin
PWD=/runner/_work/redacted/redacted
RUNNER_ARCH=X64
RUNNER_ASSETS_DIR=/runnertmp
RUNNER_ENTERPRISE=
RUNNER_EPHEMERAL=true
RUNNER_FEATURE_FLAG_EPHEMERAL=true
RUNNER_GROUP=
RUNNER_LABELS=redacted
RUNNER_NAME=redacted
RUNNER_ORG=redacted
RUNNER_OS=Linux
RUNNER_TEMP=/runner/_work/_temp
RUNNER_TOOL_CACHE=/opt/hostedtoolcache
RUNNER_TRACKING_ID=github_00000000-0000-0000-0000-000000000000
RUNNER_WORKDIR=/runner/_work
RUNNER_WORKSPACE=/runner/_work/redacted
SHLVL=0
STARTUP_DELAY_IN_SECONDS=3
WEBHOOK_SERVICE_PORT=tcp://1.1.1.1:443
WEBHOOK_SERVICE_PORT_443_TCP=tcp://1.1.1.1:443
WEBHOOK_SERVICE_PORT_443_TCP_ADDR=1.1.1.1
WEBHOOK_SERVICE_PORT_443_TCP_PORT=443
WEBHOOK_SERVICE_PORT_443_TCP_PROTO=tcp
WEBHOOK_SERVICE_SERVICE_HOST=1.1.1.1
WEBHOOK_SERVICE_SERVICE_PORT=443
_=/usr/bin/env
```
I am not entirely sure which environment variables are the result of our doing (probably the AWS ones) and which ones are a result of ARC. I guess all the `ACTIONS_RUNNER_CONTROLLER_*`, `CONTROLLER_MANAGER_*`, `KUBERNETES_*`, `RUNNER_*`, `STARTUP_DELAY_IN_SECONDS`, and `WEBHOOK_SERVICE_*` are from ARC and should not be there.
@Fleshgrinder Everything listed here, except `RUNNER_*`, is populated by K8s and should be available to any container in a K8s pod. Those are not ARC-specific and are useful for standard K8s use-cases like using envvars for service discovery. Perhaps the default behavior should be to expose those envvars, although it could be a good idea to make that toggleable via a feature flag on the runner container 🤔
`RUNNER_*` are populated by ARC and should be used only from within the runner's `entrypoint.sh`. Perhaps we can `unset` (or is there a better way?) those envvars, as they shouldn't be visible to jobs?
Does anyone really desire workflows that are aware of being executed as part of ARC? I mean, it makes the workflows unportable. A toggle would be a good idea here.
`unset` is possible. The issue I see in general with environment variables, as we are using them, is that users might set environment variables with the same names as part of their workflow. This does not interfere with the runners, since user environment variables are set up later, but if software within a workflow also has an environment variable that accidentally matches one of the more generic ARC environment variables, it can lead to weird behavior. Obscuring the environment variables is a possibility here:

- instead of `DOCKERD_IN_RUNNER` we could have `__ARC_DOCKERD_IN_RUNNER`
- instead of `STARTUP_DELAY_IN_SECONDS` we could have `__ARC_STARTUP_DELAY_IN_SECONDS`
- instead of `RUNNER_*` we could have `__ARC_RUNNER_*`

This would allow us to clearly state that no environment variables starting with `__ARC_` are allowed, and within `entrypoint.sh` we could have:
```shell
while read -r var; do
  if [[ "$var" == __ARC_* ]]; then
    unset "${var%%=*}"
  fi
done < <(env)
```
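As a side note, since this is Bash anyway, the loop over `env` output can be avoided: the `${!prefix@}` expansion lists all variable names with a given prefix, which also avoids mis-parsing values that contain newlines or `=`. A small self-contained sketch (the `__ARC_DEMO_*` variables are stand-ins for real ARC vars):

```shell
#!/usr/bin/env bash
# Alternative to parsing `env` output: ${!prefix@} expands to the names
# of all variables starting with the prefix, so they can be unset
# directly and safely, whatever their values contain.
export __ARC_DEMO_ONE=1 __ARC_DEMO_TWO=2   # stand-ins for real ARC vars

for var in "${!__ARC_@}"; do
  unset "$var"
done
```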
Even better would be to pass things by arguments to the runners. This way there is nothing to `unset`, nothing to leak, and it is thus much safer in general.
> environment variables with the same names as part of their workflow

Everything we explicitly use in entrypoint.sh should not be leaked to workflows, so I thought you could just unset every known RUNNER_* var and the other runner-specific envvars we explicitly pass down from the controller, before finally exec'ing runsvc.sh in our entrypoint.sh.
Why do you need to prefix the envvars with `__ARC_`? 🤔 Your explanation makes sense, but as you've written, you intended to obfuscate our own envvars like `DOCKERD_IN_RUNNER`, `STARTUP_DELAY_IN_SECONDS`, and the other `RUNNER_*` envvars, right? That seems possible without `__ARC_`. Does it have any concrete benefit other than being very elegant?
> Does anyone really desire workflows that are aware of being executed as part of ARC? I mean, it makes the workflows unportable. A toggle would be a good idea here.
This is about the K8s-populated envvars, right? I'd expect any K8s pod container to have the standard K8s-populated envvars available, because that's the portability K8s provides. If you weigh the portability of workflows more heavily, then yes, your idea makes a lot of sense.
A toggle would be nice if we see real issues with those K8s-populated envvars.
The main reason for `__ARC_` really is the ability to reliably `unset` those variables without anyone having to remember to edit `entrypoint.sh`. It is good practice to have a common prefix for environment variables in general. We could of course also use `RUNNER_` as that prefix and achieve the same. (I originally thought that users can interact with the environment variables through their workflows, but disproved this by trying it. So there is no real reason to obfuscate.)
Well, `ACTIONS_RUNNER_CONTROLLER_*` and `CONTROLLER_MANAGER_METRICS_SERVICE_*` do not look like standard Kubernetes to me, rather specific to ARC. Anyway, a GHA workflow should not know whether it is running on Kubernetes or not. Generally speaking, at least, since there are parts where it truly makes a difference (e.g. a lot of our authentication is done automagically through AWS and Kubernetes; this would obviously break if we were to execute it anywhere other than on our ARC setup).
I have no strong feelings here regarding anything. I see positives and negatives, but we should definitely clean the environment as much as possible to ensure that we are not leaking things that truly are of no use (those you mentioned). Having a common prefix would allow us to handle them generically. As it stands right now we could have the following:
```shell
unset DOCKERD_IN_RUNNER STARTUP_DELAY_IN_SECONDS

while read -r var; do
  if [[ "$var" == RUNNER_* ]]; then
    unset "${var%%=*}"
  fi
done < <(env)
```
Note that we can change the environment variables without breaking anything:
```shell
#!/usr/bin/env bash
set -Eeuo pipefail

# region backcomp
export RUNNER_DOCKERD_IN_RUNNER=${DOCKERD_IN_RUNNER:-false}
unset DOCKERD_IN_RUNNER
export RUNNER_STARTUP_DELAY_IN_SECONDS=${STARTUP_DELAY_IN_SECONDS:-0}
unset STARTUP_DELAY_IN_SECONDS
# endregion backcomp

# ... snip ...

while read -r var; do
  if [[ "$var" == RUNNER_* ]]; then
    unset "${var%%=*}"
  fi
done < <(env)
```
With this we could give users enough time to change before we remove the backcomp code.
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
There still is the issue with the PATH that is somehow polluted for the root user. I didn't find out yet why that is.
> If all you're trying to do is have some environment variables baked into your runner and loaded by the service you just need a `.env` file at the root of the service path in the format of `key=value`. The runner doesn't load other env files specifically so you have a deterministic env.
Can you please clarify this? I've tried putting a .env file at / and /runner to no avail.
Edit: figured it out, it's /runnertmp/.env
> /runnertmp/.env

is there any documentation on this? how can we ensure this file exists? i'm using helm + terraform to deploy this on GKE...
> /runnertmp/.env
>
> is there any documentation on this? how can we ensure this file exists? i'm using helm + terraform to deploy this on GKE...
AFAIK, no. You could use an init container if your changes are simple, or use the existing runner images and build off of them, `COPY`ing in your .env file as a final step in your own Dockerfiles. We have forked the full set of runner Dockerfiles and associated files into our own source control and modify them directly, to have full control over the runners, how they're built, and what's included.
In the runner directory, create a .env file and add PATH=/usr/bin:/bin:/sbin:/usr/sbin:/usr/local/bin (add other paths too, if required), then save the file and restart the runner service. This worked for us!
We are extending the runner base image to provision additional software, similar to what GitHub does in their virtual environments. These provisioners also need to set up additional environment variables dynamically, since they install software with dynamic versions. The natural choice to do so is by adding those environment variables to /etc/environment. However, this file is never loaded by the runners.

Hence, we set out to find out what is going on, and it seems that there is no reliable way to actually do this, other than manipulating files of the base image. This is obviously brittle, since those files can change at any time and are not standardized like the files of the operating system (e.g. /etc/environment). Let me explain what we found.

The entrypoint is set to /usr/local/bin/dumb-init and the initial command is set to /entrypoint.sh. The /entrypoint.sh script copies all non-dot files from /runnertmp to /runner.[^cpr] Later it replaces the standard runsvc.sh with a patched runsvc.sh.[^patched] This file contains the following lines:

https://github.com/actions-runner-controller/actions-runner-controller/blob/0b9bef2c086a54b0731fa7aa0a6bd7cb856b48ee/runner/patched/runsvc.sh#L7-L13

The .path file can never exist (unless it would be created by some of the commands that are being called in /entrypoint.sh), since dot files are not copied over to /runner. The line that states that this is the place to customize the environment is for sure well intended, but it assumes that editing alien scripts is done by a human, which is not the case for us. Customizing it with automation is, as stated before, a bad idea, since it may break a lot of things at any point in time.

Imho it would be best to simply read /etc/environment, since that is what it is intended for, but maybe there are good reasons not to do so?