Open dillon-cullinan opened 2 months ago
To get this working, there are a couple issues that had to be fixed. There is a typo in the provided chart in the docs: ash should of course be bash.
Secondly, the latest dind-rootless container has a few issues. Rolling back the docker image version to docker:24.0.6-dind-rootless solves some of them.
The other problem is the assumed socket used by docker, which is defined in the docs as --host=unix:///var/run/docker.sock. After removing this argument from the command and letting the service choose whatever socket it wants, it chose the socket based on the UID: unix:///run/user/1001/docker.sock
With these two changes, it works. Here is the working PodSpec template:
template:
  spec:
    volumes:
    - name: tmpdir
      emptyDir: {}
    - name: work
      emptyDir: {}
    - name: dind-externals
      emptyDir: {}
    - name: dind-sock
      emptyDir: {}
    - name: dind-etc
      emptyDir: {}
    - name: dind-home
      emptyDir: {}
    initContainers:
    - name: init-dind-externals
      image: ghcr.io/actions/actions-runner:latest
      command: ["cp", "-r", "-v", "/home/runner/externals/.", "/home/runner/tmpDir/"]
      volumeMounts:
      - name: dind-externals
        mountPath: /home/runner/tmpDir
    - name: init-dind-rootless
      image: docker:24.0.6-dind-rootless
      command:
      - sh
      - -c
      - |
        set -x
        cp -a /etc/. /dind-etc/
        echo 'runner:x:1001:1001:runner:/home/runner:/bin/ash' >> /dind-etc/passwd
        echo 'runner:x:1001:' >> /dind-etc/group
        echo 'runner:100000:65536' >> /dind-etc/subgid
        echo 'runner:100000:65536' >> /dind-etc/subuid
        chmod 755 /dind-etc;
        chmod u=rwx,g=rx+s,o=rx /dind-home
        chown 1001:1001 /dind-home
      securityContext:
        runAsUser: 0
      volumeMounts:
      - mountPath: /dind-etc
        name: dind-etc
      - mountPath: /dind-home
        name: dind-home
    containers:
    - name: runner
      image: ghcr.io/actions/actions-runner:latest
      command: ["/home/runner/run.sh"]
      env:
      - name: DOCKER_HOST
        value: unix:///run/user/1001/docker.sock
      volumeMounts:
      - mountPath: /tmp
        name: tmpdir
      - name: work
        mountPath: /home/runner/_work
      - name: dind-sock
        mountPath: /var/run
    - name: dind
      image: docker:24.0.6-dind-rootless
      args:
      - dockerd
      securityContext:
        privileged: true
        runAsUser: 1001
        runAsGroup: 1001
      volumeMounts:
      - name: work
        mountPath: /home/runner/_work
      - name: dind-sock
        mountPath: /var/run
      - name: dind-externals
        mountPath: /home/runner/externals
      - name: dind-etc
        mountPath: /etc
      - name: dind-home
        mountPath: /home/runner
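As a rough sketch of the socket convention described in this comment (the daemon picked a socket under /run/user/<UID>), the DOCKER_HOST value can be derived from the UID instead of being hard-coded. The function name rootless_docker_host is made up for illustration, and the path layout is assumed from the behaviour observed here rather than taken from the docs.

```shell
# Sketch only: build the rootless Docker socket URL for a given UID.
# Assumption: the daemon places its socket at /run/user/<UID>/docker.sock,
# as observed in this thread for UID 1001.
rootless_docker_host() {
  # Use the first argument if given, otherwise the current user's UID.
  uid="${1:-$(id -u)}"
  printf 'unix:///run/user/%s/docker.sock\n' "$uid"
}

# Example: export the value the runner container would need for UID 1001.
# DOCKER_HOST="$(rootless_docker_host 1001)"; export DOCKER_HOST
```

The UID passed in has to match the runAsUser of the dind container, otherwise the runner will look for a socket that is never created.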
Are you on GKE COS nodes? I was able to get things started by building an Ubuntu node pool and pinning my containers there.
edit: To add more details here, I get the following error when running on COS-based images in GKE, regardless of whether I use docker:24.0.6-dind-rootless or docker:dind-rootless.
Error Message:
time="2024-05-03T22:57:33.920775537Z" level=info msg="unable to detect if iptables supports xlock: 'iptables --wait -L -n': `iptables v1.8.10 (nf_tables): Could not fetch rule set generation id: Invalid argument`" error="exit status 4"
time="2024-05-03T22:57:33.947346735Z" level=info msg="stopping event stream following graceful shutdown" error="<nil>" module=libcontainerd namespace=moby
time="2024-05-03T22:57:33.947888475Z" level=info msg="stopping healthcheck following graceful shutdown" module=libcontainerd
time="2024-05-03T22:57:33.947924935Z" level=info msg="stopping event stream following graceful shutdown" error="context canceled" module=libcontainerd namespace=plugins.moby
failed to start daemon: Error initializing network controller: error obtaining controller instance: failed to register "bridge" driver: failed to create NAT chain DOCKER: iptables failed: iptables -t nat -N DOCKER: iptables v1.8.10 (nf_tables): Could not fetch rule set generation id: Invalid argument
(exit status 4)
[rootlesskit:child ] error: command [docker-init -- dockerd --host=unix:///socket/docker.sock] exited: exit status 1
[rootlesskit:parent] error: child exited: exit status 1
The GKE Ubuntu based OS image seems to start fine for either.
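A quick way to check whether a dind container is hitting this same failure is to search its logs for the nf_tables message quoted above. This is only a sketch: has_nftables_failure is a hypothetical helper name, and <runner-pod> in the usage comment is a placeholder.

```shell
# Sketch only: succeed (exit 0) if the log text on stdin contains the
# nf_tables error seen on the COS nodes in this thread.
has_nftables_failure() {
  grep -q 'Could not fetch rule set generation id'
}

# Hypothetical usage against the dind container of a runner pod:
#   kubectl logs <runner-pod> -c dind | has_nftables_failure \
#     && echo "dockerd failed on iptables (nf_tables)"
```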
@dillon-cullinan I also don't believe that echo 'runner:x:1001:1001:runner:/home/runner:/bin/ash' >> /dind-etc/passwd is a typo of bash. I believe this is an image without bash installed, so it should be /bin/ash. You can see that the shells in the unmodified file in the dind-rootless container are all /bin/ash:
docker run -it --rm --entrypoint /bin/sh docker:dind-rootless
/ $ cat /etc/passwd
root:x:0:0:root:/root:/bin/ash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/mail:/sbin/nologin
news:x:9:13:news:/usr/lib/news:/sbin/nologin
uucp:x:10:14:uucp:/var/spool/uucppublic:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
man:x:13:15:man:/usr/man:/sbin/nologin
postmaster:x:14:12:postmaster:/var/mail:/sbin/nologin
cron:x:16:16:cron:/var/spool/cron:/sbin/nologin
ftp:x:21:21::/var/lib/ftp:/sbin/nologin
sshd:x:22:22:sshd:/dev/null:/sbin/nologin
at:x:25:25:at:/var/spool/cron/atjobs:/sbin/nologin
squid:x:31:31:Squid:/var/cache/squid:/sbin/nologin
xfs:x:33:33:X Font Server:/etc/X11/fs:/sbin/nologin
games:x:35:35:games:/usr/games:/sbin/nologin
cyrus:x:85:12::/usr/cyrus:/sbin/nologin
vpopmail:x:89:89::/var/vpopmail:/sbin/nologin
ntp:x:123:123:NTP:/var/empty:/sbin/nologin
smmsp:x:209:209:smmsp:/var/spool/mqueue:/sbin/nologin
guest:x:405:100:guest:/dev/null:/sbin/nologin
nobody:x:65534:65534:nobody:/:/sbin/nologin
dockremap:x:100:101:Linux User,,,:/home/dockremap:/sbin/nologin
rootless:x:1000:1000:Rootless:/home/rootless:/bin/ash
/ $ ls -la /bin/ash
lrwxrwxrwx 1 root root 12 Jan 26 17:53 /bin/ash -> /bin/busybox
/ $
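The same check can be scripted. A minimal sketch, assuming a passwd-format file as input; login_shell_of is a hypothetical helper, not something shipped in the image.

```shell
# Sketch only: print the login shell (7th colon-separated field) that a
# passwd-format file records for the given user.
login_shell_of() {
  # $1 = user name, $2 = path to a passwd-format file
  awk -F: -v u="$1" '$1 == u { print $7 }' "$2"
}

# Hypothetical usage inside the dind-rootless container:
#   shell="$(login_shell_of rootless /etc/passwd)"
#   [ -x "$shell" ] || echo "login shell $shell is missing from the image"
```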
The socket problem is for sure an issue I fought with last week. I ended up putting my socket in a volume and sharing it to /var/run/docker.sock. This was mostly out of caution, as I saw this issue still open: https://github.com/actions/actions-runner-controller/issues/2519. It reports that bad things happen if your socket isn't at /var/run/docker.sock on the runner container side, and I wasn't sure if that was all fixed or not.
Are you on GKE COS nodes? I was able to get things started by building an Ubuntu node pool and pinning my containers there.
edit: To add more details here I get the following error when running on COS based images in GKE regardless of utilizing docker:24.0.6-dind-rootless or docker:dind-rootless
Error Message:
time="2024-05-03T22:57:33.920775537Z" level=info msg="unable to detect if iptables supports xlock: 'iptables --wait -L -n': `iptables v1.8.10 (nf_tables): Could not fetch rule set generation id: Invalid argument`" error="exit status 4"
time="2024-05-03T22:57:33.947346735Z" level=info msg="stopping event stream following graceful shutdown" error="<nil>" module=libcontainerd namespace=moby
time="2024-05-03T22:57:33.947888475Z" level=info msg="stopping healthcheck following graceful shutdown" module=libcontainerd
time="2024-05-03T22:57:33.947924935Z" level=info msg="stopping event stream following graceful shutdown" error="context canceled" module=libcontainerd namespace=plugins.moby
failed to start daemon: Error initializing network controller: error obtaining controller instance: failed to register "bridge" driver: failed to create NAT chain DOCKER: iptables failed: iptables -t nat -N DOCKER: iptables v1.8.10 (nf_tables): Could not fetch rule set generation id: Invalid argument
(exit status 4)
[rootlesskit:child ] error: command [docker-init -- dockerd --host=unix:///socket/docker.sock] exited: exit status 1
[rootlesskit:parent] error: child exited: exit status 1
The GKE Ubuntu based OS image seems to start fine for either.
Yes, we are using GKE COS and we have it working right now. It's interesting that you are running into issues as well despite the changes. We are using GKE version 1.28.7-gke.1026000, just in case this matters.
@dillon-cullinan I also don't believe that echo 'runner:x:1001:1001:runner:/home/runner:/bin/ash' >> /dind-etc/passwd is a typo of bash I believe this is an image without bash installed and it should be /bin/ash. You can see the unmodified file in the dind-rootless container are all /bin/ash[...]
Thank you for the correction, I've edited my previous comment.
On RunnerDeployments the setup is much easier from what I've experienced. The PodSpec has a value you set: dockerdWithinRunnerContainer: true.
For our containers we basically pulled bits and pieces from here: https://github.com/actions/actions-runner-controller/blob/master/runner/actions-runner-dind-rootless.ubuntu-20.04.dockerfile
We added the relevant lines from that Dockerfile into our custom stuff and it worked. You can probably just use this image as a base if it fits your use case.
Snippet of the runner container values:
command:
- bash
- -c
- "mkdir -p /home/runner/.docker/docker /home/runner/.local/share && ln -s /home/runner/.docker/docker /home/runner/.local/share/docker && /bin/bash /usr/bin/entrypoint-dind-rootless.sh"
securityContext:
  privileged: true
With the dockerd value set and the proper image, it all works with a singular container inside the pod, no dind container, no init containers. Much cleaner in general.
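For reference, the directory setup in that command can be written so that it is safe to run more than once. This is a sketch under assumptions, not the command actually in use: prepare_docker_dirs is a made-up name, and it swaps ln -s for ln -sfn so that a link left over from a previous run is replaced rather than followed.

```shell
# Sketch only: create the Docker data directories used by the command
# above and link them together, tolerating repeated runs.
prepare_docker_dirs() {
  # $1 = home directory of the runner user (defaults to /home/runner)
  home="${1:-/home/runner}"
  mkdir -p "$home/.docker/docker" "$home/.local/share"
  # -f replaces an existing link, -n avoids descending into the old target.
  ln -sfn "$home/.docker/docker" "$home/.local/share/docker"
}

# Hypothetical usage before starting the rootless entrypoint:
#   prepare_docker_dirs /home/runner && /bin/bash /usr/bin/entrypoint-dind-rootless.sh
```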
We are currently on 1.26 due to many many developers that won't move off deprecated API versions for a few things. I will see if we can get to 1.28 and try again.
I ended up putting my socket in a volume and sharing it to /var/run/docker.sock.
What other adjustments did you need to make to do this? I assumed simply having the emptyDir dind-sock mounted to /var/run in both containers would be enough but obviously not.
Checks
Controller Version
0.9.0
Deployment Method
Helm
To Reproduce
Describe the bug
Documentation does not work for rootless dind, and previous functionality that existed in RunnerDeployment was removed, breaking an already existing solution.
Describe the expected behavior
The dind container should exit cleanly, allowing for docker usage on the runner container.
Additional Context
Controller Logs
Runner Pod Logs