bottlerocket-os / bottlerocket

An operating system designed for hosting containers
https://bottlerocket.dev

Dockerd-rootless does not start #2887

Open Alien2150 opened 1 year ago

Alien2150 commented 1 year ago

Image I'm using: ami-003e59b5f03f35661

I am running a Helm chart that uses docker:20.10-dind-rootless. I already checked the solutions mentioned in https://github.com/bottlerocket-os/bottlerocket/issues/1569, but I still get this error when I try to run dockerd inside Bottlerocket:

exec rootlesskit \
--mtu="${DOCKERD_ROOTLESS_ROOTLESSKIT_MTU:-1500}" \
--disable-host-loopback \
--copy-up=/etc \
--copy-up=/run \
--port-driver=builtin \
--net="${DOCKERD_ROOTLESS_ROOTLESSKIT_NET:-vpnkit}" \
${DOCKERD_ROOTLESS_ROOTLESSKIT_FLAGS:-} \
"$@"

time="2023-03-13T13:23:00Z" level=warning msg="failed to mount sysfs, falling back to read-only mount: operation not permitted"
[rootlesskit:child ] error: creating tap tap0: ioctl: permission denied

Of course, I tried adjusting the capabilities:

securityContext:
  capabilities:
    add:
    - DAC_READ_SEARCH
    - NET_ADMIN
    - SYS_ADMIN
    - SYS_RESOURCE
  privileged: true

What I expected to happen:

What actually happened:

How to reproduce the problem:

jpmcb commented 1 year ago

Hi @Alien2150 - thanks for opening this issue! Is this related to using docker's rootless buildkit? You might find this issue relevant: https://github.com/bottlerocket-os/bottlerocket/issues/1934

Otherwise, there are likely SELinux policies that need to be relaxed somewhere.

How are you attempting to run this rootless Docker container? A bit more info on this would be helpful.


What variant / image / region is this? I'm not seeing it in us-east-1, us-east-2, or us-west-2

❯ aws --region us-east-2 ec2 describe-images --image-id ami-003e59b5f03f35661

An error occurred (InvalidAMIID.NotFound) when calling the DescribeImages operation: The image id '[ami-003e59b5f03f35661]' does not exist

Alien2150 commented 1 year ago

Super helpful, @jpculp. I am running EKS in eu-west-1. I will check out the Profiles. Thanks for sharing.

chlunde commented 1 year ago

@Alien2150 Did you get this to work?

Alien2150 commented 1 year ago

@Alien2150 Did you get this to work?

I have not followed up or checked recently. Did you check the related issue? Maybe it works nowadays. I will see if I can test it on my end 👀

chlunde commented 1 year ago

@Alien2150 No, I didn't manage to find any concrete suggestions there.

PS! We're also looking at opa-docker-authz, which will help with additional hardening: https://github.com/open-policy-agent/opa-docker-authz

alexnovak commented 1 month ago

Forgive the necromancy on this issue, but I think this is due to the SELinux configuration for Bottlerocket. I'm far from an SELinux expert, but I'm basing my understanding on https://github.com/containers/container-selinux/issues/104

Namely

if you open /dev/net/tun and do ioctl TUNSETIFF to get a socket for the desired tap/tun device, the kernel checks to ensure you have "relabelto" and "relabelfrom" first, and then copies the selinux label to the new socket that will be used to communicate with your existing tun/tap
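For anyone who wants to reproduce the failing call outside of rootlesskit, here is a minimal Python sketch of the sequence the quote describes: open /dev/net/tun, then issue TUNSETIFF. The constants come from `<linux/if_tun.h>` (the TUNSETIFF value shown is the one used on x86-64 and arm64); the helper names `build_ifreq` and `try_create_tap` are made up for illustration, and this is not how rootlesskit itself is implemented.

```python
import errno
import fcntl
import os
import struct

# Constants from <linux/if_tun.h> (TUNSETIFF value as on x86-64 / arm64).
TUNSETIFF = 0x400454CA
IFF_TAP = 0x0002      # request a layer-2 tap device rather than a tun device
IFF_NO_PI = 0x1000    # do not prepend the packet-information header


def build_ifreq(name: str, flags: int) -> bytes:
    """Pack a 40-byte struct ifreq: 16-byte name, 16-bit flags, zero padding."""
    return struct.pack("16sH22x", name.encode("ascii"), flags)


def try_create_tap(name: str = "tap0") -> str:
    """Attempt open + TUNSETIFF and describe the outcome (hypothetical helper)."""
    try:
        fd = os.open("/dev/net/tun", os.O_RDWR)
    except OSError as exc:
        return f"open /dev/net/tun failed: {errno.errorcode.get(exc.errno, exc.errno)}"
    try:
        fcntl.ioctl(fd, TUNSETIFF, build_ifreq(name, IFF_TAP | IFF_NO_PI))
        return f"created {name}"
    except OSError as exc:
        return f"ioctl TUNSETIFF failed: {errno.errorcode.get(exc.errno, exc.errno)}"
    finally:
        # Closing the descriptor removes the (non-persistent) device again.
        os.close(fd)


if __name__ == "__main__":
    print(try_create_tap())
```

If the denial comes from SELinux, I would expect the ioctl branch to report EACCES, which is the errno behind the "permission denied" text in the rootlesskit error above.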

Based on my read of the CIL, we establish our relabelto and relabelfrom permissions for the tun socket in the sockets relabel classmapping here.

The sockets relabel classmapping is exclusively granted to trusted_s here.

And trusted_s is granted to a small number of types, explicitly excluding control_t (https://github.com/bottlerocket-os/bottlerocket-core-kit/blob/dc8fee0f5b28c68f683a13f72f429e39ff9ca0da/packages/selinux-policy/subject.cil#L82-L84), which is the default type for privileged containers.
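A quick way to check which type a container actually ends up with is to read the process's own SELinux context from /proc/self/attr/current. The sketch below is a rough Python helper for that; the names `SELinuxContext`, `parse_context`, and `current_context` are hypothetical, and the context string in the usage note is an illustrative value, not output captured from a Bottlerocket node.

```python
from typing import NamedTuple, Optional


class SELinuxContext(NamedTuple):
    user: str
    role: str
    type_name: str
    level: str


def parse_context(raw: str) -> SELinuxContext:
    """Split 'user:role:type:level'; the level part may itself contain colons."""
    parts = raw.strip("\x00\n ").split(":", 3)
    if len(parts) < 3:
        raise ValueError(f"not an SELinux context: {raw!r}")
    level = parts[3] if len(parts) == 4 else ""
    return SELinuxContext(parts[0], parts[1], parts[2], level)


def current_context(path: str = "/proc/self/attr/current") -> Optional[SELinuxContext]:
    """Return this process's context, or None if it cannot be read or parsed."""
    try:
        with open(path, "r") as handle:
            return parse_context(handle.read())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    ctx = current_context()
    print(ctx.type_name if ctx else "no SELinux context available")
```

For example, `parse_context("system_u:system_r:control_t:s0-s0:c0.c1023").type_name` yields `"control_t"`. If a privileged pod prints that type, it would be consistent with the reading above that the process is outside trusted_s.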

So even if you were to run something as privileged - it can't create a tun/tap. Is that desired? Access to virtual network devices is pretty desirable for some workflows.