Open MysticalMount opened 1 year ago
I dug through the error messages a bit and ended up dropping CAP_SYS_MODULE from concourse's worker/runtime/spec/capabilities.go, but then I get a slightly different error message from runc:
runc run: exit status 1: runc run failed: unable to start container process: unable to apply caps: operation not permitted
This was essentially that patch:
commit af3cebb55c01a298b69243517e72b268665b9e2b
Author: Florian Klink <flokli@flokli.de>
Date: Thu Jul 13 14:27:28 2023 +0300
worker: drop CAP_SYS_MODULE from the list of capabilities
`worker/runtime/spec/spec.go@defaultGardenOCISpec` calls out to
`OciCapabilities(privileged bool)`, returning a list of capabilities to
put in the OCI spec, which is then passed to runc.
Note this is independent of what the container payload might actually
need, it always asks for these capabilities.
This causes problems when running concourse-worker in a Talos cluster,
which does not allow asking for CAP_SYS_MODULE and CAP_SYS_BOOT
(Concourse doesn't ask for the latter):
```
concourse-worker-2 concourse-worker {"timestamp":"2023-07-13T11:17:47.529591744Z","level":"error","source":"guardian","message":"guardian.api.garden-server.create.failed","data":{"error":"runc run: exit status 1: container_linux.go:380: starting container process caused: apply caps: operation not permitted","request":{"Handle":"af712415-e9aa-4ba7-639f-b291f6e2caaf","GraceTime":0,"RootFSPath":"raw:///concourse-work-dir/volumes/live/e5bce4ac-4d45-45c5-6338-38aaaaf27e72/volume","BindMounts":[{"src_path":"/concourse-work-dir/volumes/live/1b925d7a-e33b-41dc-6f4f-9cdc701583f0/volume","dst_path":"/scratch","mode":1}],"Network":"","Privileged":true,"Limits":{"bandwidth_limits":{},"cpu_limits":{},"disk_limits":{},"memory_limits":{},"pid_limits":{}}},"session":"3.1.140807"}}
```
See https://www.talos.dev/v1.4/learn-more/process-capabilities/ for
details.
Removing that CAP from the list should get runc to successfully execute
in Talos clusters. It might cause problems for people trying to modprobe
kernel modules inside Concourse, but I hope noone does that ;-)
Signed-off-by: Florian Klink <flokli@flokli.de>
diff --git a/worker/runtime/spec/capabilities.go b/worker/runtime/spec/capabilities.go
index b38c32f4a..6443c1fb0 100644
--- a/worker/runtime/spec/capabilities.go
+++ b/worker/runtime/spec/capabilities.go
@@ -72,7 +72,6 @@ var (
 "CAP_SYS_ADMIN",
 "CAP_SYS_BOOT",
 "CAP_SYS_CHROOT",
A version of this was pushed to flokli/concourse:20230713-01.
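To illustrate what removing a capability from the requested list amounts to, here is a minimal, self-contained sketch. It is not the Concourse code: `LinuxCapabilities` is a local stand-in that only mirrors the field names of `specs.LinuxCapabilities` from the OCI runtime-spec Go bindings, `dropCapabilities` and `without` are hypothetical helpers, and the capability list in `main` is a shortened example rather than the real `PrivilegedContainerCapabilities`.

```go
package main

import "fmt"

// LinuxCapabilities is a local stand-in (assumption) mirroring the field
// names of specs.LinuxCapabilities, so the sketch needs no external modules.
type LinuxCapabilities struct {
	Bounding    []string
	Effective   []string
	Inheritable []string
	Permitted   []string
	Ambient     []string
}

// without returns a copy of caps with every name present in drop removed.
func without(caps []string, drop map[string]bool) []string {
	kept := make([]string, 0, len(caps))
	for _, c := range caps {
		if !drop[c] {
			kept = append(kept, c)
		}
	}
	return kept
}

// dropCapabilities is a hypothetical helper: it removes the named
// capabilities from every capability set that would go into the OCI spec.
func dropCapabilities(in LinuxCapabilities, names ...string) LinuxCapabilities {
	drop := make(map[string]bool, len(names))
	for _, n := range names {
		drop[n] = true
	}
	return LinuxCapabilities{
		Bounding:    without(in.Bounding, drop),
		Effective:   without(in.Effective, drop),
		Inheritable: without(in.Inheritable, drop),
		Permitted:   without(in.Permitted, drop),
		Ambient:     without(in.Ambient, drop),
	}
}

func main() {
	// Shortened example list, not the full set Concourse requests.
	all := []string{"CAP_SYS_ADMIN", "CAP_SYS_BOOT", "CAP_SYS_CHROOT", "CAP_SYS_MODULE"}
	privileged := LinuxCapabilities{Bounding: all, Effective: all, Permitted: all}

	patched := dropCapabilities(privileged, "CAP_SYS_MODULE")
	fmt.Println(patched.Bounding)
}
```

Filtering at this level keeps the privileged/unprivileged switch in `OciCapabilities` untouched and only narrows what is requested from runc.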
I went ahead and patched OciCapabilities to always return UnprivilegedContainerCapabilities, just to see how far it'd get:
commit 043babd9347f4e671e5e03f22b1a3d9065fac5bb
Author: Florian Klink <flokli@flokli.de>
Date: Thu Jul 13 15:37:16 2023 +0300
HACK
diff --git a/worker/runtime/spec/capabilities.go b/worker/runtime/spec/capabilities.go
index 6443c1fb0..2600f0001 100644
--- a/worker/runtime/spec/capabilities.go
+++ b/worker/runtime/spec/capabilities.go
@@ -3,11 +3,7 @@ package spec
import "github.com/opencontainers/runtime-spec/specs-go"
func OciCapabilities(privileged bool) specs.LinuxCapabilities {
- if !privileged {
- return UnprivilegedContainerCapabilities
- }
-
- return PrivilegedContainerCapabilities
+ return UnprivilegedContainerCapabilities
}
var (
A version of this was pushed to flokli/concourse:20230713-02.
With that, runc fails with: runc run failed: unable to start container process: can't get final child's PID from pipe: EOF
It looks like the Concourse model of running runc inside privileged pods is becoming less and less compatible with more recent, more locked-down versions of Kubernetes.
I'm not sure how much more time I'm willing to spend on trying to get this working - https://github.com/concourse/concourse/issues/5682 sounds like a more sustainable long-term solution.
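For anyone debugging the "apply caps: operation not permitted" error, one way to see which capabilities the worker process can still hand to runc is to read the `CapBnd` mask from `/proc/self/status`. The sketch below assumes the platform restriction shows up in the capability bounding set (which I have not verified for Talos here); the bit numbers 16 (CAP_SYS_MODULE) and 22 (CAP_SYS_BOOT) come from the Linux capability definitions, and `parseCapBnd` and `hasCap` are hypothetical helper names.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// Capability bit numbers as defined by the Linux kernel.
const (
	capSysModule = 16
	capSysBoot   = 22
)

// parseCapBnd extracts the CapBnd hex bitmask from the contents of a
// /proc/<pid>/status file.
func parseCapBnd(status string) (uint64, error) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 2 && fields[0] == "CapBnd:" {
			return strconv.ParseUint(fields[1], 16, 64)
		}
	}
	return 0, fmt.Errorf("CapBnd line not found")
}

// hasCap reports whether the capability with the given bit number is set.
func hasCap(mask uint64, bit uint) bool {
	return mask&(uint64(1)<<bit) != 0
}

func main() {
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Println("cannot read /proc/self/status:", err)
		return
	}
	mask, err := parseCapBnd(string(data))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("CAP_SYS_MODULE in bounding set: %v\n", hasCap(mask, capSysModule))
	fmt.Printf("CAP_SYS_BOOT in bounding set:   %v\n", hasCap(mask, capSysBoot))
}
```

Run inside the worker container, this would show whether a capability that Concourse puts in the OCI spec is missing from the bounding set of the process that launches runc.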
Hmmh, Concourse adds both CAP_SYS_BOOT and CAP_SYS_MODULE; I just got tricked by the Talos documentation getting it wrong (fixed in https://github.com/siderolabs/talos/pull/7473). I'll re-roll the first patch and see what dropping both capabilities will do:
commit 92d624adbb1c7d4e855602703f6a81387a8868d8 (HEAD)
Author: Florian Klink <flokli@flokli.de>
Date: Thu Jul 13 14:27:28 2023 +0300
worker: drop CAP_SYS_{BOOT,MODULE} from the list of capabilities
`worker/runtime/spec/spec.go@defaultGardenOCISpec` calls out to
`OciCapabilities(privileged bool)`, returning a list of capabilities to
put in the OCI spec, which is then passed to runc.
Note this is independent of what the container payload might actually
need, it always asks for these capabilities.
This causes problems when running concourse-worker in a Talos cluster,
which does not allow asking for CAP_SYS_MODULE and CAP_SYS_BOOT
(Concourse doesn't ask for the latter):
```
concourse-worker-2 concourse-worker {"timestamp":"2023-07-13T11:17:47.529591744Z","level":"error","source":"guardian","message":"guardian.api.garden-server.create.failed","data":{"error":"runc run: exit status 1: container_linux.go:380: starting container process caused: apply caps: operation not permitted","request":{"Handle":"af712415-e9aa-4ba7-639f-b291f6e2caaf","GraceTime":0,"RootFSPath":"raw:///concourse-work-dir/volumes/live/e5bce4ac-4d45-45c5-6338-38aaaaf27e72/volume","BindMounts":[{"src_path":"/concourse-work-dir/volumes/live/1b925d7a-e33b-41dc-6f4f-9cdc701583f0/volume","dst_path":"/scratch","mode":1}],"Network":"","Privileged":true,"Limits":{"bandwidth_limits":{},"cpu_limits":{},"disk_limits":{},"memory_limits":{},"pid_limits":{}}},"session":"3.1.140807"}}
```
See https://www.talos.dev/v1.4/learn-more/process-capabilities/ for
details.
Removing these capabilities from the list should get runc to
successfully execute in Talos clusters. It might cause problems for
people trying to modprobe kernel modules inside Concourse, but I hope
noone does that ;-)
Signed-off-by: Florian Klink <flokli@flokli.de>
diff --git a/worker/runtime/spec/capabilities.go b/worker/runtime/spec/capabilities.go
index b38c32f4a..9819650a4 100644
--- a/worker/runtime/spec/capabilities.go
+++ b/worker/runtime/spec/capabilities.go
@@ -70,9 +70,7 @@ var (
 "CAP_SETUID",
 "CAP_SYSLOG",
 "CAP_SYS_ADMIN",
Ok, with the new patch applied (pushed to flokli/concourse:20230713-03), which removes both of these caps from the list, and with all capabilities added in the pod spec, I get the same error: runc run failed: unable to start container process: can't get final child's PID from pipe: EOF
That smells like an incompatibility, either with the cgroup structure in Talos, or with an assumption that Docker is the outer container runtime.
https://github.com/moby/moby/issues/40835#issuecomment-663397714 suggests this might be an issue with what mountpoints are seen inside the container, or with user namespace support, even though I'm a bit unsure where runc itself is emitting that error message…
I sent a PR containing the first patch to https://github.com/concourse/concourse/pull/8791.
Describe the bug
I've deployed the workers to a privileged namespace:
Namespace: cc
On Kubernetes 1.26.1
When trying to run a hello world pipeline I get this using Guardian inside the worker pod:
I'm fairly new to Concourse, so if I'm missing something, sorry!
I can see that securityContext: privileged: true is set on the workers' statefulset - in the source YAML, and it also seems to be set in the resulting statefulset:
(I've been adding the capabilities to try to resolve the issue)
As far as I can tell the container is privileged. I am also using TalosCtl, but so far can't find anything to suggest it is Talos related.
Any steps, help, or advice on where to go next, or on what I've missed, are welcome.
Reproduction steps
Expected behavior
Expected would be the container image to pull and start successfully
Additional context
In my setup I'm using custom registries, so I expect some setup to be needed there, but I suspect we are hitting this issue before that becomes a problem.