nestybox / sysbox

An open-source, next-generation "runc" that empowers rootless containers to run workloads such as Systemd, Docker, Kubernetes, just like VMs.
Apache License 2.0

2020/10/21/gitlab-dind #206

Open utterances-bot opened 3 years ago

utterances-bot commented 3 years ago

Securing GitLab CI pipelines with Sysbox | Nestybox Blog Site

Article describing how to improve security of GitLab using Sysbox.

https://blog.nestybox.com/2020/10/21/gitlab-dind.html

candrews commented 3 years ago

GitLab appears to be waiting for someone to submit a merge request that fixes the GitLab runner issue "GitLab runner ignores the 'runtime' configuration for service containers": https://gitlab.com/gitlab-org/gitlab-runner/-/issues/27168

I'm eager to see this fixed; having sysbox work "out of the box" with GitLab is really exciting. Thank you for your work and for sharing this information!

lukasmrtvy commented 3 years ago

Is there an example for the Kubernetes GitLab runner?

ctalledo commented 3 years ago

We are currently working on integrating Kubernetes + Sysbox, to make it possible for K8s to deploy pods that are capable of running things like systemd, Docker, and K8s inside the pod itself, and do so securely (with rootless containers). We expect to have this very soon (weeks).

Once we have this, then using the GitLab K8s executor should be fairly straight-forward, assuming we can tell the GitLab K8s executor to deploy the pod using the Sysbox "runtime class". I would need to take a closer look as the devil is in the details.

smarsching commented 3 years ago

I found a workaround to make the setup from the “Setups that (currently) don’t work” section work. 🙂

The trick is to run the GitLab runner Docker container using the traditional “runc” runtime, but have the Docker containers spawned by the GitLab runner use the “sysbox-runc” runtime.

While we have to make “sysbox-runc” the default runtime used by Docker (because setting the runtime used by the GitLab runner does not work reliably as described in the article), we can override the runtime used for the GitLab runner itself.

This can be done by simply adding “--runtime runc” when creating the container. For example:

docker run -d -v /path/to/gitlab-runner-config:/etc/gitlab-runner --runtime runc gitlab/gitlab-runner:alpine-v13.10.0
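
One way to confirm that the split took effect is to compare the daemon's default runtime with the runtime of the runner container. The sketch below is only an illustration: the function name show_runtimes and the container name gitlab-runner are hypothetical (the command above does not name the container), while the docker info and docker inspect format templates are standard Docker CLI features.

```shell
#!/bin/sh
# Print the Docker daemon's default runtime and the runtime used by one
# container. The container name is supplied by the caller (hypothetical here).
show_runtimes() {
    container="$1"
    # Default runtime configured for the daemon (e.g. via daemon.json).
    default_rt="$(docker info --format '{{.DefaultRuntime}}')"
    # Runtime recorded in the container's host config at creation time.
    container_rt="$(docker inspect --format '{{.HostConfig.Runtime}}' "$container")"
    echo "default=${default_rt} container=${container_rt}"
}
```

With the workaround in place, calling show_runtimes gitlab-runner would be expected to report sysbox-runc as the default and runc for the runner container.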

ctalledo commented 3 years ago

Thanks @smarsching, that's clever! Glad you found a way to do it since it's more convenient to use the containerized gitlab runner.

networkException commented 2 years ago

I've decided to ping subscribed people on here as well, so that anyone interested can enable notifications on the merge request.

I recently created a merge request to the GitLab runner resolving the issue mentioned above; it's quite a simple change: https://gitlab.com/gitlab-org/gitlab-runner/-/merge_requests/3063

ctalledo commented 2 years ago

I recently created a merge request to the GitLab runner resolving the issue mentioned above; it's quite a simple change: https://gitlab.com/gitlab-org/gitlab-runner/-/merge_requests/3063

Very kind of you @networkException! (I suspected it was a simple change but never got the chance to do it and test it ... thanks)

networkException commented 2 years ago

The latest release 14.4.0 now includes the change :tada:

ctalledo commented 2 years ago

Thanks @networkException for bringing this to a resolution, much appreciated!

candrews commented 2 years ago

Once we have this, then using the GitLab K8s executor should be fairly straight-forward, assuming we can tell the GitLab K8s executor to deploy the pod using the Sysbox "runtime class". I would need to take a closer look as the devil is in the details.

I believe that this GitLab MR would need to be merged to do that: https://gitlab.com/gitlab-org/gitlab-runner/-/merge_requests/2326 Can you please confirm my understanding?

ctalledo commented 2 years ago

I believe that this GitLab MR would need to be merged to do that: https://gitlab.com/gitlab-org/gitlab-runner/-/merge_requests/2326 Can you please confirm my understanding?

Thanks @candrews ... looks like that MR is not moving fast unfortunately ...

himekifee commented 2 years ago

I'm having an issue with the "GitLab Runner & Docker in a System Container" setup. Runner config:

concurrent = 1
check_interval = 0

[session_server]
  session_timeout = 1800

[[runners]]
  name = "my-docker-runner"
  url = "https://git.me.asdf/"
  token = "TOKEN"
  executor = "docker"
  [runners.custom_build_dir]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.docker]
    tls_verify = false
    image = "docker:20.10.12"
    privileged = false
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/cache", "/var/lib/docker"]
    shm_size = 0
    runtime = "sysbox-runc"

CI reported

Using Docker executor with image docker:20.10.12 ...
ERROR: Preparation failed: adding cache volume: set volume permissions: running permission container "33f2c95b8eacf22bba2f8480ed1e947afacb370bc909cb527b8dfc54c550b455" for volume "runner-y5eqdgvc-project-7-concurrent-0-cache-3c3f060a0374fc8bc39395164f415a70": starting permission container: Error response from daemon: cgroups: cgroup mountpoint does not exist: unknown (linux_set.go:100:2s)

Sysbox is running. /etc/docker/daemon.json:

{
  "default-runtime": "sysbox-runc",
  "runtimes": {
    "sysbox-runc": {
      "path": "/usr/bin/sysbox-runc",
      "runtimeArgs": ["--no-kernel-check"]
    }
  }
}

Inside the container I got

root@7022cd7fae8d:/# ps -aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.0  0.0    216     4 ?        Ss   01:20   0:00 /usr/bin/dumb-init /entrypoint run --user=gitlab-runner --working-directory=/home/gitlab-runner
root          52  0.6  0.0 140640 16280 ?        Ssl  01:20   0:02 gitlab-runner run --user=gitlab-runner --working-directory=/home/gitlab-runner
root          54  1.8  0.4 2190480 71996 ?       Sl   01:20   0:07 dockerd
root          86  0.4  0.1 1634128 24136 ?       Ssl  01:20   0:01 containerd --config /var/run/docker/containerd/containerd.toml --log-level info
root         605  0.0  0.0 410112  9224 ?        Ssl  01:20   0:00 runc init
root         834  0.0  0.0 410112  9136 ?        Ssl  01:20   0:00 runc init
root        1046  0.0  0.0 410112  9264 ?        Ssl  01:20   0:00 runc init
root        1268  0.0  0.0 410112 10684 ?        Ssl  01:30   0:00 runc init
root        1482  0.0  0.0 410112 10760 ?        Ssl  01:30   0:00 runc init
root        1696  0.0  0.0 410112 10616 ?        Ssl  01:31   0:00 runc init
root        1800  0.0  0.0   4252  3376 pts/0    Ss   01:37   0:00 bash
root        1808  0.0  0.0   5896  2876 pts/0    R+   01:37   0:00 ps -aux

and

root@7022cd7fae8d:/var/log# cat dockerd.log 
time="2022-02-28T01:20:19.818754521Z" level=info msg="Starting up"
time="2022-02-28T01:20:19.859157920Z" level=info msg="libcontainerd: started new containerd process" pid=86
time="2022-02-28T01:20:19.859220455Z" level=info msg="parsed scheme: \"unix\"" module=grpc
time="2022-02-28T01:20:19.859243557Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
time="2022-02-28T01:20:19.859281116Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock 0  <nil>}] <nil>}" module=grpc
time="2022-02-28T01:20:19.859311612Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
time="2022-02-28T01:20:20.246443014Z" level=info msg="starting containerd" revision=8fba4e9a7d01810a393d5d25a3621dc101981175 version=1.3.7
time="2022-02-28T01:20:20.274563494Z" level=info msg="loading plugin \"io.containerd.content.v1.content\"..." type=io.containerd.content.v1
time="2022-02-28T01:20:20.274797787Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.btrfs\"..." type=io.containerd.snapshotter.v1
time="2022-02-28T01:20:20.275445940Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.devmapper\"..." type=io.containerd.snapshotter.v1
time="2022-02-28T01:20:20.275491508Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.devmapper" error="devmapper not configured"
time="2022-02-28T01:20:20.275518269Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.aufs\"..." type=io.containerd.snapshotter.v1
time="2022-02-28T01:20:20.275639513Z" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.aufs\"..." error="modprobe aufs failed: \"\": exec: \"modprobe\": executable file not found in $PATH: skip plugin" type=io.containerd.snapshotter.v1
time="2022-02-28T01:20:20.275677991Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.native\"..." type=io.containerd.snapshotter.v1
time="2022-02-28T01:20:20.275757916Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.overlayfs\"..." type=io.containerd.snapshotter.v1
time="2022-02-28T01:20:20.275972875Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.zfs\"..." type=io.containerd.snapshotter.v1
time="2022-02-28T01:20:20.276260547Z" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.zfs\"..." error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
time="2022-02-28T01:20:20.276298811Z" level=info msg="loading plugin \"io.containerd.metadata.v1.bolt\"..." type=io.containerd.metadata.v1
time="2022-02-28T01:20:20.276374799Z" level=warning msg="could not use snapshotter devmapper in metadata plugin" error="devmapper not configured"
time="2022-02-28T01:20:20.276415778Z" level=info msg="metadata content store policy set" policy=shared
time="2022-02-28T01:20:20.358517881Z" level=info msg="loading plugin \"io.containerd.differ.v1.walking\"..." type=io.containerd.differ.v1
time="2022-02-28T01:20:20.358570615Z" level=info msg="loading plugin \"io.containerd.gc.v1.scheduler\"..." type=io.containerd.gc.v1
time="2022-02-28T01:20:20.358638176Z" level=info msg="loading plugin \"io.containerd.service.v1.containers-service\"..." type=io.containerd.service.v1
time="2022-02-28T01:20:20.358672303Z" level=info msg="loading plugin \"io.containerd.service.v1.content-service\"..." type=io.containerd.service.v1
time="2022-02-28T01:20:20.358698383Z" level=info msg="loading plugin \"io.containerd.service.v1.diff-service\"..." type=io.containerd.service.v1
time="2022-02-28T01:20:20.358724040Z" level=info msg="loading plugin \"io.containerd.service.v1.images-service\"..." type=io.containerd.service.v1
time="2022-02-28T01:20:20.358762270Z" level=info msg="loading plugin \"io.containerd.service.v1.leases-service\"..." type=io.containerd.service.v1
time="2022-02-28T01:20:20.358793181Z" level=info msg="loading plugin \"io.containerd.service.v1.namespaces-service\"..." type=io.containerd.service.v1
time="2022-02-28T01:20:20.358833557Z" level=info msg="loading plugin \"io.containerd.service.v1.snapshots-service\"..." type=io.containerd.service.v1
time="2022-02-28T01:20:20.358863334Z" level=info msg="loading plugin \"io.containerd.runtime.v1.linux\"..." type=io.containerd.runtime.v1
time="2022-02-28T01:20:20.359072826Z" level=info msg="loading plugin \"io.containerd.runtime.v2.task\"..." type=io.containerd.runtime.v2
time="2022-02-28T01:20:20.359263802Z" level=info msg="loading plugin \"io.containerd.monitor.v1.cgroups\"..." type=io.containerd.monitor.v1
time="2022-02-28T01:20:20.359703431Z" level=info msg="loading plugin \"io.containerd.service.v1.tasks-service\"..." type=io.containerd.service.v1
time="2022-02-28T01:20:20.359762373Z" level=info msg="loading plugin \"io.containerd.internal.v1.restart\"..." type=io.containerd.internal.v1
time="2022-02-28T01:20:20.359847512Z" level=info msg="loading plugin \"io.containerd.grpc.v1.containers\"..." type=io.containerd.grpc.v1
time="2022-02-28T01:20:20.359882761Z" level=info msg="loading plugin \"io.containerd.grpc.v1.content\"..." type=io.containerd.grpc.v1
time="2022-02-28T01:20:20.359908481Z" level=info msg="loading plugin \"io.containerd.grpc.v1.diff\"..." type=io.containerd.grpc.v1
time="2022-02-28T01:20:20.359932453Z" level=info msg="loading plugin \"io.containerd.grpc.v1.events\"..." type=io.containerd.grpc.v1
time="2022-02-28T01:20:20.359955078Z" level=info msg="loading plugin \"io.containerd.grpc.v1.healthcheck\"..." type=io.containerd.grpc.v1
time="2022-02-28T01:20:20.359981907Z" level=info msg="loading plugin \"io.containerd.grpc.v1.images\"..." type=io.containerd.grpc.v1
time="2022-02-28T01:20:20.360006443Z" level=info msg="loading plugin \"io.containerd.grpc.v1.leases\"..." type=io.containerd.grpc.v1
time="2022-02-28T01:20:20.360069419Z" level=info msg="loading plugin \"io.containerd.grpc.v1.namespaces\"..." type=io.containerd.grpc.v1
time="2022-02-28T01:20:20.360114914Z" level=info msg="loading plugin \"io.containerd.internal.v1.opt\"..." type=io.containerd.internal.v1
time="2022-02-28T01:20:20.360302141Z" level=info msg="loading plugin \"io.containerd.grpc.v1.snapshots\"..." type=io.containerd.grpc.v1
time="2022-02-28T01:20:20.360341458Z" level=info msg="loading plugin \"io.containerd.grpc.v1.tasks\"..." type=io.containerd.grpc.v1
time="2022-02-28T01:20:20.360367243Z" level=info msg="loading plugin \"io.containerd.grpc.v1.version\"..." type=io.containerd.grpc.v1
time="2022-02-28T01:20:20.360388960Z" level=info msg="loading plugin \"io.containerd.grpc.v1.introspection\"..." type=io.containerd.grpc.v1
time="2022-02-28T01:20:20.361831454Z" level=info msg=serving... address=/var/run/docker/containerd/containerd-debug.sock
time="2022-02-28T01:20:20.362178600Z" level=info msg=serving... address=/var/run/docker/containerd/containerd.sock.ttrpc
time="2022-02-28T01:20:20.362454885Z" level=info msg=serving... address=/var/run/docker/containerd/containerd.sock
time="2022-02-28T01:20:20.362501968Z" level=info msg="containerd successfully booted in 0.117422s"
time="2022-02-28T01:20:20.488434843Z" level=info msg="parsed scheme: \"unix\"" module=grpc
time="2022-02-28T01:20:20.489927253Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
time="2022-02-28T01:20:20.492620071Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock 0  <nil>}] <nil>}" module=grpc
time="2022-02-28T01:20:20.492699880Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
time="2022-02-28T01:20:20.494356705Z" level=info msg="parsed scheme: \"unix\"" module=grpc
time="2022-02-28T01:20:20.494393053Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
time="2022-02-28T01:20:20.494421326Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock 0  <nil>}] <nil>}" module=grpc
time="2022-02-28T01:20:20.494465278Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
time="2022-02-28T01:20:20.723517826Z" level=warning msg="Your kernel does not support swap memory limit"
time="2022-02-28T01:20:20.723586602Z" level=warning msg="Your kernel does not support memory reservation"
time="2022-02-28T01:20:20.723610944Z" level=warning msg="Your kernel does not support oom control"
time="2022-02-28T01:20:20.723629769Z" level=warning msg="Your kernel does not support memory swappiness"
time="2022-02-28T01:20:20.723647351Z" level=warning msg="Your kernel does not support kernel memory limit"
time="2022-02-28T01:20:20.723664270Z" level=warning msg="Your kernel does not support kernel memory TCP limit"
time="2022-02-28T01:20:20.723680605Z" level=warning msg="Your kernel does not support cgroup cpu shares"
time="2022-02-28T01:20:20.723696804Z" level=warning msg="Your kernel does not support cgroup cfs period"
time="2022-02-28T01:20:20.723713182Z" level=warning msg="Your kernel does not support cgroup cfs quotas"
time="2022-02-28T01:20:20.723729778Z" level=warning msg="Your kernel does not support cgroup rt period"
time="2022-02-28T01:20:20.723746838Z" level=warning msg="Your kernel does not support cgroup rt runtime"
time="2022-02-28T01:20:20.723763458Z" level=warning msg="Unable to find blkio cgroup in mounts"
time="2022-02-28T01:20:20.724088492Z" level=info msg="Loading containers: start."
time="2022-02-28T01:20:21.593906973Z" level=info msg="Default bridge (docker0) is assigned with an IP address 172.18.0.0/16. Daemon option --bip can be used to set a preferred IP address"
time="2022-02-28T01:20:21.952929712Z" level=info msg="Loading containers: done."
time="2022-02-28T01:20:22.166973468Z" level=info msg="Docker daemon" commit=4484c46d9d graphdriver(s)=btrfs version=19.03.13
time="2022-02-28T01:20:22.167229640Z" level=info msg="Daemon has completed initialization"
time="2022-02-28T01:20:22.535504251Z" level=info msg="API listen on /var/run/docker.sock"
time="2022-02-28T01:20:22.647414125Z" level=error msg="Handler for POST /v1.25/images/create returned error: exec: \"xz\": executable file not found in $PATH"
time="2022-02-28T01:20:41.320195242Z" level=info msg="shim containerd-shim started" address=/containerd-shim/5759075674e435004a4daa154d4d622830024f65fc88f27f668380072e9cb4e4.sock debug=false pid=588
time="2022-02-28T01:20:42.106275598Z" level=info msg="shim reaped" id=33f65322b7fef822803f823edd4832994471a3fb96b01a134e4d55df3d42f88a
time="2022-02-28T01:20:42.445521731Z" level=error msg="33f65322b7fef822803f823edd4832994471a3fb96b01a134e4d55df3d42f88a cleanup: failed to delete container from containerd: no such container"
time="2022-02-28T01:20:42.445606722Z" level=error msg="Handler for POST /v1.25/containers/33f65322b7fef822803f823edd4832994471a3fb96b01a134e4d55df3d42f88a/start returned error: cgroups: cgroup mountpoint does not exist: unknown"
time="2022-02-28T01:20:48.704942066Z" level=info msg="shim containerd-shim started" address=/containerd-shim/2f332b46911b581793eb49be516ae7d88ffb66dcc35eb2bc91ba66d79dda4139.sock debug=false pid=817
time="2022-02-28T01:20:49.299950798Z" level=info msg="shim reaped" id=9004e4244d5101163b10f38c6f5982b66c58e870dd3006b5a7449503fc424534
time="2022-02-28T01:20:49.829250435Z" level=error msg="9004e4244d5101163b10f38c6f5982b66c58e870dd3006b5a7449503fc424534 cleanup: failed to delete container from containerd: no such container"
time="2022-02-28T01:20:49.829321986Z" level=error msg="Handler for POST /v1.25/containers/9004e4244d5101163b10f38c6f5982b66c58e870dd3006b5a7449503fc424534/start returned error: cgroups: cgroup mountpoint does not exist: unknown"
time="2022-02-28T01:20:56.057441646Z" level=info msg="shim containerd-shim started" address=/containerd-shim/8bb5cf11abecd3458433532cde6b448750be1132fc4790e21cb88e3fda551fe0.sock debug=false pid=1029
time="2022-02-28T01:20:56.619058647Z" level=info msg="shim reaped" id=dd1d9ea5e572f070c2f72e3e4dba907908cd9f3c0bf75991372cc542593390fb
time="2022-02-28T01:20:56.929066353Z" level=error msg="dd1d9ea5e572f070c2f72e3e4dba907908cd9f3c0bf75991372cc542593390fb cleanup: failed to delete container from containerd: no such container"
time="2022-02-28T01:20:56.929145357Z" level=error msg="Handler for POST /v1.25/containers/dd1d9ea5e572f070c2f72e3e4dba907908cd9f3c0bf75991372cc542593390fb/start returned error: cgroups: cgroup mountpoint does not exist: unknown"
time="2022-02-28T01:30:52.457780916Z" level=info msg="shim containerd-shim started" address=/containerd-shim/12a3d207e4f8ad82b0d43e2a8953f1e1bda9c23d2611cbe0c210a65f8852acd8.sock debug=false pid=1251
time="2022-02-28T01:30:53.925225856Z" level=info msg="shim reaped" id=33f2c95b8eacf22bba2f8480ed1e947afacb370bc909cb527b8dfc54c550b455
time="2022-02-28T01:30:54.246293849Z" level=error msg="33f2c95b8eacf22bba2f8480ed1e947afacb370bc909cb527b8dfc54c550b455 cleanup: failed to delete container from containerd: no such container"
time="2022-02-28T01:30:54.246387704Z" level=error msg="Handler for POST /v1.25/containers/33f2c95b8eacf22bba2f8480ed1e947afacb370bc909cb527b8dfc54c550b455/start returned error: cgroups: cgroup mountpoint does not exist: unknown"
time="2022-02-28T01:30:59.179262838Z" level=info msg="shim containerd-shim started" address=/containerd-shim/e3c0b990725d394dd9298339f5236f82fb8057dbd5006ef67ce12582b5dc3d85.sock debug=false pid=1465
time="2022-02-28T01:30:59.845653311Z" level=info msg="shim reaped" id=12ea94bba53edb145abc1ac5bcc4fc60e82bb13b8f889cbafd2911f5bfaa6b8f
time="2022-02-28T01:31:00.391769662Z" level=error msg="12ea94bba53edb145abc1ac5bcc4fc60e82bb13b8f889cbafd2911f5bfaa6b8f cleanup: failed to delete container from containerd: no such container"
time="2022-02-28T01:31:00.391838217Z" level=error msg="Handler for POST /v1.25/containers/12ea94bba53edb145abc1ac5bcc4fc60e82bb13b8f889cbafd2911f5bfaa6b8f/start returned error: cgroups: cgroup mountpoint does not exist: unknown"
time="2022-02-28T01:31:04.798290638Z" level=info msg="shim containerd-shim started" address=/containerd-shim/49e4c282f323a79ec0f123273118a81934cb0895365ac95410fbc0c15b68c76b.sock debug=false pid=1679
time="2022-02-28T01:31:05.376424965Z" level=info msg="shim reaped" id=63fd5d823567c8d4d6598f2790ba3be0ded3d81a633e387ee20640d5851cfb3a
time="2022-02-28T01:31:06.028669401Z" level=error msg="63fd5d823567c8d4d6598f2790ba3be0ded3d81a633e387ee20640d5851cfb3a cleanup: failed to delete container from containerd: no such container"
time="2022-02-28T01:31:06.028736218Z" level=error msg="Handler for POST /v1.25/containers/63fd5d823567c8d4d6598f2790ba3be0ded3d81a633e387ee20640d5851cfb3a/start returned error: cgroups: cgroup mountpoint does not exist: unknown"

Did I get anything wrong here? Thanks.

ctalledo commented 2 years ago

Hi @himekifee, thanks for giving Sysbox a shot. On a quick pass, I would suspect some issue with the host kernel regarding cgroups. What host distro / kernel are you using? (e.g., cat /etc/os-release and uname -a).

himekifee commented 2 years ago

Hi, it is an up-to-date Arch Linux installation with kernel Linux 5.16.11-arch1-1 #1 SMP PREEMPT Thu, 24 Feb 2022 02:18:20 +0000 x86_64 GNU/Linux

ctalledo commented 2 years ago

Thanks @himekifee. Sysbox is not officially supported on ArchLinux yet, but happy to help you figure out what's going on.

Inside the Sysbox container (i.e., where the GitLab runner agent is running in your scenario), how does the cgroup file hierarchy look (i.e., tree -L 1 /sys/fs/cgroup)?

I ask because whatever is emitting the error seems to think there is a problem with it: Error response from daemon: cgroups: cgroup mountpoint does not exist: unknown. It appears that the GitLab Docker Executor is invoking Docker, and the latter is reporting the error.

Also, what happens if you docker exec into that Sysbox container and launch a Docker container manually (e.g., docker run -it alpine)?

himekifee commented 2 years ago

root@bc4e3e71a570:/# tree -L 1 /sys/fs/cgroup
/sys/fs/cgroup
|-- cgroup.controllers
|-- cgroup.events
|-- cgroup.freeze
|-- cgroup.kill
|-- cgroup.max.depth
|-- cgroup.max.descendants
|-- cgroup.procs
|-- cgroup.stat
|-- cgroup.subtree_control
|-- cgroup.threads
|-- cgroup.type
|-- cpu.idle
|-- cpu.max
|-- cpu.max.burst
|-- cpu.pressure
|-- cpu.stat
|-- cpu.uclamp.max
|-- cpu.uclamp.min
|-- cpu.weight
|-- cpu.weight.nice
|-- cpuset.cpus
|-- cpuset.cpus.effective
|-- cpuset.cpus.partition
|-- cpuset.mems
|-- cpuset.mems.effective
|-- hugetlb.2MB.current
|-- hugetlb.2MB.events
|-- hugetlb.2MB.events.local
|-- hugetlb.2MB.max
|-- hugetlb.2MB.rsvd.current
|-- hugetlb.2MB.rsvd.max
|-- init.scope
|-- io.bfq.weight
|-- io.latency
|-- io.low
|-- io.max
|-- io.pressure
|-- io.prio.class
|-- io.stat
|-- io.weight
|-- memory.current
|-- memory.events
|-- memory.events.local
|-- memory.high
|-- memory.low
|-- memory.max
|-- memory.min
|-- memory.numa_stat
|-- memory.oom.group
|-- memory.pressure
|-- memory.stat
|-- memory.swap.current
|-- memory.swap.events
|-- memory.swap.high
|-- memory.swap.max
|-- misc.current
|-- misc.events
|-- misc.max
|-- pids.current
|-- pids.events
|-- pids.max
|-- rdma.current
`-- rdma.max

1 directory, 62 files

This is from a docker exec into the nestybox/gitlab-runner-docker container.

ctalledo commented 2 years ago

Thanks; so there are 2 versions of cgroups, v1 and the newer v2, and your host has v2. FYI, the v1 hierarchy looks like this:

$ tree -L 1 /sys/fs/cgroup
/sys/fs/cgroup
├── blkio
├── cpu -> cpu,cpuacct
├── cpuacct -> cpu,cpuacct
├── cpu,cpuacct
├── cpuset
├── devices
├── freezer
├── hugetlb
├── memory
├── net_cls -> net_cls,net_prio
├── net_cls,net_prio
├── net_prio -> net_cls,net_prio
├── perf_event
├── pids
├── rdma
├── systemd
└── unified

It's fine to have cgroup v2 on the host, but I suspect the version of Docker running inside the Sysbox container is a bit old and does not support it, so it's complaining with cgroup mountpoint does not exist: unknown.

What's the version of Docker inside the Sysbox container? (docker version)
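
As a rough sketch of how the two layouts can be told apart without reading the tree by eye: the filesystem type mounted at /sys/fs/cgroup is commonly cgroup2fs for the unified v2 hierarchy and tmpfs for v1 (where each controller is mounted in its own subdirectory). The function names below are made up for illustration, and the stat options assume GNU coreutils.

```shell
#!/bin/sh
# Map the filesystem type found at /sys/fs/cgroup to a cgroup version.
# "cgroup2fs" indicates the unified v2 hierarchy; "tmpfs" indicates v1.
cgroup_version_from_fstype() {
    case "$1" in
        cgroup2fs) echo "v2" ;;
        tmpfs)     echo "v1" ;;
        *)         echo "unknown" ;;
    esac
}

# Report the cgroup version for a mount point (default: /sys/fs/cgroup).
# Uses GNU stat: -f queries the filesystem, -c %T prints its type name.
detect_cgroup_version() {
    fstype="$(stat -fc %T "${1:-/sys/fs/cgroup}" 2>/dev/null)"
    cgroup_version_from_fstype "$fstype"
}
```

Running detect_cgroup_version both on the host and inside the Sysbox container shows which hierarchy each one sees.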

himekifee commented 2 years ago

root@bc4e3e71a570:/# docker version
Client: Docker Engine - Community
 Version:           19.03.13
 API version:       1.40
 Go version:        go1.13.15
 Git commit:        4484c46d9d
 Built:             Wed Sep 16 17:02:52 2020
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.13
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       4484c46d9d
  Built:            Wed Sep 16 17:01:20 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.3.7
  GitCommit:        8fba4e9a7d01810a393d5d25a3621dc101981175
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683
root@bc4e3e71a570:/# docekr info
bash: docekr: command not found
root@bc4e3e71a570:/# docker info
Client:
 Debug Mode: false

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 19.03.13
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 8fba4e9a7d01810a393d5d25a3621dc101981175
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.16.11-arch1-1
 Operating System: Ubuntu 20.04.1 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 7.763GiB
 Name: bc4e3e71a570
 ID: 5LEB:NP5C:2XKJ:P4GF:OLVS:SIUE:6QDG:6TVR:RFX3:TUX6:N254:XCMG
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support
WARNING: No kernel memory limit support
WARNING: No kernel memory TCP limit support
WARNING: No oom kill disable support
WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support
WARNING: No cpu shares support

ctalledo commented 2 years ago

Thanks; I see, it's Docker 19.03.13 which does not support cgroups v2 (we need Docker 20.10).

What container image are you using? The reference nestybox/gitlab-runner-docker image or some other one?
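
A minimal sketch of that version check, assuming GNU sort (for its -V version ordering); the helper names are hypothetical and the 20.10 threshold is simply the one mentioned above.

```shell
#!/bin/sh
# Succeed (exit status 0) if version $1 is greater than or equal to $2.
# Relies on "sort -V" (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Succeed if the given Docker Engine version is at least 20.10,
# the release line referred to above as needed for cgroups v2.
docker_supports_cgroup_v2() {
    version_ge "$1" "20.10"
}
```

It could be fed the engine version of the inner Docker, for example docker_supports_cgroup_v2 "$(docker version --format '{{.Server.Version}}')".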

himekifee commented 2 years ago

Yeah, I somehow got a really old image (nestybox/gitlab-runner-docker, tag latest, ID acf0d0cbf731, created 16 months ago, 1.03GB), even though I only set this up recently. I forget where I got it.

himekifee commented 2 years ago

Oh, that is the latest image on docker hub I think?

ctalledo commented 2 years ago

Yes, looks like it needs an update. Try this new one I just uploaded please:

nestybox/gitlab-runner-docker:21.10

If that works, I'll tag it as the latest.

By the way, the Dockerfile is here in case you want to update it or customize it in the future.

himekifee commented 2 years ago

Starting service docker:20.10.12-dind ...
Pulling docker image docker:20.10.12-dind ...
Using docker image sha256:1a42336ff683d7dadd320ea6fe9d93a5b101474346302d23f96c9b4546cb414d for docker:20.10.12-dind with digest docker@sha256:6f2ae4a5fd85ccf85cdd829057a34ace894d25d544e5e4d9f2e7109297fedf8d ...
ERROR: Preparation failed: Error response from daemon: Unknown runtime specified sysbox-runc (docker.go:392:0s)

I think this is inside the system container? Earlier I was running one layer out, with the runner on the host, and that worked fine.

ctalledo commented 2 years ago

Something is wrong, because there is no sysbox-runc inside the system container itself (there is just the GitLab runner and Docker).

Did you follow the steps in section GitLab Runner & Docker in a System Container of the blog?

himekifee commented 2 years ago

I think I did. sysbox-runc is on the list

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc., v0.7.1-docker)

Server:
 Containers: 1
  Running: 1
  Paused: 0
  Stopped: 0
 Images: 8
 Server Version: 20.10.12
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: false
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: sysbox-runc io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 39259a8f35919a0d02c9ecc2871ddd6ccf6a7c6e.m
 runc version: v1.1.0-0-g067aaf85
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.16.11-arch1-1
 Operating System: Arch Linux
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 7.763GiB
 Name: gitlab-runner-general
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

This is on the host. Also if you like, I can provide a tmate shell.

ctalledo commented 2 years ago

Is that config from the inner Docker? (i.e., the Docker inside the system container)

I meant that sysbox-runc should NOT be in the config of the inner Docker (i.e., because sysbox-runc lives at host level, there is no sysbox-runc inside the system container).
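
To see which runtimes a particular Docker daemon actually knows about, something like the following can be run both on the host and inside the system container. It is only a sketch: the function name daemon_has_runtime is hypothetical, and it relies on docker info exposing its registered runtimes as a map keyed by name in the format template.

```shell
#!/bin/sh
# Succeed if the Docker daemon that the local "docker" CLI talks to
# has a runtime with the given name registered.
daemon_has_runtime() {
    wanted="$1"
    # .Runtimes is a map keyed by runtime name; print one name per line,
    # then look for an exact, literal, whole-line match.
    docker info --format '{{range $name, $rt := .Runtimes}}{{println $name}}{{end}}' |
        grep -qxF -- "$wanted"
}
```

On the host, daemon_has_runtime sysbox-runc should succeed once Sysbox is installed; inside the system container it is expected to fail, which is why the inner runner config must not reference that runtime.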

himekifee commented 2 years ago

Not really, the one above is on the host, with sysbox-runc.

root@a3464053f254:/# docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.7.1-docker)

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 2
 Server Version: 20.10.12
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: false
  userxattr: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7b11cfaabd73bb80907dd23182b9347b4245eb5d
 runc version: v1.0.2-0-g52b36a2
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.16.11-arch1-1
 Operating System: Ubuntu 20.04.3 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 7.763GiB
 Name: a3464053f254
 ID: NFAT:FJD6:5Y6B:6RTS:JWQB:KEV2:JCXN:EWDX:OKXR:LAB2:FYCJ:V5CG
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

The one from the runner (inside the Docker container) does not have sysbox-runc.
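
The comparison between the two `docker info` outputs can be scripted. The following is a minimal sketch, not something from the article: `has_runtime` is a hypothetical helper that reads `docker info` text on stdin and checks the `Runtimes:` line for an exact name. The sample line is copied from the inner Docker output above.

```shell
#!/bin/sh
# has_runtime is a hypothetical helper (not part of Docker or Sysbox).
# It reads `docker info` output on stdin and succeeds only if the given
# runtime name appears on the "Runtimes:" line.
has_runtime() {
    wanted=$1
    # Strip the "Runtimes:" label, put one runtime name per line,
    # then look for an exact, literal match.
    sed -n 's/^ *Runtimes: *//p' | tr ' ' '\n' | grep -qxF -- "$wanted"
}

# Sample line copied from the inner `docker info` output in this thread.
inner_info=' Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc'

if printf '%s\n' "$inner_info" | has_runtime sysbox-runc; then
    echo "sysbox-runc is registered with this Docker daemon"
else
    echo "sysbox-runc is NOT registered with this Docker daemon"
fi
```

Against a live daemon the same check would be `docker info | has_runtime sysbox-runc`. On the host it is expected to succeed; inside the system container it is expected to fail, which matches the outputs shown above.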

ctalledo commented 2 years ago

So in the GitLab runner config you showed initially, you have:

concurrent = 1
check_interval = 0

[session_server]
  session_timeout = 1800

[[runners]]
  name = "my-docker-runner"
  url = "https://git.me.asdf/"
  token = "TOKEN"
  executor = "docker"
  [runners.custom_build_dir]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.docker]
    tls_verify = false
    image = "docker:20.10.12"
    privileged = false
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/cache", "/var/lib/docker"]
    shm_size = 0
    runtime = "sysbox-runc"

If the GitLab runner is running inside the Sysbox container, then there is no sysbox-runc in its context, so you can't configure it with sysbox-runc as above.

You need something like:

[[runners]]
    name = "syscont-runner-docker"
    url = "https://gitlab.com/"
    token = "REGISTRATION_TOKEN"
    executor = "docker"
    [runners.docker]
        tls_verify = false
        image = "docker:19.03.12"
        privileged = false
        disable_cache = false
        volumes = ["/var/run/docker.sock:/var/run/docker.sock", "/cache"]
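
To spot a leftover `runtime` key before registering a runner inside a system container, one option is to list every `runtime = "..."` value in the config. This is a rough sketch under assumptions: `configured_runtimes` is a hypothetical helper, and the sample config is a shortened copy of the one quoted above.

```shell
#!/bin/sh
# configured_runtimes is a hypothetical helper: it prints the value of every
# `runtime = "..."` key found in a GitLab runner config.toml read from stdin.
# It is a line-based match, not a real TOML parser.
configured_runtimes() {
    sed -n 's/^[[:space:]]*runtime[[:space:]]*=[[:space:]]*"\([^"]*\)".*$/\1/p'
}

# Shortened copy of the runner config quoted earlier in this thread.
sample_config='[[runners]]
  executor = "docker"
  [runners.docker]
    image = "docker:20.10.12"
    privileged = false
    runtime = "sysbox-runc"'

printf '%s\n' "$sample_config" | configured_runtimes
```

For a real runner this would typically be run as `configured_runtimes < /etc/gitlab-runner/config.toml` (the path is an assumption; it depends on how the runner was installed). Any name it prints has to be a runtime known to the Docker daemon that the runner talks to.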

himekifee commented 2 years ago

Oh, I forgot about that. Many thanks for your help.

ctalledo commented 2 years ago

Cool @himekifee, glad we got to the bottom of it. I've retagged the nestybox/gitlab-runner-docker:21.10 -> nestybox/gitlab-runner-docker:latest.

Thanks again for trying Sysbox, and let us know of any more questions.

himekifee commented 2 years ago

One more thing. The dind service run by the GitLab CI job seems to have an issue with kernel modules:

Waiting for services to be up and running...
*** WARNING: Service runner-y5eqdgvc-project-2-concurrent-0-f2ee7e19224d5c0b-docker-0 probably didn't start properly.
Health check error:
service "runner-y5eqdgvc-project-2-concurrent-0-f2ee7e19224d5c0b-docker-0-wait-for-service" timeout
Health check container logs:
Service container logs:
2022-03-01T21:58:09.167032100Z Generating RSA private key, 4096 bit long modulus (2 primes)
2022-03-01T21:58:10.724745979Z ...........................................................................................................................................................++++
2022-03-01T21:58:10.812820938Z ..............++++
2022-03-01T21:58:10.812924421Z e is 65537 (0x010001)
2022-03-01T21:58:10.834844304Z Generating RSA private key, 4096 bit long modulus (2 primes)
2022-03-01T21:58:10.859226065Z ..++++
2022-03-01T21:58:10.911865438Z .......++++
2022-03-01T21:58:10.912484161Z e is 65537 (0x010001)
2022-03-01T21:58:10.977818617Z Signature ok
2022-03-01T21:58:10.977843115Z subject=CN = docker:dind server
2022-03-01T21:58:10.978101831Z Getting CA Private Key
2022-03-01T21:58:11.001839418Z /certs/server/cert.pem: OK
2022-03-01T21:58:11.011923428Z Generating RSA private key, 4096 bit long modulus (2 primes)
2022-03-01T21:58:11.102356086Z ....++++
2022-03-01T21:58:11.875491836Z .................................................................++++
2022-03-01T21:58:11.875900116Z e is 65537 (0x010001)
2022-03-01T21:58:11.968095578Z Signature ok
2022-03-01T21:58:11.968154274Z subject=CN = docker:dind client
2022-03-01T21:58:11.969095869Z Getting CA Private Key
2022-03-01T21:58:12.006561012Z /certs/client/cert.pem: OK
2022-03-01T21:58:12.119617828Z ip: can't find device 'ip_tables'
2022-03-01T21:58:12.190466715Z ip_tables              36864  2 iptable_filter,iptable_nat
2022-03-01T21:58:12.190533149Z x_tables               57344  6 xt_conntrack,xt_MASQUERADE,xt_addrtype,iptable_filter,iptable_nat,ip_tables
2022-03-01T21:58:12.192446953Z modprobe: can't change directory to '/lib/modules': No such file or directory
2022-03-01T21:58:12.197917645Z mount: permission denied (are you root?)
2022-03-01T21:58:12.198278392Z Could not mount /sys/kernel/security.
2022-03-01T21:58:12.198312013Z AppArmor detection and --privileged mode might break.
2022-03-01T21:58:12.201996295Z mount: permission denied (are you root?)

But the docker build and push job actually worked. Does it matter?

rodnymolina commented 2 years ago

This is probably a side effect of the host system using nftables whereas the inner Docker is expecting iptables-based forwarding. My understanding is that Docker is working on full nftables compatibility, but AFAIK they are not quite there yet. You may want to switch your host to regular (legacy) iptables and then try again:

sudo update-alternatives --set iptables /usr/sbin/iptables-legacy

https://unix.stackexchange.com/questions/657545/nftables-whitelisting-docker
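
Before switching, it may help to confirm which backend the host's `iptables` binary actually uses. A minimal sketch, assuming iptables 1.8 or later, where `iptables --version` appends `(nf_tables)` or `(legacy)` to the version string; `iptables_backend` is a hypothetical helper name.

```shell
#!/bin/sh
# iptables_backend is a hypothetical helper: it classifies the string printed
# by `iptables --version`. Older iptables releases print no backend suffix,
# in which case the result is "unknown".
iptables_backend() {
    case "$1" in
        *"(nf_tables)"*) echo nf_tables ;;
        *"(legacy)"*)    echo legacy ;;
        *)               echo unknown ;;
    esac
}

# Example version strings; the version numbers are illustrative only.
iptables_backend 'iptables v1.8.7 (nf_tables)'
iptables_backend 'iptables v1.8.7 (legacy)'
```

On a real host the call would be `iptables_backend "$(iptables --version)"`. Note that `update-alternatives` is a Debian/Ubuntu mechanism; the host in this thread reports Arch Linux, so the equivalent step there may differ.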

iamkhalidbashir commented 1 year ago

Getting error

Certificate request self-signature ok
subject=CN = docker:dind server
/certs/server/cert.pem: OK
Certificate request self-signature ok
subject=CN = docker:dind client
/certs/client/cert.pem: OK
mount: permission denied (are you root?)
Could not mount /sys/kernel/security.
AppArmor detection and --privileged mode might break.
time="2023-07-22T09:16:10.894982397Z" level=info msg="Starting up"
time="2023-07-22T09:16:10.912057708Z" level=warning msg="could not change group /var/run/docker.sock to docker: group docker not found"
failed to load listeners: can't create unix socket /var/run/docker.sock: device or resource busy

Confirmed that the container has

"Runtime": "sysbox-runc",

via inspect
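
For reference, a small sketch of how that check can be done without reading the whole JSON by eye. `runtime_from_inspect` is a hypothetical helper that pulls the first `"Runtime"` field out of `docker inspect` output on stdin; the sample line is the one quoted above.

```shell
#!/bin/sh
# runtime_from_inspect is a hypothetical helper: it prints the value of the
# first "Runtime" field found in `docker inspect` JSON read from stdin.
# It is a line-based match, not a JSON parser.
runtime_from_inspect() {
    sed -n 's/^[[:space:]]*"Runtime":[[:space:]]*"\([^"]*\)".*$/\1/p' | head -n 1
}

# Sample line as quoted in this thread.
printf '%s\n' '            "Runtime": "sysbox-runc",' | runtime_from_inspect
```

With Docker available, `docker inspect --format '{{.HostConfig.Runtime}}' <container>` gives the same answer directly; `<container>` is a placeholder for the service container's name or ID.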

smarsching commented 1 year ago

@iamkhalidbashir You might want to check that you are not using version 0.6.1 of Sysbox CE. We had some trouble with that version and GitLab CI, which did not occur when using version 0.5.2 or version 0.6.2.

iamkhalidbashir commented 1 year ago

Thank you for the reply. I am using 0.6.2.



ctalledo commented 1 year ago

Hi @iamkhalidbashir, can you provide a bit more detail? Which of the setups described in the article are you using, and at what step does the failure occur?

Thanks.

hanserasmus commented 10 months ago

Hi all. First off, thank you for this great software. It saved my hind-end. I do however have an issue I am struggling to wrap my head around. I have a GitLab pipeline in which I have a docker image with a dind service container attached. Before using sysbox, when I executed the pipeline, I could do a docker ps to get the ID of the currently running container (which I assumed was the dind container), then use that container ID to attach it to a network I created myself. This network is then used in a docker-compose file inside the docker container, so that all the nodes, and the container itself, are able to communicate with each other.

So the steps were: set up the image and the dind container:

....
test-topology:
  image: $CI_REGISTRY/systems/base-containers/docker:latest
  stage: test-topology
  rules:
    - if: $CI_COMMIT_BRANCH == "develop"
  tags:
    - docker
  services:
    - name: $CI_REGISTRY/systems/base-containers/main-dind:latest
      alias: docker
...

Then I would get the running container's ID from within the pipeline. Once I have the ID stored in a variable, I would create the new network and attach the running container to it.

....
    - container_id=$(docker ps | tail -n1 | awk -F' ' '{ print $1 }')
    - docker network create kafka
    - docker network connect kafka $container_id
    - docker compose up -d
 ...

From here I could reach all the nodes by their service names, which is important because the Java clients need to get the bootstrap servers etc. from the cluster, so using 'docker' or 'localhost' everywhere is not going to work.

After switching to sysbox, if I run that docker ps command, the output is empty. Which is sort of understandable, as everything is now more isolated. But I am left with the issue of trying to attach my running container to the same network the stack is deployed on. I will take any help I can at this point. This is now day 4 of this problem... TIA

EDIT: I should add I installed sysbox on Ubuntu 22.04, and the gitlab-runner is also a debian package installation. In the config.toml of the runner I just changed the runtime setting. So I am not using the GitLab docker runner.
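
One side effect worth noting: when no container is listed, `docker ps` still prints its header row, so the `tail -n1 | awk` pipeline above yields the word `CONTAINER` instead of an ID. The sketch below does not explain or fix the isolation change itself; it only makes the step fail clearly. `last_container_id` is a hypothetical helper.

```shell
#!/bin/sh
# last_container_id is a hypothetical helper: it reads `docker ps` table
# output on stdin and prints the ID from the last data row, or nothing when
# only the header row is present.
last_container_id() {
    awk 'NR > 1 { id = $1 } END { if (id != "") print id }'
}

# What `docker ps` prints when no containers are running (header only).
empty_ps='CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES'

# The original pipeline picks up the first word of the header row.
printf '%s\n' "$empty_ps" | tail -n1 | awk '{ print $1 }'   # prints "CONTAINER"

# The helper returns an empty string instead, which can be tested for.
container_id=$(printf '%s\n' "$empty_ps" | last_container_id)
if [ -z "$container_id" ]; then
    echo "no running container visible to this Docker daemon"
fi
```

In the pipeline this would read `container_id=$(docker ps | last_container_id)`; `docker ps -q | tail -n1` is a shorter alternative, since `-q` prints IDs only and no header.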