Open fhaefemeier opened 2 years ago
Hi @fhaefemeier, thanks for finding Sysbox and giving it a shot.
Your timing is really good, we are just a few days away from the Sysbox v0.5.0 release which includes ID-mapped mounts support. However, the release will only have the .deb packages (for Ubuntu / Debian) and we won't have an RPM package for it for a few more weeks (we are actively working on this too).
The Sysbox upstream code already includes ID-mapped mounts support, so while we work on the RPM package, you can try Sysbox on Fedora (with kernel >= 5.12) by building it from source (it's pretty easy), but let us know if you need assistance.
Hi @ctalledo, that is good news.
The Sysbox upstream code already includes ID-mapped mounts support, so while we work on the RPM package, you can try Sysbox on Fedora (with kernel >= 5.12) by building it from source (it's pretty easy), but let us know if you need assistance.
You mean, it is already functional in the latest code available? I built Sysbox on Fedora yesterday following your guide, so I now have a version that provides the new feature. Do I have to do something special or will it be enough to enable sysbox-runc without 'userns-remap' in docker daemon?
Hi @fhaefemeier,
You mean, it is already functional in the latest code available?
Yes that's right.
Do I have to do something special or will it be enough to enable sysbox-runc without 'userns-remap' in docker daemon?
Simply enable sysbox-runc without userns-remap.
Assuming you have a kernel >= 5.12, when you launch a container with Docker + Sysbox, it should work and you will see some ID-mapped mounts. E.g.:
$ docker run --runtime=sysbox-runc -it --rm ubuntu
root@62adea865594:/# findmnt | grep idmap
|-/etc/resolv.conf /dev/nvme1n1p1[/tmp/sysbox-test-var-lib/docker/containers/62adea8655942f1f382f441ba38c07f2d67ae29187fca9a3e99e53b1a3e6fb75/resolv.conf] ext4 rw,relatime,idmapped,errors=remount-ro
|-/etc/hostname /dev/nvme1n1p1[/tmp/sysbox-test-var-lib/docker/containers/62adea8655942f1f382f441ba38c07f2d67ae29187fca9a3e99e53b1a3e6fb75/hostname] ext4 rw,relatime,idmapped,errors=remount-ro
|-/etc/hosts /dev/nvme1n1p1[/tmp/sysbox-test-var-lib/docker/containers/62adea8655942f1f382f441ba38c07f2d67ae29187fca9a3e99e53b1a3e6fb75/hosts] ext4 rw,relatime,idmapped,errors=remount-ro
|-/usr/src/linux-headers-5.13.0-1017-aws /dev/root[/usr/src/linux-headers-5.13.0-1017-aws] ext4 ro,relatime,idmapped,discard,errors=remount-ro
|-/usr/src/linux-aws-headers-5.13.0-1017 /dev/root[/usr/src/linux-aws-headers-5.13.0-1017] ext4 ro,relatime,idmapped,discard,errors=remount-ro
`-/usr/lib/modules/5.13.0-1017-aws /dev/root[/usr/lib/modules/5.13.0-1017-aws] ext4 ro,relatime,idmapped,discard,errors=remount-ro
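Before looking for ID-mapped mounts, it can help to confirm that the kernel meets the 5.12 minimum mentioned above. The following is only a minimal sketch and not part of Sysbox; the function name kernel_supports_idmap is hypothetical and the parsing assumes a release string of the form MAJOR.MINOR...:

```shell
#!/bin/sh
# Hypothetical helper (not part of Sysbox): succeeds when the given kernel
# release string is 5.12 or newer, the minimum mentioned in this thread.
kernel_supports_idmap() {
    release=$1
    major=${release%%.*}        # text before the first dot, e.g. "5"
    rest=${release#*.}          # text after the first dot, e.g. "14.10-300.fc35.x86_64"
    minor=${rest%%[!0-9]*}      # leading digits of the remainder, e.g. "14"
    [ "$major" -gt 5 ] 2>/dev/null ||
        { [ "$major" -eq 5 ] && [ "$minor" -ge 12 ]; } 2>/dev/null
}

# Check the running kernel.
if kernel_supports_idmap "$(uname -r)"; then
    echo "kernel is >= 5.12"
else
    echo "kernel is older than 5.12 (or its version could not be parsed)"
fi
```

If the check passes, the findmnt | grep idmap command shown above is the way to confirm that the mounts are actually ID-mapped.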
Let me know if you hit any issues please.
Let me know if you hit any issues please.
I installed Sysbox on Fedora 35 with ID-mapped mounts (see also #513) and started a system container with docker run --runtime=sysbox-runc --rm -it --hostname my_cont debian:latest. It started successfully, but I see some strange behaviour and am not sure whether it is related to ID-mapped mounts or to something else (I can open another ticket if you want).
Starting a system container only succeeds after several failed attempts. Each failure ends with the message:
docker: Error response from daemon: OCI runtime create failed: error in the container spec: failed to request rootfs cloning from sysbox-mgr: failed to invoke ReqCloneRootfs via grpc: rpc error: code = Unknown desc = failed to mount clone for container 489a56472eb8: failed to set up bottom ovfs mount: failed to mount overlayfs on /srv/container/sysbox/rootfs/489a56472eb8f43a157939a69443ee115f87be5d8324fc07358142dd4afbaa7f/bottom/merged: invalid argument: unknown.
I installed sysbox-mgr and sysbox-fs with default parameters. Only data-root is changed to /srv/container/sysbox instead of /var/lib/sysbox. After starting both daemons, the log shows:
Mär 20 20:01:05 homecloud systemd[1]: Starting sysbox-fs (part of the Sysbox container runtime)...
Mär 20 20:01:05 homecloud sysbox-fs[1391169]: {"level":"info","msg":"Initiating sysbox-fs ...","time":"2022-03-20 20:01:05"}
Mär 20 20:01:05 homecloud sysbox-fs[1391169]: {"level":"info","msg":"Initializing with 'allow-immutable-remounts' knob disabled (default)","time":"2022-03-20 20:01:05"}
Mär 20 20:01:05 homecloud sysbox-fs[1391169]: {"level":"info","msg":"Initializing with 'allow-immutable-unmounts' knob enabled (default)","time":"2022-03-20 20:01:05"}
Mär 20 20:01:05 homecloud sysbox-fs[1391169]: {"level":"info","msg":"FUSE dir = /srv/container/sysboxfs","time":"2022-03-20 20:01:05"}
Mär 20 20:01:05 homecloud sysbox-fs[1391169]: {"level":"info","msg":"IOvec memParser elected","time":"2022-03-20 20:01:05"}
Mär 20 20:01:05 homecloud sysbox-fs[1391169]: {"level":"info","msg":"Listening on /run/sysbox/sysfs.sock","time":"2022-03-20 20:01:05"}
Mär 20 20:01:05 homecloud sysbox-fs[1391169]: {"level":"info","msg":"Ready ...","time":"2022-03-20 20:01:05"}
Mär 20 20:01:05 homecloud systemd[1]: Started sysbox-fs (part of the Sysbox container runtime).
Mär 20 20:01:05 homecloud systemd[1]: Starting sysbox-mgr (part of the Sysbox container runtime)...
Mär 20 20:01:05 homecloud sysbox-mgr[1391149]: {"level":"info","msg":"Starting ...","time":"2022-03-20 20:01:05"}
Mär 20 20:01:05 homecloud sysbox-mgr[1391149]: {"level":"info","msg":"Sysbox data root: /srv/container/sysbox","time":"2022-03-20 20:01:05"}
Mär 20 20:01:05 homecloud sysbox-mgr[1391149]: {"level":"warning","msg":"failed to cleanup /srv/container/sysbox: unlinkat /srv/container/sysbox: device or resource busy","time":"2022-03>
Mär 20 20:01:05 homecloud sysbox-mgr[1391149]: {"level":"info","msg":"Listening on /run/sysbox/sysmgr.sock","time":"2022-03-20 20:01:05"}
Mär 20 20:01:05 homecloud sysbox-mgr[1391149]: {"level":"info","msg":"Ready ...","time":"2022-03-20 20:01:05"}
Mär 20 20:01:05 homecloud systemd[1]: Started sysbox-mgr (part of the Sysbox container runtime).
Let me know how I can help.
Maybe related. Found in kernel log
Mär 20 21:18:53 homecloud kernel: overlayfs: unrecognized mount option "c693"" or missing value
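To see exactly which option value overlayfs rejected, the message can be pulled out of the kernel log. This is a minimal sketch; the function name overlayfs_bad_options is hypothetical, and it assumes the message wording is the same as in the line above:

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log lines on stdin and prints the value
# overlayfs complained about in each "unrecognized mount option" message.
overlayfs_bad_options() {
    sed -n 's/.*overlayfs: unrecognized mount option "\(.*\)" or missing value.*/\1/p'
}

# Example using the message part of the log line quoted above.
printf '%s\n' 'kernel: overlayfs: unrecognized mount option "c693"" or missing value' |
    overlayfs_bad_options
```

On a systemd host, journalctl -k | overlayfs_bad_options would feed it the kernel log. Comparing the rejected values across several failed starts may show whether the same kind of option is involved each time.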
Thanks @fhaefemeier.
Only data-root is changed to /srv/container/sysbox instead of /var/lib/sysbox
Could you please try with the default data-root (i.e., /var/lib/sysbox)? It should work either way, but I wonder if we have a bug that we've not caught.
Modified the title to make it a bit more specific to this issue.
I changed the config to use the default data-root (sysbox-mgr) and mountpoint (sysbox-fs) and see the same behaviour. Before retesting I updated to the latest git version (master branch). I forgot to mention in my original comment that I use systemd unit files to start sysbox-mgr and sysbox-fs. I took the unit files from the Debian package as an example.
Log entries from sysbox-mgr:
Mär 25 23:58:51 homecloud sysbox-mgr[676730]: {"level":"info","msg":"registered new container 9ada1c4eff78","time":"2022-03-25 23:58:51"}
Mär 25 23:58:51 homecloud sysbox-mgr[676730]: {"level":"info","msg":"unregistered container 9ada1c4eff78","time":"2022-03-25 23:58:51"}
Mär 25 23:58:51 homecloud sysbox-mgr[676730]: {"level":"warning","msg":"failed to unbind cloned rootfs for container 9ada1c4eff78: failed to unmount clone for container 9ada1c4eff78: failed to remove top mount: invalid argument","time":"2022-03-25 23:58:51"}
Mär 25 23:58:51 homecloud sysbox-mgr[676730]: {"level":"info","msg":"released resources for container 9ada1c4eff78","time":"2022-03-25 23:58:51"}
Log entries from sysbox-fs:
Mär 25 23:58:51 homecloud sysbox-fs[676750]: {"level":"info","msg":"Container pre-registration completed: id = 9ada1c4eff78","time":"2022-03-25 23:58:51"}
Mär 25 23:58:51 homecloud sysbox-fs[676750]: {"level":"info","msg":"Container unregistration completed: id = 9ada1c4eff78","time":"2022-03-25 23:58:51"}
I will (for different reasons) reboot my server tomorrow and will have a clean setup. I will report my experience.
Nothing new after the server reboot; still the same behaviour. How can I help you investigate?
Hi @fhaefemeier, let me take a look in the next couple of days so we can get to the bottom of this.
Hi @fhaefemeier, finally got a chance to take a closer look.
I created a Fedora 35 VM on Google Compute Engine (GCE), then cloned the Sysbox GitHub repo and did a make test-shell to get a shell inside the Sysbox test container. After this I was able to create containers without problem.
I did hit a few setup issues (which you probably ran into as well):
1) I had to create a Sysbox test container Dockerfile for Fedora 35 (see this commit).
2) The Sysbox Makefile had a dependency on lsb_release, which required me to install the redhat-lsb-core package on the host machine. I committed a change to remove this requirement going forward.
3) The Fedora-35 VM instance that I used as my host came with / mounted on a disk formatted with btrfs (rather than ext4). This is an issue because we don't officially support btrfs yet (we've not done much testing on it), and ID-mapped mounts don't yet work on btrfs (a Linux kernel limitation). To overcome this, I added an ext4 disk to the VM and pointed both the Docker data-root and the Sysbox data-root to that disk.
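Since the filesystem under the Docker and Sysbox data-roots matters here, a quick lookup can save some guessing. This is a minimal sketch; the function name idmap_fs_status is hypothetical, it encodes only what has been said in this thread so far, and /var/lib/docker is assumed to be the Docker data-root:

```shell
#!/bin/sh
# Hypothetical helper: maps a filesystem type to what this thread reports about
# ID-mapped mounts on it. It encodes only the observations made here.
idmap_fs_status() {
    case $1 in
        ext4)  echo "worked" ;;        # ext4 disk used for the Docker and Sysbox data-roots
        btrfs) echo "unsupported" ;;   # kernel limitation at the time of writing
        *)     echo "not-covered" ;;   # not discussed up to this point
    esac
}

# Look up the filesystem backing a directory (findmnt is part of util-linux).
dir=/var/lib/docker   # assumption: default Docker data-root
fstype=$(findmnt -n -o FSTYPE -T "$dir" 2>/dev/null)
echo "$dir: ${fstype:-unknown} -> $(idmap_fs_status "$fstype")"
```

The same lookup can be repeated for the Sysbox data-root (/var/lib/sysbox by default).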
After this, I was able to do a make test-shell, which gets me a shell inside the test container, and from there I was able to deploy containers with Docker + Sysbox without problem. This gives me confidence Sysbox works well on Fedora 35.
Then I took the next step and installed Sysbox on the Fedora 35 VM directly, as follows:
1) Install the fuse package on the host (Sysbox requires it).
$ dnf install -y fuse
2) Build the Sysbox binaries and install them:
$ make sysbox && make install
3) Start Sysbox:
$ ./scr/sysbox
4) Configure Docker with Sysbox (for this I used the convenience script in the Sysbox repo which does the config in /etc/docker/daemon.json and restarts Docker):
$ ./scr/docker-cfg --sysbox-runtime=enable
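After step 4, one way to confirm that Docker picked up the new runtime is to look at the "Runtimes:" line of docker info. This is a minimal sketch; the function name has_sysbox_runtime is hypothetical, and the sample input line is only an illustration of the shape of that output, not captured from a real host:

```shell
#!/bin/sh
# Hypothetical helper: reads `docker info` output on stdin and succeeds if the
# "Runtimes:" line lists sysbox-runc.
has_sysbox_runtime() {
    grep -Eq '^[[:space:]]*Runtimes:.*sysbox-runc'
}

# Example with an illustrative sample line (an assumption, not real output).
if printf '%s\n' ' Runtimes: runc sysbox-runc' | has_sysbox_runtime; then
    echo "sysbox-runc is registered"
fi
```

On a real host this would be used as docker info | has_sysbox_runtime.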
After this, I was able to run Sysbox containers directly on the Fedora 35 host as follows:
[vagrant@instance-1 sysbox]$ cat /etc/os-release | egrep "^NAME|^VERSION"
NAME="Fedora Linux"
VERSION="35 (Cloud Edition)"
VERSION_ID=35
[vagrant@instance-1 sysbox]$ docker run --runtime=sysbox-runc -it --rm nestybox/ubuntu-focal-systemd-docker
Welcome to Ubuntu 20.04.2 LTS!
[ OK ] Created slice system-getty.slice.
[ OK ] Created slice system-modprobe.slice.
[ OK ] Created slice User and Session Slice.
...
[ OK ] Reached target Graphical Interface.
Starting Update UTMP about System Runlevel Changes...
[ OK ] Finished Update UTMP about System Runlevel Changes.
Ubuntu 20.04.2 LTS 030f7f2a79bf console
030f7f2a79bf login: admin
Password:
Welcome to Ubuntu 20.04.2 LTS (GNU/Linux 5.14.10-300.fc35.x86_64 x86_64)
...
admin@030f7f2a79bf:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
admin@030f7f2a79bf:~$ docker run -it alpine
Unable to find image 'alpine:latest' locally
latest: Pulling from library/alpine
40e059520d19: Pull complete
Digest: sha256:f22945d45ee2eb4dd463ed5a431d9f04fcd80ca768bb1acf898d91ce51f7bf04
Status: Downloaded newer image for alpine:latest
/ #
As you can see, I am not hitting the same error you got (not sure why). Please try to follow the steps above and let me know if this fixes it or not. Also, feel free to join the Sysbox Slack channel, as that may be a better forum to get to the bottom of the issue you are facing.
@ctalledo Thanks for providing your test scenario. I will check it, but it will take a little time. I will keep you informed. I use XFS as the filesystem (md raid/LVM), in case that matters...
One question: did you create Sysbox containers repeatedly (in quick succession)? In my case, as a test, I called docker run several times one after another, and approximately three out of ten attempts work...
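To put a number on an intermittent failure like this, the launch can be repeated in a loop and the successes counted. This is a minimal sketch; the function name count_successes is hypothetical, and the docker invocation in the comment is just one possible command to pass in:

```shell
#!/bin/sh
# Hypothetical helper: runs the given command N times and prints
# "<successes>/<attempts>". Command output is discarded.
count_successes() {
    attempts=$1
    shift
    ok=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            ok=$((ok + 1))
        fi
        i=$((i + 1))
    done
    echo "$ok/$attempts"
}

# Self-check with a command that always succeeds; prints "3/3".
count_successes 3 true

# Possible use for this issue (not run here):
#   count_successes 10 docker run --runtime=sysbox-runc --rm debian:latest true
```

A ratio such as the roughly three in ten described above makes it easier to compare setups, for example before and after changing a single setting.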
Hi @fhaefemeier,
I use XFS as filesystem (md raid/LVM), if it is important...
That could matter; in my VM I have btrfs or ext4.
did you repeat creating sysbox containers (in short frequence)? In my case, for a test, I called a docker run several times one after another and aprox. three of ten are working...
Yes, I see no problem:
[vagrant@instance-1 sysbox]$ for i in $(seq 1 10); do docker run --runtime=sysbox-runc -d --rm ghcr.io/nestybox/ubuntu-focal-systemd-docker; done
50eef94fac681c353d60dc3903ef449cbc165827d8faad32ed34d42e2b2df3bc
50a8b01326c0d0adc29bccaab1605c02980b9aaf1a108ef90153b20162d4228b
a53e04092089e342c9edd5c3e37580990295e4157418bca1649032d9f0567d55
59f99b09f5170b10a986eeadfe8dc1bab8d1e0889a6f882ebe0c3b1c24978eb0
244cb65d8bc3932628b4b5a820a2a5655a6362ca55a3a486779cf543402822da
b874a9870c16bf9b6db3533a7036befb69ce55df085b54dc442bbcf4982091b9
bfbaefaca8c6c0489b73e656a07350c71c369ae971ea429f4b718ba18accf03b
1d565603b68dae93681b2adf4e60d0a7ab069eb3f9f83571dacea778b756424e
538fc3ae59ae309142f4846b445ba005d83dfcfb5d1f8466b31e1b25ff55506d
ad3e89012b3c9f9e6d38b723a84f1f24846ffae6d52d9e876567be727610604c
[vagrant@instance-1 sysbox]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ad3e89012b3c ghcr.io/nestybox/ubuntu-focal-systemd-docker "/sbin/init --log-le…" 4 seconds ago Up 2 seconds 22/tcp sleepy_keldysh
538fc3ae59ae ghcr.io/nestybox/ubuntu-focal-systemd-docker "/sbin/init --log-le…" 6 seconds ago Up 4 seconds 22/tcp ecstatic_greider
1d565603b68d ghcr.io/nestybox/ubuntu-focal-systemd-docker "/sbin/init --log-le…" 8 seconds ago Up 6 seconds 22/tcp nifty_fermi
bfbaefaca8c6 ghcr.io/nestybox/ubuntu-focal-systemd-docker "/sbin/init --log-le…" 10 seconds ago Up 8 seconds 22/tcp quizzical_bartik
b874a9870c16 ghcr.io/nestybox/ubuntu-focal-systemd-docker "/sbin/init --log-le…" 13 seconds ago Up 9 seconds 22/tcp boring_hawking
244cb65d8bc3 ghcr.io/nestybox/ubuntu-focal-systemd-docker "/sbin/init --log-le…" 16 seconds ago Up 13 seconds 22/tcp thirsty_driscoll
59f99b09f517 ghcr.io/nestybox/ubuntu-focal-systemd-docker "/sbin/init --log-le…" 18 seconds ago Up 15 seconds 22/tcp inspiring_brahmagupta
a53e04092089 ghcr.io/nestybox/ubuntu-focal-systemd-docker "/sbin/init --log-le…" 19 seconds ago Up 17 seconds 22/tcp peaceful_burnell
50a8b01326c0 ghcr.io/nestybox/ubuntu-focal-systemd-docker "/sbin/init --log-le…" 21 seconds ago Up 19 seconds 22/tcp fervent_austin
50eef94fac68 ghcr.io/nestybox/ubuntu-focal-systemd-docker "/sbin/init --log-le…" 22 seconds ago Up 20 seconds 22/tcp charming_khorana
@ctalledo I had the chance to test it in a separate environment (Fedora 35 installed in a QEMU VM). make test-shell was successful, and I installed Sysbox inside the VM following your steps (1-4). The installation uses XFS as the filesystem and everything (Docker, Sysbox) uses default parameters (e.g. data-root for Docker and Sysbox).
[sysbox@fedora sysbox]$ cat /etc/os-release | egrep "^NAME|^VERSION"
NAME="Fedora Linux"
VERSION="35 (Server Edition)"
VERSION_ID=35
Creating a Sysbox container with docker run --runtime=sysbox-runc -it --rm nestybox/ubuntu-focal-systemd-docker was successful, without errors, and repeatable.
But there is one important difference from my original host system: there, SELinux is enabled in the Docker daemon. If I enable it in the VM, I get the same or similar errors, and in the end system containers can't be created.
[sysbox@fedora sysbox]$ docker run --runtime=sysbox-runc -it --rm nestybox/ubuntu-focal-systemd-docker
docker: Error response from daemon: failed to create shim: OCI runtime create failed: error in the container spec: failed to request rootfs cloning from sysbox-mgr: failed to invoke ReqCloneRootfs via grpc: rpc error: code = Unknown desc = failed to mount clone for container eb2df2642d49: failed to set up bottom ovfs mount: failed to mount overlayfs on /var/lib/sysbox/rootfs/eb2df2642d49f1f2f937fc53f381e77a1ed241d42746dab5d76c84e4810e75ea/bottom/merged: invalid argument: unknown.
Or a different error:
[sysbox@fedora sysbox]$ docker run --runtime=sysbox-runc -it --rm nestybox/ubuntu-focal-systemd-docker
Welcome to Ubuntu 20.04.2 LTS!
Failed to create /init.scope control group: Permission denied
Failed to allocate manager object: Permission denied
[!!!!!!] Failed to allocate manager object.
Exiting PID 1...
Hi @fhaefemeier,
Thanks for the update.
But there is one important difference to my original host system. Selinux is enabled in docker daemon. If I enable it in the VM I have the same/similar error scenarios. At the end system containers can't be created.
How exactly did you enable SELinux in the VM? I want to see if I can repro on my Fedora 35 host.
Thanks.
If you have done a standard Fedora installation, SELinux is running in enforcing mode (the default). You can check it in /etc/selinux/config. Additionally, the Docker daemon is configured with:
{
  "log-driver": "journald",
  "selinux-enabled": true,
  "runtimes": {
    "sysbox-runc": {
      "path": "/usr/local/bin/sysbox-runc"
    }
  }
}
You can check the SELinux labels with ls -lZ. There are different labels. The main labels (known and used by me) are:
container_file_t
container_var_lib_t
But there are certainly more. An older post on Stack Overflow gives some hints.
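To reproduce this setup, both settings described above can be checked from a script. This is a minimal sketch; the function names selinux_config_mode and docker_selinux_enabled are hypothetical, and the daemon.json check is a plain text match rather than a JSON parser:

```shell
#!/bin/sh
# Hypothetical helper: reads the contents of /etc/selinux/config on stdin and
# prints the value of the SELINUX= setting (e.g. "enforcing").
selinux_config_mode() {
    sed -n 's/^SELINUX=\([a-z]*\).*$/\1/p'
}

# Hypothetical helper: reads a Docker daemon.json on stdin and succeeds if it
# contains "selinux-enabled": true (simple text match, not a JSON parser).
docker_selinux_enabled() {
    grep -Eq '"selinux-enabled"[[:space:]]*:[[:space:]]*true'
}

# Example with inline sample input; prints "enforcing".
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' | selinux_config_mode
```

On a real host these would read the actual files, e.g. selinux_config_mode < /etc/selinux/config and docker_selinux_enabled < /etc/docker/daemon.json; the getenforce command reports the mode currently in effect.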
Any news?
Hi @fhaefemeier, my apologies, did not get a chance to check the SELinux part yet; will try to get to it next week.
I found Sysbox and was happy, because it supports my use case of setting up a CI environment without the limitations and security implications of DinD or a similar setup. Thanks for this project and your commitment.
I started reading the documentation and tried to set up Sysbox on my host (Fedora based). I found a few minor issues during the source build (separate issues will follow). Now I have started to install and configure it.
But I have reached the point of configuring Docker and the system, and this has stopped me, because of the change to enable Docker's userns-remap. It has a major impact on my running host system and its installed containers. I will try to install shiftfs from source, but I would like to ask when "... soon ..." will be reached.
In the documentation I found the sentence "In the near future (kernels 5.12+), shiftfs is expected to be replaced ... Sysbox will soon have support for this." Are there any plans (a time schedule) to enable this feature? It would (hopefully) help very much to reduce the changes on the host system.