containers / toolbox

Tool for interactive command line environments on Linux
https://containertoolbx.org/
Apache License 2.0

Toolbox fails to start with disabled Avahi daemon socket #1590

Open mihalyr opened 2 days ago

mihalyr commented 2 days ago

Describe the bug

I've recently disabled the avahi-daemon.service and avahi-daemon.socket units because I don't need Avahi (I only need to resolve mDNS, and systemd-resolved already does that). After this, I can't start my Fedora Toolbox container anymore.

Running toolbox enter does not reveal what the problem is:

**toolbox enter debug logs**

```
➜ ~ toolbox list
IMAGE ID      IMAGE NAME                                    CREATED
af6aadf6f749  registry.fedoraproject.org/fedora-toolbox:41  8 days ago

CONTAINER ID  CONTAINER NAME     CREATED     STATUS  IMAGE NAME
32544e8dca96  fedora-toolbox-41  7 days ago  exited  registry.fedoraproject.org/fedora-toolbox:41

➜ ~ toolbox --log-level debug --log-podman enter fedora-toolbox-41
DEBU Running as real user ID 3000
DEBU Resolved absolute path to the executable as /usr/bin/toolbox
DEBU Running on a cgroups v2 host
DEBU Looking up sub-GID and sub-UID ranges for user fedorauser
DEBU TOOLBX_DELAY_ENTRY_POINT is
DEBU TOOLBX_FAIL_ENTRY_POINT is
DEBU TOOLBOX_PATH is /usr/bin/toolbox
DEBU Migrating to newer Podman
DEBU Toolbx config directory is /var/home/fedorauser/.config/toolbox
INFO[0000] podman filtering at log level debug
DEBU[0000] Called version.PersistentPreRunE(podman --log-level debug version --format json)
DEBU[0000] Using conmon: "/usr/bin/conmon"
INFO[0000] Using boltdb as database backend
DEBU[0000] Initializing boltdb state at /var/home/fedorauser/.local/share/containers/storage/libpod/bolt_state.db
DEBU[0000] Using graph driver overlay
DEBU[0000] Using graph root /var/home/fedorauser/.local/share/containers/storage
DEBU[0000] Using run root /run/user/3000/containers
DEBU[0000] Using static dir /var/home/fedorauser/.local/share/containers/storage/libpod
DEBU[0000] Using tmp dir /run/user/3000/libpod/tmp
DEBU[0000] Using volume path /var/home/fedorauser/.local/share/containers/storage/volumes
DEBU[0000] Using transient store: false
DEBU[0000] Not configuring container store
DEBU[0000] Initializing event backend journald
DEBU[0000] Configured OCI runtime ocijail initialization failed: no valid executable found for OCI runtime ocijail: invalid argument
DEBU[0000] Configured OCI runtime crun-vm initialization failed: no valid executable found for OCI runtime crun-vm: invalid argument
DEBU[0000] Configured OCI runtime crun-wasm initialization failed: no valid executable found for OCI runtime crun-wasm: invalid argument
DEBU[0000] Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument
DEBU[0000] Configured OCI runtime youki initialization failed: no valid executable found for OCI runtime youki: invalid argument
DEBU[0000] Configured OCI runtime krun initialization failed: no valid executable found for OCI runtime krun: invalid argument
DEBU[0000] Configured OCI runtime runj initialization failed: no valid executable found for OCI runtime runj: invalid argument
DEBU[0000] Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument
DEBU[0000] Using OCI runtime "/usr/bin/crun"
INFO[0000] Setting parallel job count to 37
DEBU[0000] Called version.PersistentPostRunE(podman --log-level debug version --format json)
DEBU[0000] Shutting down engines
DEBU Current Podman version is 5.2.5
DEBU Creating runtime directory /run/user/3000/toolbox
DEBU Old Podman version is 5.2.5
DEBU Migration not needed: Podman version 5.2.5 is unchanged
DEBU Setting up configuration
DEBU Setting up configuration: file /var/home/fedorauser/.config/containers/toolbox.conf not found
DEBU Resolving container and image names
DEBU Container: ''
DEBU Distribution (CLI): ''
DEBU Image (CLI): ''
DEBU Release (CLI): ''
DEBU Resolved container and image names
DEBU Container: 'fedora-toolbox-41'
DEBU Image: 'fedora-toolbox:41'
DEBU Release: '41'
DEBU Resolving container and image names
DEBU Container: 'fedora-toolbox-41'
DEBU Distribution (CLI): ''
DEBU Image (CLI): ''
DEBU Release (CLI): ''
DEBU Resolved container and image names
DEBU Container: 'fedora-toolbox-41'
DEBU Image: 'fedora-toolbox:41'
DEBU Release: '41'
DEBU Checking if container fedora-toolbox-41 exists
INFO[0000] podman filtering at log level debug
DEBU[0000] Called exists.PersistentPreRunE(podman --log-level debug container exists fedora-toolbox-41)
DEBU[0000] Using conmon: "/usr/bin/conmon"
INFO[0000] Using boltdb as database backend
DEBU[0000] Initializing boltdb state at /var/home/fedorauser/.local/share/containers/storage/libpod/bolt_state.db
DEBU[0000] Using graph driver overlay
DEBU[0000] Using graph root /var/home/fedorauser/.local/share/containers/storage
DEBU[0000] Using run root /run/user/3000/containers
DEBU[0000] Using static dir /var/home/fedorauser/.local/share/containers/storage/libpod
DEBU[0000] Using tmp dir /run/user/3000/libpod/tmp
DEBU[0000] Using volume path /var/home/fedorauser/.local/share/containers/storage/volumes
DEBU[0000] Using transient store: false
DEBU[0000] [graphdriver] trying provided driver "overlay"
DEBU[0000] Cached value indicated that overlay is supported
DEBU[0000] Cached value indicated that overlay is supported
DEBU[0000] Cached value indicated that metacopy is not being used
DEBU[0000] Cached value indicated that native-diff is usable
DEBU[0000] backingFs=btrfs, projectQuotaSupported=false, useNativeDiff=true, usingMetacopy=false
DEBU[0000] Initializing event backend journald
DEBU[0000] Configured OCI runtime crun-wasm initialization failed: no valid executable found for OCI runtime crun-wasm: invalid argument
DEBU[0000] Configured OCI runtime runj initialization failed: no valid executable found for OCI runtime runj: invalid argument
DEBU[0000] Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument
DEBU[0000] Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument
DEBU[0000] Configured OCI runtime krun initialization failed: no valid executable found for OCI runtime krun: invalid argument
DEBU[0000] Configured OCI runtime ocijail initialization failed: no valid executable found for OCI runtime ocijail: invalid argument
DEBU[0000] Configured OCI runtime crun-vm initialization failed: no valid executable found for OCI runtime crun-vm: invalid argument
DEBU[0000] Configured OCI runtime youki initialization failed: no valid executable found for OCI runtime youki: invalid argument
DEBU[0000] Using OCI runtime "/usr/bin/crun"
INFO[0000] Setting parallel job count to 37
DEBU[0000] Called exists.PersistentPostRunE(podman --log-level debug container exists fedora-toolbox-41)
DEBU[0000] Shutting down engines
INFO[0000] Received shutdown.Stop(), terminating! PID=18715
DEBU Inspecting container fedora-toolbox-41
INFO[0000] podman filtering at log level debug
DEBU[0000] Called inspect.PersistentPreRunE(podman --log-level debug inspect --format json --type container fedora-toolbox-41)
DEBU[0000] Using conmon: "/usr/bin/conmon"
INFO[0000] Using boltdb as database backend
DEBU[0000] Initializing boltdb state at /var/home/fedorauser/.local/share/containers/storage/libpod/bolt_state.db
DEBU[0000] Using graph driver overlay
DEBU[0000] Using graph root /var/home/fedorauser/.local/share/containers/storage
DEBU[0000] Using run root /run/user/3000/containers
DEBU[0000] Using static dir /var/home/fedorauser/.local/share/containers/storage/libpod
DEBU[0000] Using tmp dir /run/user/3000/libpod/tmp
DEBU[0000] Using volume path /var/home/fedorauser/.local/share/containers/storage/volumes
DEBU[0000] Using transient store: false
DEBU[0000] [graphdriver] trying provided driver "overlay"
DEBU[0000] Cached value indicated that overlay is supported
DEBU[0000] Cached value indicated that overlay is supported
DEBU[0000] Cached value indicated that metacopy is not being used
DEBU[0000] Cached value indicated that native-diff is usable
DEBU[0000] backingFs=btrfs, projectQuotaSupported=false, useNativeDiff=true, usingMetacopy=false
DEBU[0000] Initializing event backend journald
DEBU[0000] Configured OCI runtime crun-wasm initialization failed: no valid executable found for OCI runtime crun-wasm: invalid argument
DEBU[0000] Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument
DEBU[0000] Configured OCI runtime youki initialization failed: no valid executable found for OCI runtime youki: invalid argument
DEBU[0000] Configured OCI runtime krun initialization failed: no valid executable found for OCI runtime krun: invalid argument
DEBU[0000] Configured OCI runtime ocijail initialization failed: no valid executable found for OCI runtime ocijail: invalid argument
DEBU[0000] Configured OCI runtime crun-vm initialization failed: no valid executable found for OCI runtime crun-vm: invalid argument
DEBU[0000] Configured OCI runtime runj initialization failed: no valid executable found for OCI runtime runj: invalid argument
DEBU[0000] Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument
DEBU[0000] Using OCI runtime "/usr/bin/crun"
INFO[0000] Setting parallel job count to 37
DEBU[0000] Looking up image "af6aadf6f749a20c35d4bb049c7a01483bf12013e9a422268ac659437f6d1d34" in local containers storage
DEBU[0000] Trying "af6aadf6f749a20c35d4bb049c7a01483bf12013e9a422268ac659437f6d1d34" ...
DEBU[0000] parsed reference into "[overlay@/var/home/fedorauser/.local/share/containers/storage+/run/user/3000/containers]@af6aadf6f749a20c35d4bb049c7a01483bf12013e9a422268ac659437f6d1d34"
DEBU[0000] Found image "af6aadf6f749a20c35d4bb049c7a01483bf12013e9a422268ac659437f6d1d34" as "af6aadf6f749a20c35d4bb049c7a01483bf12013e9a422268ac659437f6d1d34" in local containers storage
DEBU[0000] Found image "af6aadf6f749a20c35d4bb049c7a01483bf12013e9a422268ac659437f6d1d34" as "af6aadf6f749a20c35d4bb049c7a01483bf12013e9a422268ac659437f6d1d34" in local containers storage ([overlay@/var/home/fedorauser/.local/share/containers/storage+/run/user/3000/containers]@af6aadf6f749a20c35d4bb049c7a01483bf12013e9a422268ac659437f6d1d34)
DEBU[0000] Called inspect.PersistentPostRunE(podman --log-level debug inspect --format json --type container fedora-toolbox-41)
DEBU[0000] Shutting down engines
INFO[0000] Received shutdown.Stop(), terminating! PID=18732
DEBU Entry point of container fedora-toolbox-41 is toolbox (PID=0)
DEBU Inspecting mounts of container fedora-toolbox-41
DEBU Generating Container Device Interface for NVIDIA
DEBU Generating Container Device Interface for NVIDIA: Management Library not found: could not load NVML library: libnvidia-ml.so.1: cannot open shared object file: No such file or directory
DEBU Generating Container Device Interface for NVIDIA: not a Tegra system: /sys/devices/soc0/family file not found
DEBU Generating Container Device Interface for NVIDIA: skipping
DEBU Starting container fedora-toolbox-41
Error: failed to start container fedora-toolbox-41
```

However, when I try to start the container with Podman directly, it reveals the problem:

**Podman start debug logs**

```
➜ ~ podman start fedora-toolbox-41
Error: unable to start container "32544e8dca960648cd74604e07c2b7642028dfd4742b2913565559ad83505d46": crun: cannot stat `/run/avahi-daemon/socket`: No such file or directory: OCI runtime attempted to invoke a command that was not found

➜ ~ podman --log-level debug start fedora-toolbox-41
INFO[0000] podman filtering at log level debug
DEBU[0000] Called start.PersistentPreRunE(podman --log-level debug start fedora-toolbox-41)
DEBU[0000] Using conmon: "/usr/bin/conmon"
INFO[0000] Using boltdb as database backend
DEBU[0000] Initializing boltdb state at /var/home/fedorauser/.local/share/containers/storage/libpod/bolt_state.db
DEBU[0000] Using graph driver overlay
DEBU[0000] Using graph root /var/home/fedorauser/.local/share/containers/storage
DEBU[0000] Using run root /run/user/3000/containers
DEBU[0000] Using static dir /var/home/fedorauser/.local/share/containers/storage/libpod
DEBU[0000] Using tmp dir /run/user/3000/libpod/tmp
DEBU[0000] Using volume path /var/home/fedorauser/.local/share/containers/storage/volumes
DEBU[0000] Using transient store: false
DEBU[0000] [graphdriver] trying provided driver "overlay"
DEBU[0000] Cached value indicated that overlay is supported
DEBU[0000] Cached value indicated that overlay is supported
DEBU[0000] Cached value indicated that metacopy is not being used
DEBU[0000] Cached value indicated that native-diff is usable
DEBU[0000] backingFs=btrfs, projectQuotaSupported=false, useNativeDiff=true, usingMetacopy=false
DEBU[0000] Initializing event backend journald
DEBU[0000] Configured OCI runtime youki initialization failed: no valid executable found for OCI runtime youki: invalid argument
DEBU[0000] Configured OCI runtime krun initialization failed: no valid executable found for OCI runtime krun: invalid argument
DEBU[0000] Configured OCI runtime crun-vm initialization failed: no valid executable found for OCI runtime crun-vm: invalid argument
DEBU[0000] Configured OCI runtime crun-wasm initialization failed: no valid executable found for OCI runtime crun-wasm: invalid argument
DEBU[0000] Configured OCI runtime runj initialization failed: no valid executable found for OCI runtime runj: invalid argument
DEBU[0000] Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument
DEBU[0000] Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument
DEBU[0000] Configured OCI runtime ocijail initialization failed: no valid executable found for OCI runtime ocijail: invalid argument
DEBU[0000] Using OCI runtime "/usr/bin/crun"
INFO[0000] Setting parallel job count to 37
DEBU[0000] Cached value indicated that idmapped mounts for overlay are not supported
DEBU[0000] Check for idmapped mounts support
DEBU[0000] overlay: mount_data=lowerdir=/var/home/fedorauser/.local/share/containers/storage/overlay/l/QEP6CVOSL3VBSNXWS6CWPLWGTS:/var/home/fedorauser/.local/share/containers/storage/overlay/l/QEP6CVOSL3VBSNXWS6CWPLWGTS/../diff1:/var/home/fedorauser/.local/share/containers/storage/overlay/l/LYC56CBF5DI55P3ECT4TJWQGO2,upperdir=/var/home/fedorauser/.local/share/containers/storage/overlay/fecdf439432ffb7d029f7358d1cc1150fe2b07db29c08106026f05d3e975a39b/diff,workdir=/var/home/fedorauser/.local/share/containers/storage/overlay/fecdf439432ffb7d029f7358d1cc1150fe2b07db29c08106026f05d3e975a39b/work,userxattr,context="system_u:object_r:container_file_t:s0:c1022,c1023"
DEBU[0000] Mounted container "32544e8dca960648cd74604e07c2b7642028dfd4742b2913565559ad83505d46" at "/var/home/fedorauser/.local/share/containers/storage/overlay/fecdf439432ffb7d029f7358d1cc1150fe2b07db29c08106026f05d3e975a39b/merged"
DEBU[0000] Created root filesystem for container 32544e8dca960648cd74604e07c2b7642028dfd4742b2913565559ad83505d46 at /var/home/fedorauser/.local/share/containers/storage/overlay/fecdf439432ffb7d029f7358d1cc1150fe2b07db29c08106026f05d3e975a39b/merged
DEBU[0000] Not modifying container 32544e8dca960648cd74604e07c2b7642028dfd4742b2913565559ad83505d46 /etc/passwd
DEBU[0000] Not modifying container 32544e8dca960648cd74604e07c2b7642028dfd4742b2913565559ad83505d46 /etc/group
DEBU[0000] /etc/system-fips does not exist on host, not mounting FIPS mode subscription
DEBU[0000] Setting Cgroups for container 32544e8dca960648cd74604e07c2b7642028dfd4742b2913565559ad83505d46 to user.slice:libpod:32544e8dca960648cd74604e07c2b7642028dfd4742b2913565559ad83505d46
DEBU[0000] Set root propagation to "rslave"
DEBU[0000] reading hooks from /usr/share/containers/oci/hooks.d
DEBU[0000] Workdir "/" resolved to host path "/var/home/fedorauser/.local/share/containers/storage/overlay/fecdf439432ffb7d029f7358d1cc1150fe2b07db29c08106026f05d3e975a39b/merged"
DEBU[0000] Created OCI spec for container 32544e8dca960648cd74604e07c2b7642028dfd4742b2913565559ad83505d46 at /var/home/fedorauser/.local/share/containers/storage/overlay-containers/32544e8dca960648cd74604e07c2b7642028dfd4742b2913565559ad83505d46/userdata/config.json
DEBU[0000] /usr/bin/conmon messages will be logged to syslog
DEBU[0000] running conmon: /usr/bin/conmon args="[--api-version 1 -c 32544e8dca960648cd74604e07c2b7642028dfd4742b2913565559ad83505d46 -u 32544e8dca960648cd74604e07c2b7642028dfd4742b2913565559ad83505d46 -r /usr/bin/crun -b /var/home/fedorauser/.local/share/containers/storage/overlay-containers/32544e8dca960648cd74604e07c2b7642028dfd4742b2913565559ad83505d46/userdata -p /run/user/3000/containers/overlay-containers/32544e8dca960648cd74604e07c2b7642028dfd4742b2913565559ad83505d46/userdata/pidfile -n fedora-toolbox-41 --exit-dir /run/user/3000/libpod/tmp/exits --persist-dir /run/user/3000/libpod/tmp/persist/32544e8dca960648cd74604e07c2b7642028dfd4742b2913565559ad83505d46 --full-attach -s -l journald --log-level debug --syslog --conmon-pidfile /run/user/3000/containers/overlay-containers/32544e8dca960648cd74604e07c2b7642028dfd4742b2913565559ad83505d46/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /var/home/fedorauser/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/3000/containers --exit-command-arg --log-level --exit-command-arg debug --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /run/user/3000/libpod/tmp --exit-command-arg --network-config-dir --exit-command-arg --exit-command-arg --network-backend --exit-command-arg netavark --exit-command-arg --volumepath --exit-command-arg /var/home/fedorauser/.local/share/containers/storage/volumes --exit-command-arg --db-backend --exit-command-arg boltdb --exit-command-arg --transient-store=false --exit-command-arg --runtime --exit-command-arg crun --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg --syslog --exit-command-arg container --exit-command-arg cleanup --exit-command-arg 32544e8dca960648cd74604e07c2b7642028dfd4742b2913565559ad83505d46]"
[conmon:d]: failed to write to /proc/self/oom_score_adj: Permission denied
DEBU[0000] Received: -1
DEBU[0000] Cleaning up container 32544e8dca960648cd74604e07c2b7642028dfd4742b2913565559ad83505d46
DEBU[0000] Network is already cleaned up, skipping...
DEBU[0000] Unmounted container "32544e8dca960648cd74604e07c2b7642028dfd4742b2913565559ad83505d46"
Error: unable to start container "32544e8dca960648cd74604e07c2b7642028dfd4742b2913565559ad83505d46": crun: cannot stat `/run/avahi-daemon/socket`: No such file or directory: OCI runtime attempted to invoke a command that was not found
DEBU[0000] Shutting down engines
INFO[0000] Received shutdown.Stop(), terminating! PID=21221
```

Steps how to reproduce the behaviour

Probably the following:

  1. Install, enable & start Avahi daemon/socket
  2. Create a new toolbox for Fedora 41
  3. Stop & disable Avahi daemon/socket (you might need to restart or just manually remove /run/avahi-daemon/socket)
  4. Try to start the same toolbox created before

Expected behaviour

The toolbox should start and work even with Avahi disabled.

Actual behaviour

The toolbox fails to start.

Screenshots

N/A

Output of toolbox --version (v0.0.90+)

toolbox version 0.1.0

Toolbx package info (rpm -q toolbox)

toolbox-0.1.0-1.fc41.x86_64

Output of podman version

Client:       Podman Engine
Version:      5.2.5
API Version:  5.2.5
Go Version:   go1.23.2
Built:        Fri Oct 18 02:00:00 2024
OS/Arch:      linux/amd64

Podman package info (rpm -q podman)

podman-5.2.5-1.fc41.x86_64

Info about your OS

Fedora 41 Sericea

Additional context

Re-enabling Avahi daemon/socket fixes the problem - I just don't want to use Avahi...

```
➜  ~ sudo systemctl enable --now avahi-daemon.service avahi-daemon.socket
Created symlink '/etc/systemd/system/dbus-org.freedesktop.Avahi.service' → '/usr/lib/systemd/system/avahi-daemon.service'.
Created symlink '/etc/systemd/system/multi-user.target.wants/avahi-daemon.service' → '/usr/lib/systemd/system/avahi-daemon.service'.
Created symlink '/etc/systemd/system/sockets.target.wants/avahi-daemon.socket' → '/usr/lib/systemd/system/avahi-daemon.socket'.
➜  ~ tbe
(fedora-toolbox-41) ➜  ~
```
mihalyr commented 2 days ago

Here is a full reproducer:

First get it running:

```
podman kill fedora-toolbox-41
pkill conmon
sudo systemctl enable --now avahi-daemon.{service,socket}
toolbox enter fedora-toolbox-41
```

Then break it:

```
sudo systemctl disable --now avahi-daemon.{service,socket}
sudo kill -9 $(cat /run/avahi-daemon/pid)
sudo rm -rf /run/avahi-daemon
pkill conmon
toolbox enter fedora-toolbox-41
```

This gives the error:

```
Error: failed to start container fedora-toolbox-41
```

Then fix it again:

```
podman kill fedora-toolbox-41
pkill conmon
sudo systemctl enable --now avahi-daemon.{service,socket}
toolbox enter fedora-toolbox-41
```

And break it again:

```
sudo systemctl disable --now avahi-daemon.{service,socket}
sudo kill -9 $(cat /run/avahi-daemon/pid)
sudo rm -rf /run/avahi-daemon
pkill conmon
toolbox enter fedora-toolbox-41
```

I think this is just an issue with some leftover state.

mihalyr commented 2 days ago

I just wanted to add that creating a new container with Avahi disabled works, but it's a pain to always have to reinstall everything, and I only just did that when upgrading to F41.

So after breaking it with

```
sudo systemctl disable --now avahi-daemon.{service,socket}
sudo kill -9 $(cat /run/avahi-daemon/pid)
sudo rm -rf /run/avahi-daemon
pkill conmon
```

I can continue with:

```
toolbox create fedora-toolbox-41-2
toolbox enter fedora-toolbox-41-2
```

And the new container works with Avahi disabled. So the issue title is a bit misleading: toolbox does start and work with Avahi disabled, but a container that was created while Avahi was enabled fails to start once Avahi is disabled.

Is there any workaround to somehow remove the knowledge about Avahi from the existing container without reinstalling everything into a new container?

mihalyr commented 2 days ago

I see that it is a bind mount that the container added for the Avahi daemon socket. I'm not sure there is a way to modify an existing container's mount points, or to make the bind mount optional so it only applies when the path exists. This seems to be a limitation of the underlying container runtime rather than of toolbox.

I might need to recreate the container anyway. Feel free to close this issue; this might actually be working as expected, and I just got confused about how these containers work and which services they mount automatically.

```json
{
    "Type": "bind",
    "Source": "/run/avahi-daemon/socket",
    "Destination": "/run/avahi-daemon/socket",
    "Driver": "",
    "Mode": "",
    "Options": [
        "nosuid",
        "nodev",
        "rbind"
    ],
    "RW": true,
    "Propagation": "rprivate"
},
```