containers / storage

Container Storage Library

Orphaned fuse-overlayfs processes (mount_program) #1654

Open phemmer opened 1 year ago

phemmer commented 1 year ago

I was doing essentially what this guide explains, setting the following in /etc/containers/storage.conf:

[storage.options.overlay]
mount_program = "/usr/bin/fuse-overlayfs"

However this results in a fuse-overlayfs process spawning, and then never being terminated, even after the container using it has been removed.

For example:

# pgrep -u gitlab-runner fuse-overlayfs|wc -l
898

# for i in {1..10}; do sudo -u gitlab-runner podman run --rm -ti busybox echo; done

# pgrep -u gitlab-runner fuse-overlayfs|wc -l
908

# podman version
Client:       Podman Engine
Version:      4.5.0
API Version:  4.5.0
Go Version:   go1.19.8
Built:        Thu Jan  1 00:00:00 1970
OS/Arch:      linux/amd64

Running on debian bullseye with the packages built from https://gitlab.com/rhcontainerbot/rpms-openqa using debbuild
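
The before/after count in the reproduction above can be wrapped in a small helper. This is only a sketch: `count_fuse_procs` and `leak_delta` are hypothetical names, not part of podman or containers/storage, and it assumes `pgrep` from procps is installed.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of podman or containers/storage.

# Count fuse-overlayfs processes owned by a user (name or numeric UID).
count_fuse_procs() {
    # pgrep prints one PID per line; no match simply yields zero lines.
    pgrep -u "$1" -x fuse-overlayfs 2>/dev/null | wc -l | tr -d ' '
}

# Run a command and print how many fuse-overlayfs processes it left behind.
leak_delta() {
    user=$1
    shift
    before=$(count_fuse_procs "$user")
    "$@" >/dev/null 2>&1
    after=$(count_fuse_procs "$user")
    echo $((after - before))
}
```

It could be invoked as `leak_delta gitlab-runner sudo -u gitlab-runner podman run --rm busybox echo`. A non-zero result would suggest that the run left a fuse-overlayfs process behind, although other activity by the same user during the run can skew the number.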

# sudo -u gitlab-runner podman run --log-level=debug --rm -ti busybox echo
INFO[0000] podman filtering at log level debug          
DEBU[0000] Called run.PersistentPreRunE(podman run --log-level=debug --rm -ti busybox echo) 
DEBU[0000] Using conmon: "/usr/bin/conmon"              
DEBU[0000] Initializing boltdb state at /home/gitlab-runner/.local/share/containers/storage/libpod/bolt_state.db 
DEBU[0000] Overriding run root "/run/user/997/containers" with "/tmp/containers-user-997/containers" from database 
DEBU[0000] Overriding tmp dir "/run/user/997/libpod/tmp" with "/tmp/run-997/libpod/tmp" from database 
DEBU[0000] Using graph driver overlay                   
DEBU[0000] Using graph root /home/gitlab-runner/.local/share/containers/storage 
DEBU[0000] Using run root /tmp/containers-user-997/containers 
DEBU[0000] Using static dir /home/gitlab-runner/.local/share/containers/storage/libpod 
DEBU[0000] Using tmp dir /tmp/run-997/libpod/tmp        
DEBU[0000] Using volume path /home/gitlab-runner/.local/share/containers/storage/volumes 
DEBU[0000] Using transient store: false                 
DEBU[0000] Not configuring container store              
DEBU[0000] Initializing event backend file              
DEBU[0000] Configured OCI runtime crun-wasm initialization failed: no valid executable found for OCI runtime crun-wasm: invalid argument 
DEBU[0000] Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument 
DEBU[0000] Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument 
DEBU[0000] Configured OCI runtime youki initialization failed: no valid executable found for OCI runtime youki: invalid argument 
DEBU[0000] Configured OCI runtime ocijail initialization failed: no valid executable found for OCI runtime ocijail: invalid argument 
DEBU[0000] Configured OCI runtime runj initialization failed: no valid executable found for OCI runtime runj: invalid argument 
DEBU[0000] Configured OCI runtime krun initialization failed: no valid executable found for OCI runtime krun: invalid argument 
DEBU[0000] Using OCI runtime "/usr/bin/crun"            
INFO[0000] Setting parallel job count to 145            
INFO[0000] podman filtering at log level debug          
DEBU[0000] Called run.PersistentPreRunE(podman run --log-level=debug --rm -ti busybox echo) 
DEBU[0000] Using conmon: "/usr/bin/conmon"              
DEBU[0000] Initializing boltdb state at /home/gitlab-runner/.local/share/containers/storage/libpod/bolt_state.db 
DEBU[0000] Overriding run root "/run/user/997/containers" with "/tmp/containers-user-997/containers" from database 
DEBU[0000] Overriding tmp dir "/run/user/997/libpod/tmp" with "/tmp/run-997/libpod/tmp" from database 
DEBU[0000] Using graph driver overlay                   
DEBU[0000] Using graph root /home/gitlab-runner/.local/share/containers/storage 
DEBU[0000] Using run root /tmp/containers-user-997/containers 
DEBU[0000] Using static dir /home/gitlab-runner/.local/share/containers/storage/libpod 
DEBU[0000] Using tmp dir /tmp/run-997/libpod/tmp        
DEBU[0000] Using volume path /home/gitlab-runner/.local/share/containers/storage/volumes 
DEBU[0000] Using transient store: false                 
DEBU[0000] [graphdriver] trying provided driver "overlay" 
DEBU[0000] overlay: imagestore=/var/lib/containers-shared 
DEBU[0000] overlay: mount_program=/usr/bin/fuse-overlayfs 
DEBU[0000] backingFs=extfs, projectQuotaSupported=false, useNativeDiff=false, usingMetacopy=false 
DEBU[0000] Initializing event backend journald          
DEBU[0000] Configured OCI runtime youki initialization failed: no valid executable found for OCI runtime youki: invalid argument 
DEBU[0000] Configured OCI runtime ocijail initialization failed: no valid executable found for OCI runtime ocijail: invalid argument 
DEBU[0000] Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument 
DEBU[0000] Configured OCI runtime runj initialization failed: no valid executable found for OCI runtime runj: invalid argument 
DEBU[0000] Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument 
DEBU[0000] Configured OCI runtime krun initialization failed: no valid executable found for OCI runtime krun: invalid argument 
DEBU[0000] Configured OCI runtime crun-wasm initialization failed: no valid executable found for OCI runtime crun-wasm: invalid argument 
DEBU[0000] Using OCI runtime "/usr/bin/crun"            
INFO[0000] Setting parallel job count to 145            
DEBU[0000] Successfully loaded 1 networks               
DEBU[0000] Pulling image busybox (policy: missing)      
DEBU[0000] Looking up image "busybox" in local containers storage 
DEBU[0000] Normalized platform linux/amd64 to {amd64 linux  [] } 
DEBU[0000] Loading registries configuration "/etc/containers/registries.conf" 
DEBU[0000] Loading registries configuration "/etc/containers/registries.conf.d/000-shortnames.conf" 
DEBU[0000] Trying "docker.io/library/busybox:latest" ... 
DEBU[0000] parsed reference into "[overlay@/home/gitlab-runner/.local/share/containers/storage+/tmp/containers-user-997/containers:overlay.imagestore=/var/lib/containers-shared,overlay.mount_program=/usr/bin/fuse-overlayfs]@7cfbbec8963d8f13e6c70416d6592e1cc10f47a348131290a55d43c3acab3fb9" 
DEBU[0000] Found image "busybox" as "docker.io/library/busybox:latest" in local containers storage 
DEBU[0000] Found image "busybox" as "docker.io/library/busybox:latest" in local containers storage ([overlay@/home/gitlab-runner/.local/share/containers/storage+/tmp/containers-user-997/containers:overlay.imagestore=/var/lib/containers-shared,overlay.mount_program=/usr/bin/fuse-overlayfs]@7cfbbec8963d8f13e6c70416d6592e1cc10f47a348131290a55d43c3acab3fb9) 
DEBU[0000] exporting opaque data as blob "sha256:7cfbbec8963d8f13e6c70416d6592e1cc10f47a348131290a55d43c3acab3fb9" 
DEBU[0000] Looking up image "docker.io/library/busybox:latest" in local containers storage 
DEBU[0000] Normalized platform linux/amd64 to {amd64 linux  [] } 
DEBU[0000] Trying "docker.io/library/busybox:latest" ... 
DEBU[0000] parsed reference into "[overlay@/home/gitlab-runner/.local/share/containers/storage+/tmp/containers-user-997/containers:overlay.imagestore=/var/lib/containers-shared,overlay.mount_program=/usr/bin/fuse-overlayfs]@7cfbbec8963d8f13e6c70416d6592e1cc10f47a348131290a55d43c3acab3fb9" 
DEBU[0000] Found image "docker.io/library/busybox:latest" as "docker.io/library/busybox:latest" in local containers storage 
DEBU[0000] Found image "docker.io/library/busybox:latest" as "docker.io/library/busybox:latest" in local containers storage ([overlay@/home/gitlab-runner/.local/share/containers/storage+/tmp/containers-user-997/containers:overlay.imagestore=/var/lib/containers-shared,overlay.mount_program=/usr/bin/fuse-overlayfs]@7cfbbec8963d8f13e6c70416d6592e1cc10f47a348131290a55d43c3acab3fb9) 
DEBU[0000] exporting opaque data as blob "sha256:7cfbbec8963d8f13e6c70416d6592e1cc10f47a348131290a55d43c3acab3fb9" 
DEBU[0000] Looking up image "busybox" in local containers storage 
DEBU[0000] Normalized platform linux/amd64 to {amd64 linux  [] } 
DEBU[0000] Trying "docker.io/library/busybox:latest" ... 
DEBU[0000] parsed reference into "[overlay@/home/gitlab-runner/.local/share/containers/storage+/tmp/containers-user-997/containers:overlay.imagestore=/var/lib/containers-shared,overlay.mount_program=/usr/bin/fuse-overlayfs]@7cfbbec8963d8f13e6c70416d6592e1cc10f47a348131290a55d43c3acab3fb9" 
DEBU[0000] Found image "busybox" as "docker.io/library/busybox:latest" in local containers storage 
DEBU[0000] Found image "busybox" as "docker.io/library/busybox:latest" in local containers storage ([overlay@/home/gitlab-runner/.local/share/containers/storage+/tmp/containers-user-997/containers:overlay.imagestore=/var/lib/containers-shared,overlay.mount_program=/usr/bin/fuse-overlayfs]@7cfbbec8963d8f13e6c70416d6592e1cc10f47a348131290a55d43c3acab3fb9) 
DEBU[0000] exporting opaque data as blob "sha256:7cfbbec8963d8f13e6c70416d6592e1cc10f47a348131290a55d43c3acab3fb9" 
DEBU[0000] Inspecting image 7cfbbec8963d8f13e6c70416d6592e1cc10f47a348131290a55d43c3acab3fb9 
DEBU[0000] exporting opaque data as blob "sha256:7cfbbec8963d8f13e6c70416d6592e1cc10f47a348131290a55d43c3acab3fb9" 
DEBU[0000] exporting opaque data as blob "sha256:7cfbbec8963d8f13e6c70416d6592e1cc10f47a348131290a55d43c3acab3fb9" 
DEBU[0000] Inspecting image 7cfbbec8963d8f13e6c70416d6592e1cc10f47a348131290a55d43c3acab3fb9 
DEBU[0000] Inspecting image 7cfbbec8963d8f13e6c70416d6592e1cc10f47a348131290a55d43c3acab3fb9 
DEBU[0000] Inspecting image 7cfbbec8963d8f13e6c70416d6592e1cc10f47a348131290a55d43c3acab3fb9 
DEBU[0000] Inspecting image 7cfbbec8963d8f13e6c70416d6592e1cc10f47a348131290a55d43c3acab3fb9 
DEBU[0000] using systemd mode: false                    
DEBU[0000] No hostname set; container's hostname will default to runtime default 
DEBU[0000] Loading seccomp profile from "/usr/share/containers/seccomp.json" 
DEBU[0000] Allocated lock 42 for container df0b8d1049292f8625a15f11960e0964cbf98ed821e8a3f62dd86a4e224ccca5 
DEBU[0000] parsed reference into "[overlay@/home/gitlab-runner/.local/share/containers/storage+/tmp/containers-user-997/containers:overlay.imagestore=/var/lib/containers-shared,overlay.mount_program=/usr/bin/fuse-overlayfs]@7cfbbec8963d8f13e6c70416d6592e1cc10f47a348131290a55d43c3acab3fb9" 
DEBU[0000] exporting opaque data as blob "sha256:7cfbbec8963d8f13e6c70416d6592e1cc10f47a348131290a55d43c3acab3fb9" 
DEBU[0000] Created container "df0b8d1049292f8625a15f11960e0964cbf98ed821e8a3f62dd86a4e224ccca5" 
DEBU[0000] Container "df0b8d1049292f8625a15f11960e0964cbf98ed821e8a3f62dd86a4e224ccca5" has work directory "/home/gitlab-runner/.local/share/containers/storage/overlay-containers/df0b8d1049292f8625a15f11960e0964cbf98ed821e8a3f62dd86a4e224ccca5/userdata" 
DEBU[0000] Container "df0b8d1049292f8625a15f11960e0964cbf98ed821e8a3f62dd86a4e224ccca5" has run directory "/tmp/containers-user-997/containers/overlay-containers/df0b8d1049292f8625a15f11960e0964cbf98ed821e8a3f62dd86a4e224ccca5/userdata" 
DEBU[0000] Handling terminal attach                     
INFO[0000] Received shutdown.Stop(), terminating!        PID=1402741
DEBU[0000] Enabling signal proxying                     
DEBU[0000] overlay: mount_data=lowerdir=/home/gitlab-runner/.local/share/containers/storage/overlay/l/HWX6ELZ2AB3KE3MFRLVLEAJATH,upperdir=/home/gitlab-runner/.local/share/containers/storage/overlay/9a687668b411b668aefeb25c274fa7a00b95e0b664ee4324c0e53ca4b9a33fc5/diff,workdir=/home/gitlab-runner/.local/share/containers/storage/overlay/9a687668b411b668aefeb25c274fa7a00b95e0b664ee4324c0e53ca4b9a33fc5/work,,volatile 
DEBU[0000] Mounted container "df0b8d1049292f8625a15f11960e0964cbf98ed821e8a3f62dd86a4e224ccca5" at "/home/gitlab-runner/.local/share/containers/storage/overlay/9a687668b411b668aefeb25c274fa7a00b95e0b664ee4324c0e53ca4b9a33fc5/merged" 
DEBU[0000] Created root filesystem for container df0b8d1049292f8625a15f11960e0964cbf98ed821e8a3f62dd86a4e224ccca5 at /home/gitlab-runner/.local/share/containers/storage/overlay/9a687668b411b668aefeb25c274fa7a00b95e0b664ee4324c0e53ca4b9a33fc5/merged 
DEBU[0000] Made network namespace at /run/user/997/netns/netns-8d656fed-589e-3897-5f02-f43b6f8b180d for container df0b8d1049292f8625a15f11960e0964cbf98ed821e8a3f62dd86a4e224ccca5 
DEBU[0000] slirp4netns command: /usr/bin/slirp4netns --disable-host-loopback --mtu=65520 --enable-sandbox --enable-seccomp --enable-ipv6 -c -e 3 -r 4 --netns-type=path /run/user/997/netns/netns-8d656fed-589e-3897-5f02-f43b6f8b180d tap0 
DEBU[0000] /etc/system-fips does not exist on host, not mounting FIPS mode subscription 
DEBU[0000] reading hooks from /usr/share/containers/oci/hooks.d 
DEBU[0000] Workdir "/" resolved to host path "/home/gitlab-runner/.local/share/containers/storage/overlay/9a687668b411b668aefeb25c274fa7a00b95e0b664ee4324c0e53ca4b9a33fc5/merged" 
DEBU[0000] Created OCI spec for container df0b8d1049292f8625a15f11960e0964cbf98ed821e8a3f62dd86a4e224ccca5 at /home/gitlab-runner/.local/share/containers/storage/overlay-containers/df0b8d1049292f8625a15f11960e0964cbf98ed821e8a3f62dd86a4e224ccca5/userdata/config.json 
DEBU[0000] /usr/bin/conmon messages will be logged to syslog 
DEBU[0000] running conmon: /usr/bin/conmon               args="[--api-version 1 -c df0b8d1049292f8625a15f11960e0964cbf98ed821e8a3f62dd86a4e224ccca5 -u df0b8d1049292f8625a15f11960e0964cbf98ed821e8a3f62dd86a4e224ccca5 -r /usr/bin/crun -b /home/gitlab-runner/.local/share/containers/storage/overlay-containers/df0b8d1049292f8625a15f11960e0964cbf98ed821e8a3f62dd86a4e224ccca5/userdata -p /tmp/containers-user-997/containers/overlay-containers/df0b8d1049292f8625a15f11960e0964cbf98ed821e8a3f62dd86a4e224ccca5/userdata/pidfile -n strange_clarke --exit-dir /tmp/run-997/libpod/tmp/exits --full-attach -l journald --log-level debug --syslog -t --conmon-pidfile /tmp/containers-user-997/containers/overlay-containers/df0b8d1049292f8625a15f11960e0964cbf98ed821e8a3f62dd86a4e224ccca5/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /home/gitlab-runner/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /tmp/containers-user-997/containers --exit-command-arg --log-level --exit-command-arg debug --exit-command-arg --cgroup-manager --exit-command-arg cgroupfs --exit-command-arg --tmpdir --exit-command-arg /tmp/run-997/libpod/tmp --exit-command-arg --network-config-dir --exit-command-arg  --exit-command-arg --network-backend --exit-command-arg netavark --exit-command-arg --volumepath --exit-command-arg /home/gitlab-runner/.local/share/containers/storage/volumes --exit-command-arg --db-backend --exit-command-arg boltdb --exit-command-arg --transient-store=false --exit-command-arg --runtime --exit-command-arg crun --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --storage-opt --exit-command-arg overlay.imagestore=/var/lib/containers-shared --exit-command-arg --storage-opt --exit-command-arg overlay.mount_program=/usr/bin/fuse-overlayfs --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg --syslog --exit-command-arg container --exit-command-arg cleanup 
--exit-command-arg --rm --exit-command-arg df0b8d1049292f8625a15f11960e0964cbf98ed821e8a3f62dd86a4e224ccca5]"
INFO[0000] Failed to add conmon to cgroupfs sandbox cgroup: creating cgroup for blkio: mkdir /sys/fs/cgroup/blkio/conmon: permission denied 
DEBU[0000] Received: 1402819                            
INFO[0000] Got Conmon PID as 1402816                    
DEBU[0000] Created container df0b8d1049292f8625a15f11960e0964cbf98ed821e8a3f62dd86a4e224ccca5 in OCI runtime 
DEBU[0000] Attaching to container df0b8d1049292f8625a15f11960e0964cbf98ed821e8a3f62dd86a4e224ccca5 
DEBU[0000] Starting container df0b8d1049292f8625a15f11960e0964cbf98ed821e8a3f62dd86a4e224ccca5 with command [echo] 
DEBU[0000] Received a resize event: {Width:401 Height:119} 
DEBU[0000] Started container df0b8d1049292f8625a15f11960e0964cbf98ed821e8a3f62dd86a4e224ccca5 
DEBU[0000] Notify sent successfully                     

DEBU[0000] Checking if container df0b8d1049292f8625a15f11960e0964cbf98ed821e8a3f62dd86a4e224ccca5 should restart 
DEBU[0000] Called run.PersistentPostRunE(podman run --log-level=debug --rm -ti busybox echo) 
DEBU[0000] Shutting down engines                        
DEBU[0000] [graphdriver] trying provided driver "overlay" 
DEBU[0000] overlay: imagestore=/var/lib/containers-shared 
DEBU[0000] overlay: mount_program=/usr/bin/fuse-overlayfs 
DEBU[0000] backingFs=extfs, projectQuotaSupported=false, useNativeDiff=false, usingMetacopy=false 

My complete /etc/containers/storage.conf:

[storage]
driver = "overlay"
runroot = "/run/containers/storage"
graphroot = "/var/lib/containers/storage"

[storage.options]
additionalimagestores = [ "/var/lib/containers-shared" ]
pull_options = {enable_partial_images = "true", use_hard_links = "true", ostree_repos=""}

[storage.options.overlay]
mountopt = "nodev,metacopy=on"
mount_program = "/usr/bin/fuse-overlayfs"

[storage.options.thinpool]

Edit: Oh, I also forgot there's a per-user storage.conf for gitlab-runner:

[storage]
driver="overlay"
[storage.options]
additionalimagestores=["/var/lib/containers-shared"]
[storage.options.overlay]
mount_program="/usr/bin/fuse-overlayfs"

rhatdan commented 1 year ago

@giuseppe PTAL

giuseppe commented 1 year ago

What is the reason for using fuse-overlayfs for root containers?

phemmer commented 1 year ago

Mostly because it was an attempt to get the functionality described in the linked article without configuring it for every user one by one. That was before I discovered that /etc/containers/storage.conf is only used by root.

Are you saying this is the cause of the problem? If so, can you please explain, as the problem does not occur on containers launched by root, and in the example shown above, root is not used.

giuseppe commented 1 year ago

No, that is not the cause of the problem; fuse-overlayfs should work as root as well. The error is probably in the cleanup process, which doesn't trigger the unmount for the FUSE mount, so the fuse-overlayfs process keeps running.

Another thing worth noting: sudo -u gitlab-runner does not create the correct environment for running rootless containers. Could you try creating a session with ssh or machinectl?

Since you can always reproduce this, would it be possible for you to run podman --log-level debug container cleanup $CTR, where $CTR is a container left running? Or are there no containers showing up as running, and only the fuse-overlayfs mount remains? Do you have fusermount3 installed?
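
One way to check whether only the FUSE mounts remain is to list them from mountinfo. The following is a minimal sketch: `list_fuse_overlay_mounts` is a hypothetical helper, and it assumes the mounts appear with filesystem type `fuse.fuse-overlayfs`. For rootless containers the mounts may only be visible from inside podman's user namespace, so the input may need to come from `podman unshare cat /proc/self/mountinfo` rather than from the host's own /proc/self/mountinfo.

```shell
#!/bin/sh
# Hypothetical helper for illustration; reads mountinfo-format text
# from the files given as arguments, or from stdin when none are given.

# Print the mount point of every fuse-overlayfs entry.
list_fuse_overlay_mounts() {
    awk '{
        # Fields 1-6 are fixed; optional fields follow until a lone "-".
        for (i = 7; i < NF; i++) {
            if ($i == "-") {
                # The field after the separator is the filesystem type.
                if ($(i + 1) == "fuse.fuse-overlayfs") print $5
                break
            }
        }
    }' "$@"
}
```

For example, `podman unshare cat /proc/self/mountinfo | list_fuse_overlay_mounts | wc -l` gives a count that can be compared with the number of fuse-overlayfs processes reported by pgrep.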

phemmer commented 1 year ago

could you try creating a session with ssh or machinectl?

Tried with machinectl shell gitlab-runner@.host. The issue still persists.

Or there are no containers showing up as running but there is only the fuse-overlayfs mount?

There are no containers running.

Do you have fusermount3 installed?

Yes
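
Whether a given session (sudo -u versus machinectl shell) provides the environment rootless podman expects can be roughly checked by looking at XDG_RUNTIME_DIR. This is a sketch under the assumption that a missing or unusable XDG_RUNTIME_DIR is the main symptom of an incomplete session; `check_rootless_session` is a hypothetical name, and the check is not exhaustive.

```shell
#!/bin/sh
# Hypothetical helper for illustration; not part of podman.

# Report whether XDG_RUNTIME_DIR is set and points at a writable directory.
check_rootless_session() {
    if [ -z "${XDG_RUNTIME_DIR:-}" ]; then
        echo "XDG_RUNTIME_DIR is not set"
        return 1
    fi
    if [ ! -d "$XDG_RUNTIME_DIR" ] || [ ! -w "$XDG_RUNTIME_DIR" ]; then
        echo "XDG_RUNTIME_DIR is not a writable directory: $XDG_RUNTIME_DIR"
        return 1
    fi
    echo "ok: $XDG_RUNTIME_DIR"
}
```

Running the function once under sudo -u gitlab-runner and once inside a machinectl shell session would show whether the two environments differ in this respect.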

giuseppe commented 1 year ago

could you try creating a session with ssh or machinectl?

Tried with machinectl shell gitlab-runner@.host. The issue still persists.

Does it happen with containers that you create from this session, or only with older ones?