lima-vm / lima

Linux virtual machines, with a focus on running containers
https://lima-vm.io/
Apache License 2.0

Mount points do not resume after stop/start/reboot/etc of podman instance #663

Open ddl-mmercer opened 2 years ago

ddl-mmercer commented 2 years ago

Description

When using limactl with a podman instance and FUSE sshfs for volume mounts, the instance comes up fine the first time, but after a stop/start it fails to mount the sshfs filesystems:

limactl stop podman
INFO[0000] Sending SIGINT to hostagent process 15141
INFO[0000] Waiting for the host agent and the qemu processes to shut down
INFO[0000] [hostagent] fusermount: mount failed: Operation not permitted
INFO[0000] [hostagent] fusermount: mount failed: Operation not permitted
INFO[0000] [hostagent] Received SIGINT, shutting down the host agent
INFO[0000] [hostagent] Shutting down the host agent
INFO[0000] [hostagent] Stopping forwarding "/run/user/502/podman/podman.sock" (guest) to "/Users/marcmercer/.lima/podman/sock/podman.sock" (host)
INFO[0000] [hostagent] Stopping forwarding "/run/lima-guestagent.sock" (guest) to "/Users/marcmercer/.lima/podman/ga.sock" (host)
INFO[0000] [hostagent] Unmounting "/Users/marcmercer"
INFO[0000] [hostagent] Unmounting "/tmp/lima"
INFO[0000] [hostagent] Shutting down QEMU with ACPI
INFO[0000] [hostagent] Sending QMP system_powerdown command
INFO[0001] [hostagent] QEMU has exited
limactl start podman
INFO[0000] Using the existing instance "podman"
INFO[0000] [hostagent] Starting QEMU (hint: to watch the boot progress, see "/Users/marcmercer/.lima/podman/serial.log")
INFO[0000] SSH Local Port: 53492
INFO[0000] [hostagent] Waiting for the essential requirement 1 of 5: "ssh"
INFO[0019] [hostagent] The essential requirement 1 of 5 is satisfied
INFO[0019] [hostagent] Waiting for the essential requirement 2 of 5: "user session is ready for ssh"
INFO[0030] [hostagent] Waiting for the essential requirement 2 of 5: "user session is ready for ssh"
INFO[0031] [hostagent] The essential requirement 2 of 5 is satisfied
INFO[0031] [hostagent] Waiting for the essential requirement 3 of 5: "sshfs binary to be installed"
INFO[0031] [hostagent] The essential requirement 3 of 5 is satisfied
INFO[0031] [hostagent] Waiting for the essential requirement 4 of 5: "/etc/fuse.conf to contain \"user_allow_other\""
INFO[0031] [hostagent] The essential requirement 4 of 5 is satisfied
INFO[0031] [hostagent] Waiting for the essential requirement 5 of 5: "the guest agent to be running"
INFO[0031] [hostagent] The essential requirement 5 of 5 is satisfied
INFO[0031] [hostagent] Mounting "/Users/marcmercer"
INFO[0031] [hostagent] fusermount: mount failed: Operation not permitted
WARN[0063] [hostagent] failed to confirm whether /Users/marcmercer [remote] is successfully mounted
INFO[0063] [hostagent] Mounting "/tmp/lima"
INFO[0063] [hostagent] fusermount: mount failed: Operation not permitted
WARN[0094] [hostagent] failed to confirm whether /tmp/lima [remote] is successfully mounted
INFO[0094] [hostagent] Waiting for the optional requirement 1 of 1: "user probe 1/1"
INFO[0094] [hostagent] Forwarding "/run/user/502/podman/podman.sock" (guest) to "/Users/marcmercer/.lima/podman/sock/podman.sock" (host)
INFO[0094] [hostagent] The optional requirement 1 of 1 is satisfied
INFO[0094] [hostagent] Forwarding "/run/lima-guestagent.sock" (guest) to "/Users/marcmercer/.lima/podman/ga.sock" (host)
INFO[0094] [hostagent] Waiting for the final requirement 1 of 1: "boot scripts must have finished"
INFO[0095] [hostagent] Not forwarding TCP 127.0.0.53:53
INFO[0095] [hostagent] Not forwarding TCP 0.0.0.0:22
INFO[0095] [hostagent] Not forwarding TCP [::]:22
INFO[0095] [hostagent] The final requirement 1 of 1 is satisfied
INFO[0095] READY. Run `limactl shell podman` to open the shell.
INFO[0095] Message from the instance "podman":
To run `podman` on the host (assumes podman-remote is installed):
$ export CONTAINER_HOST=unix:///Users/marcmercer/.lima/podman/sock/podman.sock

So far, the only workaround is to delete and recreate the podman instance, forfeiting everything built up in it (images, containers, etc.).

This is SIMILAR to #584 but not necessarily identical.

afbjorklund commented 2 years ago

Looks like fusermount3 was upgraded by some buggy process that failed to make it setuid.

That is why you are getting "Operation not permitted": without the setuid bit, it is not running as user "root".

Healthy:

lrwxrwxrwx 1 root root    11 Jun 20  2021 /usr/bin/fusermount -> fusermount3
-rwsr-xr-x 1 root root 35200 Jun 20  2021 /usr/bin/fusermount3

Broken:

lrwxrwxrwx 1 root root    11 Jun 20  2021 /usr/bin/fusermount -> fusermount3
-rwxr-xr-x 1 root root 35200 Jun 20  2021 /usr/bin/fusermount3

There is also a "/etc/fuse.conf.dpkg-new" file, so maybe it was dpkg?

The package "fuse3" is not part of the image, but installed on first boot.

Workaround (fixing the symptom):

sudo chmod u+s /bin/fusermount3
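
To tell the healthy and broken states apart without reading the `ls -l` output by eye, something like the following could be run inside the guest. This is only a minimal sketch: `has_setuid` and `check_helper` are hypothetical helper names, not part of Lima, and the only thing it relies on is the POSIX `test -u` check for the setuid bit.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Lima): check a FUSE mount helper's setuid bit.

# Succeeds (exit status 0) if the file at $1 has the setuid bit set.
has_setuid() {
    [ -u "$1" ]
}

# Prints one line describing whether the helper at $1 looks healthy or broken.
check_helper() {
    if has_setuid "$1"; then
        echo "ok: $1 is setuid"
    else
        echo "broken: $1 is not setuid"
    fi
}
```

Calling `check_helper /bin/fusermount3` in the guest should report "broken" in the state shown above, and "ok" once the chmod workaround has been applied.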

afbjorklund commented 2 years ago

This is SIMILAR to #584 but not necessarily identical.

The other issue is related to docker starting before sshfs.

afbjorklund commented 2 years ago

From /var/log/cloud-init-output.log:

Setting up fuse3 (3.10.3-2) ...

Configuration file '/etc/fuse.conf'
 ==> Modified (by you or by a script) since installation.
 ==> Package distributor has shipped an updated version.
   What would you like to do about it ?  Your options are:
    Y or I  : install the package maintainer's version
    N or O  : keep your currently-installed version
      D     : show the differences between the versions
      Z     : start a shell to examine the situation
 The default action is to keep your current version.
*** fuse.conf (Y/I/N/O/D/Z) [default=N] ? dpkg: error processing package fuse3 (--configure):
 end of file on stdin at conffile prompt
Setting up libavahi-glib1:amd64 (0.8-5ubuntu4) ...
dpkg: dependency problems prevent configuration of fuse-overlayfs:
 fuse-overlayfs depends on fuse3; however:
  Package fuse3 is not configured yet.

dpkg: error processing package fuse-overlayfs (--configure):
 dependency problems - leaving unconfigured
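
To check whether another instance hit the same failure, the log can be scanned for packages whose configure step was aborted. A rough sketch, assuming the log lines look like the ones above; `failed_configure_packages` is a hypothetical name, and the sed pattern is only matched against the message format seen in this log:

```shell
#!/bin/sh
# Hypothetical helper: list packages whose dpkg configure step failed,
# one per line, in the order they appear in the log file given as $1.
failed_configure_packages() {
    sed -n 's/^.*dpkg: error processing package \([^ ]*\) (--configure):.*$/\1/p' "$1"
}
```

Inside the guest this could be called as `failed_configure_packages /var/log/cloud-init-output.log`. If it lists fuse3, then running `sudo dpkg --force-confold --configure -a` should let dpkg finish the pending configuration while keeping the existing /etc/fuse.conf without prompting; that command was not tried in this thread, so treat it as an untested suggestion.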

The package's configure step was supposed to set the setuid permission on fusermount3 too:

    configure)
        if [ -c /dev/cuse ] && ! chrooted
        then
            chmod 0600 /dev/cuse > /dev/null 2>&1
        fi
        if ! dpkg-statoverride --list /bin/fusermount3 > /dev/null 2>&1
        then
            chmod 4755 /bin/fusermount3
        fi

        modprobe fuse > /dev/null 2>&1 || true

        if [ -x "`which update-initramfs 2>/dev/null`" ]
        then
            update-initramfs -u
        fi
        ;;

There is also some conflict between "sshfs" and "fuse-overlayfs":

Unpacking libfuse3-3:amd64 (3.10.3-2) ...
dpkg: fuse: dependency problems, but removing anyway as you requested:
 sshfs depends on fuse.
 ntfs-3g depends on fuse.

ddl-mmercer commented 2 years ago

The workaround above does at least appear to work as a solution. Thanks @afbjorklund; I am creating a separate issue for the next problem I observed.

afbjorklund commented 2 years ago

I think the "default" image has the same issue now, after some recent upgrade.