canonical / multipass

Multipass orchestrates virtual Ubuntu instances
https://multipass.run
GNU General Public License v3.0

Unable to stop/shutdown primary instance #3370

Closed by ricab 9 months ago

ricab commented 9 months ago

Just wondering why this issue is closed without a documented fix or workaround on macOS? I am having this issue now. I have only three instances, only the primary is running, and I can't stop it.

==> /Library/Logs/Multipass/multipassd.log <==
[2024-01-12T06:03:30.648] [debug] [primary] QMP: {"return": {}}

[2024-01-12T06:03:30.650] [debug] [primary] QMP: {"timestamp": {"seconds": 1705057410, "microseconds": 647553}, "event": "POWERDOWN"}

[2024-01-12T06:03:30.650] [info] [primary] VM powering down
[2024-01-12T06:09:50.650] [debug] [base_vm] Error getting extra IP addresses: ssh connection failed: 'Timeout connecting to 192.168.64.6'
[2024-01-12T06:10:00.903] [info] [daemon] Cannot open ssh session on "primary" shutdown: ssh connection failed: 'Failed to connect: Host is down'
[2024-01-12T06:10:00.904] [debug] [primary] QMP: {"return": {}}
{"timestamp": {"seconds": 1705057800, "microseconds": 903987}, "event": "POWERDOWN"}

...then I tried to use the UI icon menu to exit multipass, but that only seems to exit the UI - I still have:

$ multipass list
Name                    State             IPv4             Image
primary                 Running           192.168.64.6     Ubuntu 20.04 LTS
ubuntu-2004             Stopped           --               Ubuntu 20.04 LTS
ubuntu-lts              Stopped           --               Ubuntu 20.04 LTS

Then I tried to shut down the instance from within:

$ multipass shell primary
ubuntu@primary:~$ sudo -sH
root@primary:/home/ubuntu# sync;halt
(back out to the host shell)
$ multipass list
Name                    State             IPv4             Image
primary                 Running           192.168.64.6     Ubuntu 20.04 LTS
ubuntu-2004             Stopped           --               Ubuntu 20.04 LTS
ubuntu-lts              Stopped           --               Ubuntu 20.04 LTS

Still running! ...and then tried to get back in:

$ multipass shell primary
shell failed: ssh connection failed: 'Connection refused'

So halt just killed the instance's sshd without fully shutting down the instance.

So then I tried to stop/start the multipass daemon in a separate terminal:

sudo launchctl stop com.canonical.multipassd

...this caused the hanging multipass stop primary to output:

$ multipass stop primary
stop failed: cannot connect to the multipass socket

...then I restarted the service daemon:

sudo launchctl start com.canonical.multipassd

Then I checked the state:

$ multipass list
Name                    State             IPv4             Image
primary                 Running           192.168.64.6     Ubuntu 20.04 LTS
ubuntu-2004             Stopped           --               Ubuntu 20.04 LTS
ubuntu-lts              Stopped           --               Ubuntu 20.04 LTS

Lastly I tried rebooting, but that instance IS STILL RUNNING. So now what?

Originally posted by @wolfch-elsevier in https://github.com/canonical/multipass/issues/2319#issuecomment-1888897818

ricab commented 9 months ago

Hi @wolfch-elsevier, let's follow up here.

When asked to stop an instance, Multipass first tries to ssh into it to execute a wall command, to warn any users that are logged in. It then asks the backend to shut it down.

In your case, Multipass was unable to ssh into the instance at that point. I am not sure why that would be, especially if you could later shell into it yourself. The POWERDOWN event shows that Multipass asked the instance to shut down, but if it was stuck it may have been unable to shut down cleanly. Note that shutting down can take a while if there are unresponsive internal services, because the system can wait for a few minutes before killing them.

Once you issued the halt command, it is expected that you couldn't connect to it. However, at that point Multipass would still see it as running until it finished shutting down completely.
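Since a clean shutdown can take a few minutes, one way to tell a slow shutdown from a stuck one is to poll the reported state for a while before resorting to anything drastic. The following is only a rough sketch, not part of Multipass: the helper names state_from_csv and wait_stopped are hypothetical, the 5-second interval and 300-second default are arbitrary, and it assumes that multipass list --format csv prints the instance name in the first column and the state in the second.

```shell
# Print the State column for one instance, reading the output of
# `multipass list --format csv` on stdin.
# Assumption: Name is column 1 and State is column 2.
state_from_csv() {
    awk -F, -v name="$1" '$1 == name { print $2 }'
}

# Poll until the instance reports "Stopped" or the timeout (in seconds,
# default 300) expires. Returns 0 if it stopped, 1 otherwise.
wait_stopped() {
    name=$1
    timeout=${2:-300}
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        state=$(multipass list --format csv | state_from_csv "$name")
        [ "$state" = "Stopped" ] && return 0
        sleep 5
        waited=$((waited + 5))
    done
    return 1
}
```

After sourcing these functions, something like wait_stopped primary 300 || echo "primary still not stopped" would report whether the instance finished shutting down within five minutes.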

When you restart the Multipass daemon, it tries to suspend running instances, to restore their running state when it comes back up. I'm not sure what could have happened with respect to suspension in your situation, but it is expected that running instances resume running state after restarts (including after machine reboots).

If you are still unable to shut down the instance by regular means (i.e. multipass stop primary or sudo shutdown -h now inside the instance), you can kill the corresponding QEMU process. Something like sudo pkill -9 -f qemu-system.*primary should do it.
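Before force-killing anything, it may be worth checking what the pattern actually matches. The sketch below wraps the suggestion above in three small functions; the names qemu_pattern, show_qemu and kill_qemu are hypothetical, and it assumes (as the suggested pattern implies) that the instance name appears in the QEMU command line.

```shell
# Build the pgrep/pkill pattern for a given instance name.
qemu_pattern() {
    printf 'qemu-system.*%s' "$1"
}

# List the processes whose command line matches, without killing anything.
show_qemu() {
    pgrep -fl "$(qemu_pattern "$1")"
}

# Force-kill the matching processes with SIGKILL. This is a last resort:
# the guest gets no chance to shut down cleanly.
kill_qemu() {
    sudo pkill -9 -f "$(qemu_pattern "$1")"
}
```

Running show_qemu primary first, and kill_qemu primary only if the listed process is the expected one, avoids killing an unrelated QEMU process whose command line happens to contain the same name.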

ricab commented 9 months ago

FWIW, we've been flirting with the idea of a force stop for a long time, but we always have other things to work on and we haven't been able to see it through yet.

ricab commented 9 months ago

@wolfch-elsevier, I see you posted on #2784 too and eventually overcame the issue, so closing this one.