canonical / multipass

Multipass orchestrates virtual Ubuntu instances
https://multipass.run

`multipass launch snapcraft:devel` fails then causes other commands to hang #3401

Closed by mr-cal 6 months ago

mr-cal commented 6 months ago

Describe the bug

`multipass launch snapcraft:devel` downloads the image but fails to launch after 6 minutes, leaving the VM process running in the background. Commands like `multipass list` then hang for 2 minutes before failing.

It also causes `snap remove multipass --purge` to hang while it waits for those background VMs to stop.
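While the hang is being investigated, a client command can be kept from blocking the terminal by wrapping it in a deadline. The following is only a sketch: `run_with_deadline` is a hypothetical helper name, and it assumes `timeout` from GNU coreutils is available (it exits with status 124 when the deadline is hit).

```shell
# Hypothetical helper: run a command but give up after a deadline.
# Assumes GNU coreutils `timeout`, which exits with status 124 on expiry.
run_with_deadline() {
    rwd_seconds="$1"
    shift
    rwd_status=0
    timeout "${rwd_seconds}" "$@" || rwd_status=$?
    if [ "${rwd_status}" -eq 124 ]; then
        # Report the timeout on stderr so stdout stays clean for callers.
        echo "gave up after ${rwd_seconds}s: $*" >&2
    fi
    return "${rwd_status}"
}

# Example: do not wait the full two minutes for `multipass list`.
# run_with_deadline 15 multipass list
```

This only stops the client from blocking. It does nothing about the daemon or the VM process left behind by the failed launch.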

To Reproduce

@cmatsuoka and I have both been able to reproduce this.

> sudo snap install multipass --channel=latest/beta 
multipass (beta) 1.13.1-rc.5+g55d5eed5 from Canonical✓ installed

> multipass launch snapcraft:devel
launch failed: The following errors occurred:
entranced-pika: timed out waiting for response

> multipass list
list failed: Could not determine IP address within 120000ms

Expected behavior

Multipass should launch the image (or at least fail more cleanly).

Logs

The output of

> sudo snap install multipass --channel=latest/beta 
> multipass launch snapcraft:devel -vvvv
> multipass list -vvvv
> ps aux | rg multipass

can be found here: https://paste.ubuntu.com/p/TzRfwThKC3/

The output of

journalctl --unit 'snap.multipass*' --since "2024-02-08 11:22:00"

can be found here: https://paste.ubuntu.com/p/D74v8sHQbs/
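For anyone reproducing this, the outputs above can be gathered into a single file in one pass. This is a rough sketch, not part of Multipass: `capture_cmd` and the report file name are hypothetical, and the commands in the example are the ones listed in this report.

```shell
# Hypothetical helper: print the command line, its combined
# stdout/stderr, and its exit status, without aborting on failure.
capture_cmd() {
    printf '$ %s\n' "$*"
    capture_status=0
    "$@" 2>&1 || capture_status=$?
    printf '[exit status: %s]\n\n' "${capture_status}"
    return 0
}

# Example (the report file name is an assumption):
# {
#     capture_cmd multipass launch snapcraft:devel -vvvv
#     capture_cmd multipass list -vvvv
#     capture_cmd sh -c 'ps aux | grep multipass'
# } > multipass-report.txt
```

Recording the exit status next to each output makes it easier to see which command failed and which merely hung until its timeout.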

Additional info

> multipass version  # I also repro'd this on 1.13.0
multipass   1.13.1-rc.5+g55d5eed5
multipassd  1.13.1-rc.5+g55d5eed5
> multipass info --all
Warning: the `--all` flag for the `info` command is deprecated. Please use `info` with no positional arguments for the same effect.
Name:           entranced-pika
State:          Unknown
Snapshots:      0
IPv4:           --
Release:        --
Image hash:     bdf3cc7ee924 (Ubuntu 24.04 LTS)
CPU(s):         --
Load:           --
Disk usage:     --
Memory usage:   --
Mounts:         --
> multipass get local.driver
qemu

townsend2010 commented 6 months ago

Hi @mr-cal!

It looks like something in the noble buildd image is broken, which prevents the instance from booting. Because of that, you are hitting a few bugs in Multipass: the lack of asynchronicity (which will mostly be addressed this cycle), and snapd waiting on qemu processes to die when trying to remove the snap.
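To check whether qemu processes are still alive after a failed launch, something along these lines may help. It is a sketch under assumptions: `list_qemu_pids` is a hypothetical helper, and the `qemu-system` pattern is a guess at the executable name, so adjust it to whatever `ps` shows on the affected machine.

```shell
# Hypothetical helper: read "PID COMMAND ARGS..." lines on stdin and print
# the PIDs whose executable (second field) mentions qemu-system.
# Matching only the executable keeps this filter from matching itself.
list_qemu_pids() {
    awk '$2 ~ /qemu-system/ { print $1 }'
}

# Example: show VM processes that are still running.
# ps -eo pid=,args= | list_qemu_pids
```

If this prints any PIDs after the launch has failed, those are likely the processes that `snap remove` ends up waiting on.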

I will have to dig a little into the exact symptom of the instance not booting and report this back to CPC.

mr-cal commented 6 months ago

That makes sense, thanks for the quick response!

We're currently using the daily noble image so this isn't a blocker, but Snapcraft using Multipass + buildd noble images will become critical as we get closer to the 24.04 beta.

townsend2010 commented 6 months ago

Sure, I'm making a test Multipass build where I can watch the qemu console and try to figure out where the boot process is failing and then report that to the CPC folks.

townsend2010 commented 6 months ago

For posterity's sake, I've made CPC aware of this issue and provided the following screenshot of what is happening when qemu tries booting the image.

[Screenshot from 2024-02-08 15-50-10]

QEMU stays stuck on that screen.

townsend2010 commented 6 months ago

The cause of this issue appears to be https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/2048953.

Scratch that, CPC is now saying something else is likely the cause.

townsend2010 commented 6 months ago

Hi @mr-cal!

CPC has said they have fixed this and I have confirmed this by successfully launching a snapcraft:devel image.