canonical / multipass

Multipass orchestrates virtual Ubuntu instances
https://multipass.run
GNU General Public License v3.0

[regression] Unhandled exception when converting qcow2 images for snapshot support #3399

Closed kyanny closed 6 months ago

kyanny commented 6 months ago

Describe the bug

The multipass command stopped working. It was working before.

To Reproduce

❯ multipass list
list failed: cannot connect to the multipass socket

❯ multipass info
info failed: cannot connect to the multipass socket

Expected behavior

multipass list displays the list of VM names, and other commands work as well.

Logs

multipassd.log

Additional info

Additional context

I installed Multipass via Homebrew. I have uninstalled/reinstalled Multipass but the problem still persists.

townsend2010 commented 6 months ago

Hi @kyanny!

Sorry you are having this issue. Looking at the log you provided, it seems there is an issue with your primary instance image and we aren't handling it correctly. The relevant handling was added in the 1.13 release, so I'm marking this as a regression, and it will be fixed in 1.13.1.

townsend2010 commented 6 months ago

Hi @kyanny,

Could you please confirm that the primary instance existed before you upgraded to 1.13.0, and that it stopped working after the upgrade? Thank you!

ricab commented 6 months ago

Hi @kyanny, could you please paste the output of the following command?

qemu-img info "/var/root/Library/Application Support/multipassd/qemu/vault/instances/primary/ubuntu-22.04-server-cloudimg-amd64.img"

If that also fails, do you remember how large the instance was (more or less)?

kyanny commented 6 months ago

Hi @townsend2010 thank you for your response.

> Could you please confirm that the primary instance existed before you upgraded to 1.13.0, and that it stopped working after the upgrade? Thank you!

Yes, if I remember correctly, I upgraded Multipass after creating the primary instance. I have no record of which version I was using before the upgrade.

kyanny commented 6 months ago

Hi @ricab thank you for your guidance. Here is the result of the command.

❯ qemu-img info "/var/root/Library/Application Support/multipassd/qemu/vault/instances/primary/ubuntu-22.04-server-cloudimg-amd64.img"
zsh: command not found: qemu-img
❯ which qemu-img
qemu-img not found

I am not sure whether the missing qemu-img command indicates a problem with my computer that is unrelated to Multipass.

> If that also fails, do you remember how large the instance was (more or less)?

I'm afraid I don't remember the instance size, but I believe it was fairly ordinary: I did not create or download huge files in the instance, and I did not install a lot of software in it.

townsend2010 commented 6 months ago

Hi @kyanny!

Sorry for the wrong instructions for qemu-img. It should be:

/Library/Application\ Support/com.canonical.multipass/bin/qemu-img info "/var/root/Library/Application Support/multipassd/qemu/vault/instances/primary/ubuntu-22.04-server-cloudimg-amd64.img"

kyanny commented 6 months ago

Hi @townsend2010, thank you for the instructions. Here is the result of the command.

❯ /Library/Application\ Support/com.canonical.multipass/bin/qemu-img info "/var/root/Library/Application Support/multipassd/qemu/vault/instances/primary/ubuntu-22.04-server-cloudimg-amd64.img"
qemu-img: Could not open '/var/root/Library/Application Support/multipassd/qemu/vault/instances/primary/ubuntu-22.04-server-cloudimg-amd64.img': Could not open '/var/root/Library/Application Support/multipassd/qemu/vault/instances/primary/ubuntu-22.04-server-cloudimg-amd64.img': Permission denied

So I ran it with sudo:

❯ sudo /Library/Application\ Support/com.canonical.multipass/bin/qemu-img info "/var/root/Library/Application Support/multipassd/qemu/vault/instances/primary/ubuntu-22.04-server-cloudimg-amd64.img"
Password:
image: /var/root/Library/Application Support/multipassd/qemu/vault/instances/primary/ubuntu-22.04-server-cloudimg-amd64.img
file format: qcow2
virtual size: 5 GiB (5368709120 bytes)
disk size: 4.72 GiB
cluster_size: 65536
cleanly shut down: no
Snapshot list:
ID        TAG               VM SIZE                DATE     VM CLOCK     ICOUNT
1         suspend           969 MiB 2023-11-18 13:29:22695:50:02.829
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: true
    refcount bits: 16
    corrupt: false
    extended l2: false
Child node '/file':
    filename: /var/root/Library/Application Support/multipassd/qemu/vault/instances/primary/ubuntu-22.04-server-cloudimg-amd64.img
    protocol type: file
    file length: 10 PiB (11259003204337664 bytes)
    disk size: 4.72 GiB

The ls command also reports the file size as 10 petabytes, which is impossible.

❯ sudo ls -l /var/root/Library/Application\ Support/multipassd/qemu/vault/instances/primary/ubuntu-22.04-server-cloudimg-amd64.img
-rw-r--r--  1 root  wheel  11259003204337664 11 18 14:07 /var/root/Library/Application Support/multipassd/qemu/vault/instances/primary/ubuntu-22.04-server-cloudimg-amd64.img

❯ sudo ls -lah /var/root/Library/Application\ Support/multipassd/qemu/vault/instances/primary/ubuntu-22.04-server-cloudimg-amd64.img
-rw-r--r--  1 root  wheel    10P 11 18 14:07 /var/root/Library/Application Support/multipassd/qemu/vault/instances/primary/ubuntu-22.04-server-cloudimg-amd64.img
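The gap between the reported file length and the space actually used can be checked without qemu-img. The following is a minimal sketch, not part of Multipass: `size_report` is a hypothetical helper name, and it assumes only the standard `ls`, `du`, and `awk` tools. It prints the length recorded in the file's metadata next to the space allocated on disk.

```shell
# size_report FILE
# Hypothetical helper (not part of Multipass or qemu-img).
# Prints the apparent size in bytes and the allocated size in KiB.
size_report() {
    # Field 5 of `ls -ln` is the file length recorded in the metadata.
    apparent=$(ls -ln -- "$1" | awk '{print $5}')
    # `du -k` reports the space actually allocated, in 1024-byte units.
    allocated_kib=$(du -k -- "$1" | awk '{print $1}')
    echo "apparent_bytes=$apparent allocated_kib=$allocated_kib"
}
```

For a file under /var/root the two commands need root, so the `ls` and `du` calls would have to be run with sudo. A large gap between the two numbers matches what qemu-img shows above (file length 10 PiB, disk size 4.72 GiB). It shows that the recorded length and the allocation disagree, but it does not by itself explain why.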

Note that my computer (a MacBook Pro) has 1 terabyte of storage, not petabytes.

❯ df -h
Filesystem        Size    Used   Avail Capacity iused ifree %iused  Mounted on
/dev/disk1s1s1   932Gi   9.4Gi   454Gi     3%    393k  4.3G    0%   /
devfs            341Ki   341Ki     0Bi   100%    1.2k     0  100%   /dev
/dev/disk1s3     932Gi   2.0Gi   454Gi     1%    4.2k  4.8G    0%   /System/Volumes/Preboot
/dev/disk1s5     932Gi   1.0Gi   454Gi     1%       1  4.8G    0%   /System/Volumes/VM
/dev/disk1s6     932Gi    12Mi   454Gi     1%      19  4.8G    0%   /System/Volumes/Update
/dev/disk1s2     932Gi   464Gi   454Gi    51%    4.7M  4.8G    0%   /System/Volumes/Data
map auto_home      0Bi     0Bi     0Bi   100%       0     0     -   /System/Volumes/Data/home

❯ diskutil list
/dev/disk0 (internal, physical):
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      GUID_partition_scheme                        *1.0 TB     disk0
   1:                        EFI EFI                     314.6 MB   disk0s1
   2:                 Apple_APFS Container disk1         1.0 TB     disk0s2

/dev/disk1 (synthesized):
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      APFS Container Scheme -                      +1.0 TB     disk1
                                 Physical Store disk0s2
   1:                APFS Volume Macintosh HD            10.1 GB    disk1s1
   2:              APFS Snapshot com.apple.os.update-... 10.1 GB    disk1s1s1
   3:                APFS Volume Macintosh HD - Data     498.5 GB   disk1s2
   4:                APFS Volume Preboot                 2.1 GB     disk1s3
   5:                APFS Volume Recovery                1.2 GB     disk1s4
   6:                APFS Volume VM                      1.1 GB     disk1s5

ricab commented 6 months ago

Hi @kyanny, thank you for the info. I have no idea how your filesystem got tricked into seeing that file as PB-sized, but I have implemented handling for the command that was failing such that Multipass logs an error but is at least able to run. Could you please give it a try and let us know how it goes?

I am afraid that instance is most likely lost though. If this ever happens again, or if you remember anything that could explain that file corruption, please let us know.
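The log-and-continue behaviour described above can be illustrated with a small shell sketch. This is not the actual Multipass fix (the daemon is not a shell script). `process_image` and `process_all` are hypothetical names, and the check inside `process_image` is only a stand-in for whatever per-image step might fail.

```shell
# Hypothetical stand-in for a per-image step that may fail.
# Here it only succeeds for readable, non-empty files.
process_image() {
    [ -r "$1" ] && [ -s "$1" ]
}

# Try every image. A failure is logged to stderr and counted,
# but it does not stop the remaining images from being handled.
process_all() {
    failed=0
    for img in "$@"; do
        if process_image "$img"; then
            echo "ok: $img"
        else
            echo "error: could not process $img, skipping" >&2
            failed=$((failed + 1))
        fi
    done
    echo "failed=$failed"
}
```

The point of the pattern is that one broken image produces an error message instead of taking everything else down with it, so the remaining instances stay usable.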

kyanny commented 6 months ago

Hi @ricab

It worked. 👍

❯ multipass version
multipass   1.14.0-dev.1694.pr3400+g6b9795cae.mac
multipassd  1.14.0-dev.1694.pr3400+g6b9795cae.mac

~/workspaces/zd-2574241
❯ multipass list
Name                    State             IPv4             Image
primary                 Suspended         --               Ubuntu 22.04 LTS
better-conger           Stopped           --               Ubuntu 22.04 LTS

Other commands, such as launch and shell, are working too. I was able to create a new instance.

kyanny commented 6 months ago

I was also able to delete the primary instance.

~/workspaces/zd-2574241
❯ multipass delete primary

~/workspaces/zd-2574241
❯ multipass ls
Name                    State             IPv4             Image
primary                 Deleted           --               Ubuntu 22.04 LTS
better-conger           Stopped           --               Ubuntu 22.04 LTS
reliable-bandicoot      Running           192.168.108.3    Ubuntu 22.04 LTS

~/workspaces/zd-2574241
❯ multipass purge

~/workspaces/zd-2574241
❯ multipass ls
Name                    State             IPv4             Image
better-conger           Stopped           --               Ubuntu 22.04 LTS
reliable-bandicoot      Running           192.168.108.3    Ubuntu 22.04 LTS

~/workspaces/zd-2574241
❯ sudo ls -l /var/root/Library/Application\ Support/multipassd/qemu/vault/instances/primary/ubuntu-22.04-server-cloudimg-amd64.img
Password:
ls: /var/root/Library/Application Support/multipassd/qemu/vault/instances/primary/ubuntu-22.04-server-cloudimg-amd64.img: No such file or directory

~/workspaces/zd-2574241
❯ sudo ls -l /var/root/Library/Application\ Support/multipassd/qemu/vault/instances/primary/
ls: /var/root/Library/Application Support/multipassd/qemu/vault/instances/primary/: No such file or directory

~/workspaces/zd-2574241
❯ sudo ls -l /var/root/Library/Application\ Support/multipassd/qemu/vault/instances/
total 0
drwxr-xr-x  4 root  wheel  128  2  1 17:39 better-conger
drwxr-xr-x  4 root  wheel  128  2  8 12:52 reliable-bandicoot

ricab commented 6 months ago

Great, thanks for letting us know @kyanny! I will close this issue now, but if you have any insight into why the image file was seen as 10 PB, please let us know and reopen if necessary.