jsimpso opened 2 years ago
So the fact that LXD attempts the migration is normal. We never try to guess the disk space and instead just go ahead and if we're going to hit ENOSPC, then that's going to be our cue that there isn't enough space.
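This try-and-handle-ENOSPC approach can be exercised without filling a real pool by writing to the kernel's `/dev/full` device (my illustration, not LXD code), which rejects every write with ENOSPC:

```shell
# /dev/full fails every write with ENOSPC, so it is a handy stand-in for a
# full storage pool when testing "just try the write and handle the failure".
if ! dd if=/dev/zero of=/dev/full bs=1K count=1 2>err.txt; then
  grep -q 'No space left on device' err.txt && echo "hit ENOSPC"
fi
```

This is the same signal LXD relies on: rather than pre-computing whether the target has room, the copy is attempted and ENOSPC is the failure cue.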
The reason for that is that different storage backends store things very differently. You can easily do `cat /dev/zero > out.img` or the like on a VM stored on a `dir` storage backend, make it grow to 100GiB, then move it to a project that uses a ZFS storage pool, and ZFS will happily store that in something like 500MiB thanks to the default inline compression.
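The compression effect is easy to reproduce outside ZFS with any general-purpose compressor (gzip here as a rough stand-in for ZFS's transparent block-level compression; not an LXD command):

```shell
# Write 16 MiB of zeros, then compress the file.
dd if=/dev/zero of=zeros.img bs=1M count=16 status=none
gzip -k zeros.img
# The .gz copy is a few tens of KiB; ZFS's inline compression achieves a
# similar reduction transparently, which is why the same volume can need
# wildly different amounts of space on different backends.
ls -l zeros.img zeros.img.gz
```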
Now, these days a project move shouldn't actually cause any data to be copied, so we'd need to check on current LXD to see what's going on there. It's also obviously problematic if QEMU just crashes or stops responding in such a situation when it itself isn't supposed to be writing anything to disk.
Yeah, it seems like the target project flag is confusing the same-pool move detection somehow.
Right, so I've taken a look at what's going on here, and the behaviour is that when using instances on the `dir` pool, the storage subsystem invokes `CreateInstanceFromCopy`. This then detects that the source and target are in the same pool and switches to "same-pool mode", which then invokes `CreateVolumeFromCopy` on the storage driver.
In the case of a storage driver that supports optimized copies (snapshots) this is very quick and doesn't take much disk space as the copy is just a snapshot of the source instance (and then in the case of ZFS, the source instance volumes are kept around but hidden as they are still referenced by the new instance's name).
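As a crude analogy for why a snapshot-based copy is nearly free (hard links standing in for snapshots here; this is not how the drivers actually implement it):

```shell
# A hard link adds a second name for the same data, so the "copy" consumes
# no extra blocks -- much like a snapshot-based copy that only references
# the source volume's data.
dd if=/dev/zero of=vol.img bs=1M count=8 status=none
ln vol.img vol-copy.img
stat -c '%i %h %n' vol.img vol-copy.img   # same inode, link count 2
```

Like a ZFS snapshot keeping the source dataset referenced, the data stays on disk until the last name pointing at it is removed.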
In the case of the `dir` storage driver however, `CreateVolumeFromCopy` invokes `genericVFSCopyVolume`, which uses the `dd` command to copy the VM's root disk device.
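The consequence can be sketched with a plain `dd` on a sparse file (an illustration of the general behaviour, not the exact `genericVFSCopyVolume` invocation):

```shell
# Make a sparse 64 MiB "disk image": large apparent size, ~0 bytes allocated.
truncate -s 64M src.img
# Copy it with a plain dd, as the dir driver's generic copy path does.
dd if=src.img of=dst.img bs=1M status=none
# Compare allocated bytes: the copy is fully materialised on disk even
# though the source occupied almost nothing.
du --block-size=1 src.img dst.img
```

So copying a VM whose virtual disk is larger than the pool's free space will run the pool out of space even if the source barely uses any of it.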
Basically the problem here is that there isn't a "move" concept in the storage subsystem (https://github.com/lxc/lxd/blob/632b3839b3388cc89df0ffe96d7c7c9134f00a7b/lxd/storage/pool_interface.go#L39-L145), so the `lxc move` is effectively just a `lxc copy` followed by a `lxc delete`.
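In shell terms, the effective behaviour is something like the following sketch (plain files standing in for instances, and hypothetical helper names, not LXD's actual Go code):

```shell
# With no storage-level "move", a cross-project move is a full copy
# followed by a delete, so it briefly needs space for two complete
# copies of the volume.
copy_instance()   { cp "$1" "$2"; }
delete_instance() { rm -f "$1"; }

move_instance() {
  local src=$1 dst=$2
  copy_instance "$src" "$dst" || return 1  # fails with ENOSPC on a full pool
  delete_instance "$src"                   # space is only freed afterwards
}

echo "instance data" > vm1
move_instance vm1 vm2
```

The peak space requirement is therefore source plus copy, which is exactly what bites when the VM's disk is larger than the pool's free space.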
We can see this where it's implemented on the server side:
So I'm going to put this down as a feature (& maybe), as it's not a bug per se.
Required information
Issue description
In a circumstance where the size of a VM's disk is larger than the amount of available space in the LXD storage backend, trying to move that VM between projects results in LXD trying to make a full copy of the VM and failing:
If other VMs are attempting to write to that storage backend when it fills up, they appear to stall, and LXD loses communication with them. Once the qemu process has been killed, LXD can re-launch the VM as expected.
Steps to reproduce
Have two projects present so that we have something to move between
Having a storage backend with relatively little storage will make for easier testing, I tested with a 120GB disk.
Initialise a few VMs, give one a bigger disk:
Grow the VM disk:
You should now have a VM whose disk is larger than the amount of available storage in the backend:
Open a connection to each running VM (`lxc exec <vm> bash`). On two of the VMs, start slowly writing to disk. Then, move the large VM between projects:
During the move, the two VMs attempting to write to disk become fully unresponsive, but the last one continues to respond to input.
We can then see that the qemu process is still running, but LXD has lost contact with it:
Killing the process returns the console:
And the VM can be started up again: