Closed by adam-koblentz 6 years ago.
Can you upgrade both your LXD systems to the latest LXD? That should at the very least give you a better error message.
apt install -t zesty-backports lxd lxd-client
That assumes you have the backports pocket enabled on those systems. That will get you LXD 2.16.
@stgraber
Here's the output after upgrading both sides:
$ lxc copy serrano:actor-desktop actor-desktop
error: Failed container creation:
- https://172.16.79.3:8443: Error transferring container data: exit status 11
Hmm, ok, so that's not that much more useful is it :)
Can you look at /var/log/lxd/lxd.log on the source and target, see if there are any errors in there that would be a bit more useful than "exit status 11"?
@stgraber
I rebooted both machines after updating to the newest version before running this test and these are the log messages from just this command:
$ lxc copy serrano:actor-desktop actor-desktop
I verified that the local (minimal) has > 35GB of disk free, and I was able to copy two other containers before I started having this error. Also, just as a test, I tried running this both as root and my normal user (in the lxd group). Same results.
Here's the contents from the remote (serrano):
lvl=eror msg="Rsync send failed: /var/lib/lxd/containers/actor-desktop/: exit status 11: rsync: write failed on \"/var/lib/lxd/containers/actor-desktop/rootfs/usr/include/readline/tilde.h\": No space left on device (28)\nrsync error: error in file IO (code 11) at receiver.c(393) [receiver=3.1.2]\n" t=2017-08-10T14:31:57-0400
Here's the contents from the local (minimal):
ephemeral=false lvl=info msg="Creating container" name=actor-desktop t=2017-08-10T14:31:34-0400
ephemeral=false lvl=info msg="Created container" name=actor-desktop t=2017-08-10T14:31:34-0400
lvl=warn msg="Unable to update backup.yaml at this time." name=actor-desktop t=2017-08-10T14:31:34-0400
lvl=eror msg="Rsync receive failed: /var/lib/lxd/containers/actor-desktop/: exit status 11: " t=2017-08-10T14:31:57-0400
err="exit status 11" lvl=eror msg="Error during migration sink" t=2017-08-10T14:31:57-0400
created=2017-08-10T18:31:34+0000 ephemeral=false lvl=info msg="Deleting container" name=actor-desktop t=2017-08-10T14:31:57-0400 used=1970-01-01T00:00:00+0000
created=2017-08-10T18:31:34+0000 ephemeral=false lvl=info msg="Deleted container" name=actor-desktop t=2017-08-10T14:31:58-0400 used=1970-01-01T00:00:00+0000
Can you paste "df -h" and "df -i" from serrano?
Confusingly, "out of disk space" type errors also happen if you run out of inodes, not only if you run out of space.
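Since rsync reports both conditions as the same ENOSPC (errno 28) error, a quick way to tell them apart is to look at block and inode usage side by side. A minimal sketch, assuming GNU coreutils df; the path is an assumption and defaults to / here (on these hosts it would be /var/lib/lxd):

```shell
# Check both kinds of "No space left on device": data blocks and inodes.
# Either one being exhausted produces the same rsync write error.
target="${1:-/}"
space=$(df -h --output=source,size,avail,pcent "$target")
inodes=$(df --output=source,itotal,iavail,ipcent "$target")
printf '%s\n\n%s\n' "$space" "$inodes"
```

If the space columns look healthy but IUse% is at 100%, the inodes are the problem, not the bytes.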
My vmware config for minimal has a 60GB disk, I just verified in gparted that it's all allocated correctly. My vmware config for serrano has a 70GB disk, also verified in gparted.
Here's my local (the copying-to vm) named minimal:
$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 3.9G 0 3.9G 0% /dev
tmpfs 797M 12M 785M 2% /run
/dev/mapper/minimal--vg-root 12G 11G 724M 94% /
tmpfs 3.9G 0 3.9G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
tmpfs 100K 0 100K 0% /var/lib/lxd/shmounts
tmpfs 100K 0 100K 0% /var/lib/lxd/devlxd
tmpfs 797M 0 797M 0% /run/user/1000
$ df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
udev 1013604 458 1013146 1% /dev
tmpfs 1019275 1403 1017872 1% /run
/dev/mapper/minimal--vg-root 786432 264584 521848 34% /
tmpfs 1019275 1 1019274 1% /dev/shm
tmpfs 1019275 3 1019272 1% /run/lock
tmpfs 1019275 16 1019259 1% /sys/fs/cgroup
tmpfs 1019275 1 1019274 1% /var/lib/lxd/shmounts
tmpfs 1019275 2 1019273 1% /var/lib/lxd/devlxd
tmpfs 1019275 5 1019270 1% /run/user/1000
Here's my remote (copying-from vm) named serrano:
$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 3.9G 0 3.9G 0% /dev
tmpfs 797M 13M 784M 2% /run
/dev/sda1 69G 44G 22G 68% /
tmpfs 3.9G 12K 3.9G 1% /dev/shm
tmpfs 5.0M 4.0K 5.0M 1% /run/lock
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
tmpfs 100K 0 100K 0% /var/lib/lxd/shmounts
tmpfs 100K 0 100K 0% /var/lib/lxd/devlxd
tmpfs 797M 132K 797M 1% /run/user/1000
$ df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
udev 1013570 447 1013123 1% /dev
tmpfs 1019276 1493 1017783 1% /run
/dev/sda1 4587520 1107356 3480164 25% /
tmpfs 1019276 4 1019272 1% /dev/shm
tmpfs 1019276 6 1019270 1% /run/lock
tmpfs 1019276 16 1019260 1% /sys/fs/cgroup
tmpfs 1019276 1 1019275 1% /var/lib/lxd/shmounts
tmpfs 1019276 2 1019274 1% /var/lib/lxd/devlxd
tmpfs 1019276 73 1019203 1% /run/user/1000
Also here's output from fdisk on minimal:
$ sudo fdisk -l
Disk /dev/sda: 60 GiB, 64424509440 bytes, 125829120 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x0e90b202
Device Boot Start End Sectors Size Id Type
/dev/sda1 * 2048 125829119 125827072 60G 8e Linux LVM
Disk /dev/mapper/minimal--vg-root: 12 GiB, 12880707584 bytes, 25157632 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/mapper/minimal--vg-swap_1: 8 GiB, 8589934592 bytes, 16777216 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Ok, so inode count and disk space look good; not sure why rsync is running out of space then...
How long does it take before you hit the error during transfer? Do you see some progress information during the transfer? LXD 2.16 should show you how much data has been transferred.
I see about 1.29GB transfers before the error happens.
Can you run "du -sch /var/lib/lxd/containers/actor-desktop/" and then "du -sch --apparent-size /var/lib/lxd/containers/actor-desktop/"?
This checks for sparse files, which can cause problems during rsync by taking up their fully expanded size on the target, even if only temporarily.
Sure, here's the output:
root@serrano:~# du -sch --apparent-size /var/lib/lxd/containers/actor-desktop/
3.2G /var/lib/lxd/containers/actor-desktop/
3.2G total
root@serrano:~# du -sch /var/lib/lxd/containers/actor-desktop/
3.4G /var/lib/lxd/containers/actor-desktop/
3.4G total
So it looks like it's transferring a bit over a third (about 1.3GB of 3.4GB) before dying.
The container is based on a centos7 image with a few of my company's apps installed on it, so the size isn't surprising to me.
Yeah, that looks fine, so it's unlikely to be a sparse file issue.
Any chance you can watch the "df -h" and "df -i" output from the target host as you try to transfer the container? See if either gets dangerously close to running out just before the transfer fails?
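One way to sample both numbers on the target while the copy runs. A sketch only: the path is an assumption (on these hosts it would be /var/lib/lxd, defaulting to / here), and the loop is finite for illustration, where in practice it would run for the duration of the transfer:

```shell
# Sample free space and free inodes for the filesystem holding the
# container store; run this on the target host during `lxc copy`.
path="${1:-/}"
i=0
log=""
while [ "$i" -lt 3 ]; do
    # avail = free 1K blocks, iavail = free inodes
    line=$(df --output=avail,iavail "$path" | tail -n 1)
    log="$log|$line"
    echo "$line"
    sleep 1
    i=$((i + 1))
done
```

Whichever column races toward zero just before the failure is the culprit.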
Okay, it looks like /dev/mapper/minimal--vg-root gets full, and that's what kills it. It hits 100% and then the copy fails.
I have my VM set to dynamically expand its virtual disk on my physical machine's disk as needed, which is Fusion's default behavior.
But you're copying away from this system aren't you?
Copying to this system. Both are VMs in my Fusion environment on my Mac.
I can run watch on the remote VM as well.
Ah, ok, so trying to copy a 3.5GB container to a system with just 724MB of free space then? yeah, that's not gonna work :)
Yeah, I'm just noticing that Ubuntu's LVM setup didn't pick up my previous changes. I'll fix that and try again. Sorry!
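For reference, the usual fix on a stock Ubuntu LVM layout is to grow the root logical volume into the volume group's free space and then grow the filesystem to match. This is a hedged sketch only: the VG/LV names below are inferred from the df output above (minimal-vg / root) and the ext4 filesystem type is an assumption, so verify with vgs and lvs first.

```shell
# Sketch for a stock Ubuntu LVM install; names below are assumptions.
# Bail out gracefully if LVM tools aren't present on this system.
command -v lvextend >/dev/null || { echo "LVM tools not installed"; exit 0; }

sudo vgs minimal-vg                              # show free space in the VG
sudo lvextend -l +100%FREE /dev/minimal-vg/root  # grow the LV into it
sudo resize2fs /dev/mapper/minimal--vg-root      # grow ext4 online to match
```

After this, df -h should show the root filesystem at its full size rather than 12G.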
Cool, that all makes sense :) Closing this issue for now, feel free to comment if you run into other problems.
Required information
Issue description
I have 2 zesty VMs running in VMware Fusion on my MacBook. I am trying to copy containers from one to the other. I followed the instructions in the blog post "LXD 2.0: Remote hosts and container migration".
Two of my containers copied fine, but subsequent containers are failing.
I am getting this error message:
In a previous issue, I saw someone mention that they worked around another remote problem by fully qualifying the IP in the remote config, which I have also done.
On both machines I ran these two commands:
and added the remote to the container-less VM.
Here is the output of lxc remote list:

and here is the truncated output of lxc list serrano:

My copy command:

lxc copy serrano:actor-desktop actor-desktop

and the output:

error: Migration failed on source host: exit status 11