canonical / lxd

Powerful system container and virtual machine manager
https://canonical.com/lxd
GNU Affero General Public License v3.0

lxc copy --refresh deletes latest snapshot and resends it even when trees are already in sync. ZFS storage backend #14472

Open manfromafar opened 19 hours ago

manfromafar commented 19 hours ago

Required information

Issue description

When sending a copy of a container to another host with lxc copy --refresh, with a ZFS backend on both systems, the latest snapshot is deleted on the remote and resent even if the snapshots are identical.

This can cause large amounts of unnecessary network traffic and disk I/O.

One way to resolve this would be to check the ZFS guid property of the snapshots on both hosts and compare them before sending any snapshots. The guid of a snapshot can't change, since snapshots are read-only. If the latest snapshot matches on both sides, then no operations are needed. This might have to be a new flag to specify that you only want to check snapshot consistency rather than the live data as well. Something like --snaps-only or a better name.
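A minimal sketch of the comparison proposed above, not how LXD implements refresh. It assumes the snapshot guids have been collected on each host with something like `zfs get -H -o name,value -t snapshot guid`, which prints tab-separated `name<TAB>value` lines. The function names `parse_guids` and `snapshots_to_send` and the dataset names used with them are hypothetical.

```python
from __future__ import annotations


def parse_guids(zfs_output: str) -> dict[str, str]:
    """Map each snapshot's short name (the part after '@') to its guid.

    Expects tab-separated "name<TAB>guid" lines, as printed by
    `zfs get -H -o name,value -t snapshot guid` (assumed input format).
    """
    guids: dict[str, str] = {}
    for line in zfs_output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, guid = line.split("\t")
        guids[name.split("@", 1)[1]] = guid.strip()
    return guids


def snapshots_to_send(source: dict[str, str], target: dict[str, str]) -> list[str]:
    """Return the source snapshots whose guid is missing or different on the target."""
    return [snap for snap, guid in source.items() if target.get(snap) != guid]
```

If `snapshots_to_send` returns an empty list, every snapshot on the source already exists on the target with the same guid, so nothing would need to be deleted or resent.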

Steps to reproduce

  1. Create a container (c1) on one system (sys1)
  2. Snapshot c1 a couple of times with names like 1, 2
    a. lxc snapshot c1 1
    b. lxc snapshot c1 2
  3. Write arbitrary data into c1 so that the transfer is visible
    a. lxc exec c1 -- dd if=/dev/urandom of=/root/dd.img bs=1M count=1000
  4. Snapshot c1 as snapshot 3
    a. lxc snapshot c1 3
  5. Copy c1 to sys2 (this assumes you have the remotes set up)
    a. lxc copy c1 sys2:c1
  6. Verify the snapshots exist on sys2
    a. lxc info c1 should show snapshots 1, 2, 3
    b. zfs list -t snapshot should show the snapshots for c1 (assumes your pool isn't managed by lxd)
  7. Now that the trees are synced, do a lxc copy --refresh from sys1 to sys2 for c1
    a. lxc copy --refresh c1 sys2:c1
  8. If you monitor zfs on sys2 you'll notice that:
    1. snap 3 is deleted off sys2
    2. sys1 zfs sends snap3 back to sys2
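The commands from steps 2 to 5 and 7 can be collected into a small script so the reproduction is repeatable. This is only a sketch: it assumes the container c1 already exists on sys1 and that a remote named sys2 is configured, and the helper names `repro_commands` and `run` are hypothetical. By default it only prints the commands.

```python
import subprocess


def repro_commands(container="c1", remote="sys2"):
    """Build the lxc commands from the reproduction steps, in order."""
    target = f"{remote}:{container}"
    return [
        ["lxc", "snapshot", container, "1"],
        ["lxc", "snapshot", container, "2"],
        # Write about 1 GB of random data so the later transfer is easy to spot.
        ["lxc", "exec", container, "--", "dd", "if=/dev/urandom",
         "of=/root/dd.img", "bs=1M", "count=1000"],
        ["lxc", "snapshot", container, "3"],
        ["lxc", "copy", container, target],
        # The refresh that unexpectedly deletes and resends snapshot 3.
        ["lxc", "copy", "--refresh", container, target],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it as well when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
```

Calling `run(repro_commands())` prints the command sequence, and `run(repro_commands(), dry_run=False)` executes it on sys1. The snapshot activity on sys2 still has to be watched separately, as described in step 8.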

Information to attach

tomponline commented 19 hours ago

Please may you update your reproducer steps with the exact lxc snapshot command you are using.

Also please can you confirm this issue isn't fixed in the latest/edge channel? Thanks

manfromafar commented 19 hours ago

OK, just tried it with edge (lxd git-ba31c87 31205 latest/edge canonical✓) and the same behavior occurs. Snap 3 is deleted then resent.

Updated the reproduction steps to include the snapshot commands. It's using the plain lxc snapshot commands.