Poor heuristic choices for sending incremental diffs

crusaderky commented 6 years ago

I have a read-write "current" subvolume, from which I create read-only snapshots every day. If I didn't do anything significant on a given day, btrfs send <today> -p <yesterday> produces a stream of only a few kilobytes.

buttersink doesn't seem to realise this, and tries to do extremely expensive transfers from a much older snapshot.

To another btrfs hard disk:

# buttersink   -n /btrfs/crusaderky/ /mnt/ext_hdd/crusaderky/
  Waiting for btrfs quota usage scan...
  Optimal synchronization:
  36.92 GiB from 3 diffs in btrfs /btrfs/crusaderky
  452.5 GiB from 1 diffs in btrfs /mnt/ext_hdd/crusaderky
  489.4 GiB from 4 diffs in TOTAL
  Keep: ca37...2c4b /mnt/ext_hdd/crusaderky/20170902-130702 from None (452.5 GiB)
  WOULD: Xfer: f970...da46 /btrfs/crusaderky/20171218-223900 from ca37...2c4b /btrfs/crusaderky/20170902-130702 (9.201 GiB)
  WOULD: Xfer: de7c...accf /btrfs/crusaderky/20180104-000001 from ca37...2c4b /btrfs/crusaderky/20170902-130702 (13.86 GiB)
  WOULD: Xfer: 5d72...eef2 /btrfs/crusaderky/20180103-013041 from ca37...2c4b /btrfs/crusaderky/20170902-130702 (13.86 GiB)

To s3:

# buttersink   -n /btrfs/crusaderky/ s3://crusaderky-buttersink/crusaderky/
  Listing S3 Bucket "crusaderky-buttersink" contents...
  measured size (27.72 GiB), estimated size (27.72 GiB)
  Optimal synchronization:
  462.9 GiB from 2 diffs in S3 Bucket "crusaderky-buttersink"
  27.72 GiB from 2 diffs in btrfs /btrfs/crusaderky
  490.7 GiB from 4 diffs in TOTAL
  Keep: ca37...2c4b /crusaderky/20170902-130702 from None (453.7 GiB)
  Keep: f970...da46 /crusaderky/20171218-223900 from ca37...2c4b /crusaderky/20170902-130702 (9.201 GiB)
  WOULD: Xfer: de7c...accf /btrfs/crusaderky/20180104-000001 from ca37...2c4b /btrfs/crusaderky/20170902-130702 (13.86 GiB)
  WOULD: Xfer: 5d72...eef2 /btrfs/crusaderky/20180103-013041 from ca37...2c4b /btrfs/crusaderky/20170902-130702 (13.86 GiB)

In the above situation:

the send of 20180103-013041 should use 20171218-223900 as its parent (4.7 GB), not 20170902-130702 (13.86 GiB).
the send of 20180104-000001 should use 20180103-013041 as its parent (< 1 MB), not 20170902-130702 (13.86 GiB).
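
To make the expected plan concrete, here is a rough Python sketch of the selection I have in mind. It is not buttersink code: the DIFF_GIB table and the cheapest_parent and plan helpers are hypothetical names, the sizes are simply the figures quoted above ("< 1 MB" approximated as 0.001 GiB, 4.7 GB treated as roughly 4.7 GiB), and it assumes every snapshot that has finished transferring can immediately serve as a parent.

```python
# Hypothetical diff-size estimates in GiB, keyed by (parent, child).
# Values are the figures quoted in this report, not measured here.
DIFF_GIB = {
    ("20170902-130702", "20171218-223900"): 9.201,
    ("20170902-130702", "20180103-013041"): 13.86,
    ("20170902-130702", "20180104-000001"): 13.86,
    ("20171218-223900", "20180103-013041"): 4.7,
    ("20180103-013041", "20180104-000001"): 0.001,
}


def cheapest_parent(child, available):
    """Return (parent, size) with the smallest estimated diff to child."""
    candidates = [
        (DIFF_GIB[(parent, child)], parent)
        for parent in available
        if (parent, child) in DIFF_GIB
    ]
    size, parent = min(candidates)
    return parent, size


def plan(snapshots, already_present):
    """Plan transfers in chronological order, growing the parent pool."""
    available = set(already_present)
    steps = []
    for snap in sorted(snapshots):  # timestamped names sort by date
        if snap in available:
            continue
        parent, size = cheapest_parent(snap, available)
        steps.append((snap, parent, size))
        available.add(snap)  # a finished transfer becomes a usable parent
    return steps


if __name__ == "__main__":
    missing = ["20171218-223900", "20180103-013041", "20180104-000001"]
    for snap, parent, size in plan(missing, ["20170902-130702"]):
        print(f"Xfer: {snap} from {parent} ({size} GiB)")
```

With these figures the three transfers add up to roughly 13.9 GiB, against the 36.92 GiB in the dry run above.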

RandomReaper commented 6 years ago

:+1: Same problem here.

AmesCornish commented 6 years ago

It's a bit hard to diagnose this without having the snapshots. Note that the heuristic considers factors other than the diff size, including how "tall" the diff stack is on the destination. For example, it won't create a thousand one-day diffs each depending on the previous one, because if any one of those thousand goes bad, you lose the whole thing. Buttersink is designed to occasionally diff from an "old" snapshot, so that your diff repo is more reliable.
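
As a rough illustration of that trade-off only (this is not the actual buttersink heuristic, and chain_depth, pick_parent and max_depth are made-up names for this sketch), a depth-capped choice could look something like this in Python:

```python
def chain_depth(snap, parent_of):
    """Number of diffs that must be applied to restore snap."""
    depth = 0
    while snap is not None:
        depth += 1
        snap = parent_of[snap]
    return depth


def pick_parent(candidates, parent_of, max_depth):
    """Pick the cheapest parent whose chain stays within max_depth.

    candidates maps parent name -> estimated diff size; the key None
    stands for a full (non-incremental) send and is always allowed.
    parent_of maps each stored snapshot to its parent (None if full).
    """
    allowed = {
        parent: size
        for parent, size in candidates.items()
        if parent is None or chain_depth(parent, parent_of) < max_depth
    }
    return min(allowed, key=allowed.get)


if __name__ == "__main__":
    # Hypothetical destination: "a" is a full copy, "b" and "c" are diffs.
    parent_of = {"a": None, "b": "a", "c": "b"}
    candidates = {None: 450.0, "a": 13.86, "c": 0.001}
    print(pick_parent(candidates, parent_of, max_depth=3))   # older parent
    print(pick_parent(candidates, parent_of, max_depth=10))  # newest parent
```

With a small max_depth the sketch falls back to the older, more expensive parent; with a large one it always picks the smallest diff.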

In any event, I can see that it would at least be helpful to make the heuristic process more transparent, and maybe give some options for tweaking it. I'll leave this bug open to address that.

RandomReaper commented 6 years ago

Don't you think sending the snapshots in the order they are taken should be sufficient? I mean, if my source disk has enough space for storing all snapshots, a destination disk of the same size will suffice for storing them, and this is clearly not the case using the current algorithm.
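
Something like the following Python sketch is what I mean by "in the order they are taken". It only builds the btrfs send command lines and does not run anything; the send_commands helper is a hypothetical name, and it assumes the timestamped snapshot names sort chronologically and that every earlier snapshot is still present on the source.

```python
import os


def send_commands(source_dir, snapshots):
    """Build `btrfs send` argument lists, oldest snapshot first."""
    commands = []
    previous = None
    for name in sorted(snapshots):  # names like 20180103-013041 sort by date
        path = os.path.join(source_dir, name)
        if previous is None:
            commands.append(["btrfs", "send", path])  # full send
        else:
            # incremental send against the snapshot taken just before
            commands.append(["btrfs", "send", "-p", previous, path])
        previous = path
    return commands


if __name__ == "__main__":
    names = ["20180104-000001", "20170902-130702", "20180103-013041"]
    for cmd in send_commands("/btrfs/crusaderky", names):
        print(" ".join(cmd))
```

Each stream would then be piped into btrfs receive on the destination disk.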

AmesCornish commented 6 years ago

Indeed. My comment should only be relevant when S3 is the destination. I'll investigate further.

eugene-bright commented 6 years ago

The base for diffs should be updated once any snapshot transfer is finished.

eugene-bright commented 6 years ago

See my extra note on optimizations in #58.

AmesCornish commented 5 years ago

The case of transferring into a btrfs system should be addressed in d25e71e. "Tall" diff chains will only be avoided for S3, which is storing diffs.

AmesCornish / buttersink

Poor heuristic choices for sending incremental diffs #50