AmesCornish / buttersink

Buttersink is like rsync for btrfs snapshots
GNU General Public License v3.0

Transfer size indication is broken #26

Closed a3nm closed 8 years ago

a3nm commented 8 years ago

I invoked buttersink to transfer snapshots from one filesystem to another as follows:

$ sudo ./buttersink.py /mnt/mem/snapshots/ /mnt/fah/BACKUP/snapshots/mem/

The current partial output is:

  Optimal synchronization:
  2.4 TiB from 2 diffs in btrfs /mnt/fah/BACKUP/snapshots/mem
  1.082 TiB from 374 diffs in btrfs /mnt/mem/snapshots
  3.482 TiB from 376 diffs in TOTAL
  Keep: aeda...1b4d /mnt/fah/BACKUP/snapshots/mem/btrfs-mem-backup-1459375237 from None (1.2 TiB)
  Keep: fcf5...a56e /mnt/fah/BACKUP/snapshots/mem/btrfs-mem-backup-1459374875 from None (1.2 TiB)
  Xfer: 7b69...2383 /mnt/mem/snapshots/snapshot-1460156461 from aeda...1b4d /mnt/mem/snapshots/btrfs-mem-backup-1459375237 (~439.2 MiB)
 0:13:42.781178: Sent 5.524 GiB of 439.2 MiB (1287%) ETA: None (57.7 Mbps )                           
  Xfer: 419b...5bcf /mnt/mem/snapshots/snapshot-1459512061 from aeda...1b4d /mnt/mem/snapshots/btrfs-mem-backup-1459375237 (~107.4 MiB)
 0:04:09.215211: Sent 5.139 GiB of 107.4 MiB (4901%) ETA: None (177 Mbps )                            
  Xfer: e897...4e4a /mnt/mem/snapshots/snapshot-1460250061 from aeda...1b4d /mnt/mem/snapshots/btrfs-mem-backup-1459375237 (~4.398 GiB)

The percentages end up being larger than 100%, so something is wrong somewhere. (The percentages get incremented and go over 100% while progress information is being displayed.)

(I should also point out that buttersink's heuristics do not seem to perform well in this case. The snapshots here were created by a cron job, and each diff is much smaller than the total volume size, so I think the optimal plan would be to transfer the diffs successively, each against the previous snapshot. However, buttersink is apparently trying to transfer each snapshot as a delta against a much older state of the data, because that older state already exists on the destination filesystem.)

AmesCornish commented 8 years ago

Yes, the percentages can be arbitrarily larger than 100%. Buttersink uses heuristics to guess the source diff size, and sometimes (often?) the guess is incorrect. If you are transferring remotely (e.g. over ssh), buttersink will perform a local diff ahead of time, to get a correct size before determining the optimal transfers. If you are transferring locally, it just uses the guesses, since measuring the actual source size would take just as long as doing the transfer.

One tip: you can force buttersink to do the pre-measuring if you use ssh://localhost/path as your destination.
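
To make the explanation above concrete, here is a minimal sketch of the arithmetic behind the displayed percentage. It assumes the percentage is simply bytes sent divided by the estimated size; `percent_of_estimate` is a hypothetical helper written for this illustration, not a function taken from buttersink. The figures are copied from the two completed transfers in the report.

```python
# Binary multiples of a byte, matching the MiB/GiB units in the output.
MIB = 1024 ** 2
GIB = 1024 ** 3


def percent_of_estimate(sent_bytes, estimated_bytes):
    """Return bytes sent as a percentage of the estimated size.

    Hypothetical helper: nothing caps the result at 100, so a low
    estimate yields an arbitrarily large percentage.
    """
    return 100.0 * sent_bytes / estimated_bytes


# "Sent 5.524 GiB of 439.2 MiB" and "Sent 5.139 GiB of 107.4 MiB"
first = percent_of_estimate(5.524 * GIB, 439.2 * MIB)
second = percent_of_estimate(5.139 * GIB, 107.4 * MIB)
print(f"{first:.0f}% {second:.0f}%")
```

Both results land within a few percentage points of the 1287% and 4901% shown in the report; the small differences are consistent with the displayed sizes being rounded. In both cases the guess was more than ten times too small.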

a3nm commented 8 years ago

Thanks for the explanations! I understand better now, but I still wonder whether the display could be improved so that it is less confusing and does not look like a bug to the user. For instance, display "of estimated" instead of "of" when using estimated sizes, and drop the ETA (and maybe the percentage itself) when the percentage is above 100%?
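
The display suggestion above could be sketched roughly as follows. This is only an illustration of the proposed behaviour, not buttersink's actual code: `humanize` and `progress_line`, along with their parameters, are hypothetical names introduced here.

```python
def humanize(num_bytes):
    """Format a byte count with binary units and 4 significant digits."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if size < 1024 or unit == "TiB":
            return f"{size:.4g} {unit}"
        size /= 1024


def progress_line(sent, total, estimated, eta=None):
    """Build a progress line (hypothetical helper).

    - Says "of estimated" when the total is only a guess.
    - Omits the percentage and the ETA once more has been sent than
      the total, because both are meaningless at that point.
    """
    of = "of estimated" if estimated else "of"
    line = f"Sent {humanize(sent)} {of} {humanize(total)}"
    if total > 0 and sent <= total:
        line += f" ({int(100 * sent / total)}%)"
        if eta is not None:
            line += f" ETA: {eta}"
    return line


MIB = 1024 ** 2
print(progress_line(50 * MIB, 100 * MIB, estimated=False, eta="0:01:00"))
print(progress_line(200 * MIB, 100 * MIB, estimated=True))
```

With this shape, a transfer that overruns its estimate keeps reporting how much has been sent, but no longer shows a percentage above 100% or an ETA of None.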

Regarding the parenthetical remark at the end of my report, do you have any idea why buttersink is not picking the right parents at all? Is this related to the diff estimates being off? (I can open a separate issue with this same report if you like.)

Thanks again!

Antoine Amarilli

(Sent by email in reply to Ames's comment of Mon, May 09, 2016 at 11:29:38AM -0700, which appears in full above.)