eugene-bright closed this 6 years ago
I agree. Multiple sources would make btrfs -> s3 too complicated, but could be a useful optimization for btrfs -> btrfs.
@AmesCornish, is S3 support crucial for you? For now buttersink is an indispensable tool, as it works only through direct ioctl manipulation (and thus may do things right). Due to buttersink's design limitations and growing technical debt, I'm thinking about starting my own project from scratch. Currently I do not have time for such a drastic action. But I would like to know what your core motivation and list of crucial features were when you started to work on buttersink.
For me S3 is #1, ssh syncs are #2, and the "original UUID" fix I wish were handled in btrfs itself. What are the key "design limitations" for you?
I'm not smart enough to dig into messy code, so clearly defined interfaces, inversion of control, and type annotations are a must-have for me. The weakest part of buttersink is serialization. It took me a day to pass one new attribute over SSH. It's also hard to debug a server started over SSH. Currently I can debug only the client side under ipdb. SSH is not a target for me at all, as I have full control over my backup server installation, so I would like to use state-of-the-art, well-defined RPC protocols. Btrfs <-> btrfs is number one for me, so I do not need 75% of the current code base, especially the snapshot base optimizations.
Could you tell me more about UUID fixing?
I've read the note in Butter.py but still can't grasp it fully.
What happens if patching is not performed?
When you use btrfs "send", it can avoid sending duplicate data only if the data is already present, in both the source and the destination, in a snapshot with the exact same original UUID. If the UUIDs are different in the source and destination referenced snapshots then the data chunks are resent. Specifically, if you try to receive a btrfs send and you don't have the exact required UUIDs present in the destination it will fail.
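To make the matching rule above concrete, here is a minimal toy model in Python. It is only a sketch of the idea, not buttersink's code or btrfs's actual implementation: the `Snapshot` class, its `original_uuid` field, and the helper names are all hypothetical, invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Snapshot:
    """Toy stand-in for a snapshot (hypothetical, not a btrfs or buttersink type)."""
    path: str
    original_uuid: str  # UUID the snapshot had where it was first created


def shared_parents(source: List[Snapshot], destination: List[Snapshot]) -> List[Snapshot]:
    """Return source snapshots whose original UUID is also present in the destination."""
    dest_uuids = {snap.original_uuid for snap in destination}
    return [snap for snap in source if snap.original_uuid in dest_uuids]


def plan_send(target: Snapshot, source: List[Snapshot], destination: List[Snapshot]) -> str:
    """Describe how `target` could be sent under this toy model."""
    candidates = [s for s in shared_parents(source, destination) if s != target]
    if candidates:
        # A snapshot with the same original UUID exists on both sides,
        # so only the difference against it needs to be transferred.
        return f"incremental send of {target.path} against {candidates[-1].path}"
    # No matching UUID on the destination: all data has to be sent again.
    return f"full send of {target.path}"
```

In this model, if the destination copy of a snapshot ends up with a different original UUID than the source copy, `shared_parents` returns nothing and every transfer falls back to a full send, which is the situation the UUID patching is meant to avoid.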
btrfs <--> btrfs should be improved with d25e71e
I'm personally interested in the btrfs <-> btrfs transfer scenario. btrfs-send has an option -c <clone-src> that allows an unlimited number of snapshots to be used as data sources for CoW. As I see it, this option provides many more opportunities for diff size optimization and simplifies the algorithms, since the FS does the job itself. Buttersink does not support the concept of multiple sources for now and probably never will, but I'm writing it here for further consideration.
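For illustration, a multi-source send could be assembled roughly like this. This is a sketch only: `build_send_command` and the snapshot paths are hypothetical, and the function just builds the argument list (one `-c` per clone source) without running anything.

```python
from typing import List, Sequence


def build_send_command(subvol: str, clone_sources: Sequence[str]) -> List[str]:
    """Build an argv list for `btrfs send`, passing one -c per clone source."""
    argv = ["btrfs", "send"]
    for src in clone_sources:
        # Each -c names a snapshot the sender may use as a CoW data source.
        argv += ["-c", src]
    argv.append(subvol)
    return argv


# Example with made-up paths:
# build_send_command("/snap/c", ["/snap/a", "/snap/b"])
```

The resulting list could be handed to `subprocess.run` with its output piped into `btrfs receive` on the other side; every clone source named here would need to exist on the destination as well.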