koverstreet / bcachefs-tools

http://bcachefs.org
GNU General Public License v2.0

Consider implementing `bcachefs backup` #50

Open rzerres opened 3 years ago

rzerres commented 3 years ago

Rationale / feature request

Now that snapshots have landed in bcachefs, this issue arises quite naturally from the point of view of administering a bcachefs-driven production system.

Imagine the common scenario:

bcachefs already offers all the needed building blocks; what is conceptually missing is the backup strategy.

A reference

Coming from a btrfs environment, I personally use dsnap-sync. It is a simple tool implemented in POSIX shell that uses btrfs-send/btrfs-receive to transfer snapshots and snapper to manage the needed snapshots. snapper is a C++ service/library that organizes snapshots by building attributed configs.

snapper

The key feature of snapper is its ability to define config-specific timer and cleanup rules. These rules manage the retention policy that applies to a config, e.g. create hourly snapshots during the day, keep one snapshot per day (max 6), one per week (max 3), one per month (max 11), and one per year (max 2). snapper itself uses btrfs read-only snapshots as its target objects. It will transfer incremental data differences using btrfs-send and btrfs-receive. The takeaway: snapper implements a subsystem that offers continuous snapshots.
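
To make the retention rules above concrete, here is a minimal sketch of bucket-based cleanup in Python. It is not snapper's actual algorithm: the names `LIMITS`, `bucket_key` and `select_keep` are hypothetical, and keeping the newest snapshot of each period is an assumption made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical limits mirroring the example rules in the text.
LIMITS = {"daily": 6, "weekly": 3, "monthly": 11, "yearly": 2}


def bucket_key(ts, period):
    """Map a timestamp to the calendar bucket it falls into."""
    if period == "daily":
        return (ts.year, ts.month, ts.day)
    if period == "weekly":
        iso = ts.isocalendar()
        return (iso[0], iso[1])  # ISO year and ISO week number
    if period == "monthly":
        return (ts.year, ts.month)
    if period == "yearly":
        return (ts.year,)
    raise ValueError(f"unknown period: {period}")


def select_keep(timestamps, limits=LIMITS):
    """Return the set of snapshot timestamps to keep.

    For every period, the newest snapshot of each of the most recent
    `limit` buckets is kept; everything else is a cleanup candidate.
    """
    keep = set()
    ordered = sorted(timestamps, reverse=True)  # newest first
    for period, limit in limits.items():
        seen = set()
        for ts in ordered:
            key = bucket_key(ts, period)
            if key in seen:
                continue  # a newer snapshot already represents this bucket
            if len(seen) == limit:
                break  # enough buckets covered for this period
            seen.add(key)
            keep.add(ts)
    return keep


if __name__ == "__main__":
    # Ten days of hourly snapshots, starting 2024-01-01 00:00.
    snaps = [datetime(2024, 1, 1) + timedelta(hours=h) for h in range(240)]
    for ts in sorted(select_keep(snaps)):
        print(ts)
```

A filesystem-native backup command could apply the same kind of rule directly to its own snapshot list, instead of relying on an external service for it.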

dsnap-sync

The development of snapper has never evolved to include a backup command that enables transfer to remote targets. This is a pity. A feature request has been open for almost two years. Others have requested that kind of functionality, too (feature-request).

snapper saves the snapshots of a subvolume inside a hardcoded subdirectory (.snapshots). Metadata is written to an XML file beside each snapshot. Sadly, dsnap-sync is needed to fill the gap of transferring snapper-managed snapshots to a remote target. It codes the procedures needed to implement disk2disk2tape.

A typical dsnap-sync disk2disk run scans for the last snapshot on the source and transfers the latest delta to the target (using btrfs-send and btrfs-receive, combined with ssh). It finds the relevant snapshots using snapper-managed metadata (parsed with awk). If the target, which obviously needs to offer a valid btrfs filesystem/subvolume, is missing the snapper-like structure, dsnap-sync creates this structure on the fly. Then the data transfer itself is issued. The next disk2disk job only transfers the delta between the latest source snapshot and the latest target snapshot. The goal: minimise data volume, since we can calculate the differences between the snapshots on source and on target.
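
The disk2disk logic described above can be sketched roughly as follows. This is not dsnap-sync's code: `plan_transfer` and `build_pipeline` are hypothetical helpers, snapshots are identified by plain integers, and the `<num>/snapshot` directory layout as well as the host name `backup` are assumptions. The script only builds the command line and does not execute it.

```python
import shlex


def plan_transfer(source_snaps, target_snaps):
    """Decide what to send.

    Returns None if there is nothing to do, otherwise a tuple
    (parent, snapshot). parent is the newest snapshot present on both
    sides, or None when a full (non-incremental) send is required.
    """
    if not source_snaps:
        return None
    latest = max(source_snaps)
    if latest in target_snaps:
        return None  # target already holds the latest snapshot
    common = set(source_snaps) & set(target_snaps)
    parent = max(common) if common else None
    return parent, latest


def build_pipeline(src_dir, dst_host, dst_dir, parent, snapshot):
    """Build a 'btrfs send | ssh ... btrfs receive' command line."""
    send = ["btrfs", "send"]
    if parent is not None:
        # -p names the parent snapshot for an incremental stream
        send += ["-p", f"{src_dir}/{parent}/snapshot"]
    send.append(f"{src_dir}/{snapshot}/snapshot")
    receive = ["ssh", dst_host, "btrfs", "receive", f"{dst_dir}/{snapshot}"]
    return (
        " ".join(shlex.quote(arg) for arg in send)
        + " | "
        + " ".join(shlex.quote(arg) for arg in receive)
    )


if __name__ == "__main__":
    plan = plan_transfer([1, 2, 3, 5], [1, 2, 3])
    if plan is not None:
        parent, snapshot = plan
        print(build_pipeline("/data/.snapshots", "backup",
                             "/backup/.snapshots", parent, snapshot))
```

The point of the sketch is the parent selection: only a snapshot that exists on both source and target can serve as the base of an incremental stream, which is why the target has to retain a matching snapshot chain.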

Let us assume the infrastructure is laid out correctly: you are then able to use snapper functionality on the target as well, enabling the admin to implement config rules for the backup. And since we have read-only snapshots on the target, we can use dsnap-sync again to implement disk2tape. The source is the latest snapshot of the target config, and its data is saved to an LTO/LTFS tape.

bcachefs implementation

dsnap-sync is just a proof-of-concept project. Besides the fact that I do use it (lacking a better solution), it has its shortcomings. The key aspect (at least from my point of view) is the need to implement that kind of snapshot management inside the filesystem tools themselves. That would render any kind of wrapper obsolete and make error handling quite a bit more reliable.

Where we are now

bcachefs is a rolling stone. But wouldn't it make sense to take the lessons learned from btrfs and realize a better solution that overcomes these shortcomings? bcachefs doesn't have a conceptual design problem with large numbers of subvolume/snapshot combinations. And a chain of read-only snapshots will let you sleep well, if you know there is a working backup strategy.

It would be a strong argument to switch from btrfs to bcachefs if bcachefs implemented a backup command as a first-class citizen of bcachefs-tools. As far as I understand the branch, all the needed groundwork is in place. No doubt, stability is a must for production use. But hey, you have to start somewhere.

Where we should go

Implement an equivalent of btrfs-send and btrfs-receive that is used by a backup command offering attributed snapshot listings.

Conclusion

This feature request is written as a draft. It should start a discussion aimed at realizing a practical solution to the issue.

onny commented 11 months ago

I use btrbk on my home server to back up the complete rootfs of the host incrementally to a remote target. I opened a ticket for bcachefs there, but it seems to be difficult to implement: https://github.com/digint/btrbk/issues/537

misuzu commented 3 weeks ago

There's also zrepl (for zfs)