Since snapshots have landed in bcachefs, this issue arises quite naturally from the point of view of administering a bcachefs-driven production system.
Imagine the common scenario:
the production system is serving data trunks
the trunks are organized as read-write subvolumes on the given drive pool
(e.g. data, home, var, usr).
A backup strategy needs to take care that all relevant data is saved on a regular, time-driven schedule.
And of course, as careful administrators we like to ensure that we have a media break in our backup plan (read: disk-to-disk-to-tape).
bcachefs right now offers all the needed building blocks; what is missing conceptually is the backup strategy.
A reference
Coming from a btrfs environment, I personally use dsnap-sync. This is a simple POSIX-shell tool that uses btrfs-send/btrfs-receive to transfer snapshots and snapper to handle the needed snapshots. snapper is a C++ service/library that organizes snapshots into attributed configs.
snapper
The key feature of snapper is its ability to define config-specific timer and cleanup rules. These rules manage the backup rules applicable to a config (e.g. create hourly snapshots during the day, keep a snapshot per day (max 6), keep a snapshot per week (max 3), keep a snapshot per month (max 11), keep a snapshot per year (max 2)).
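As a sketch, such a retention policy maps onto snapper's timeline settings in its per-config file (the config name "data" and the concrete limits just mirror the example above, they are not a recommendation):

```
# /etc/snapper/configs/data (excerpt)
TIMELINE_CREATE="yes"
TIMELINE_LIMIT_HOURLY="24"
TIMELINE_LIMIT_DAILY="6"
TIMELINE_LIMIT_WEEKLY="3"
TIMELINE_LIMIT_MONTHLY="11"
TIMELINE_LIMIT_YEARLY="2"
```

snapper's timeline timer then creates the hourly snapshots, and its cleanup job thins them out according to these limits.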
snapper itself uses btrfs ro-snapshots as its target objects. Incremental data differences are transferred using btrfs-send and btrfs-receive. The takeaway: snapper implements a subsystem that offers continuous snapshots.
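For reference, the underlying transfer looks roughly like this (paths are invented for illustration; both sides must be btrfs, and the parent snapshot must exist on both):

```
# initial full transfer of the first ro-snapshot
btrfs send /data/.snapshots/1/snapshot | btrfs receive /backup/data/.snapshots/1/

# incremental: send only the delta against a parent present on both sides
btrfs send -p /data/.snapshots/1/snapshot /data/.snapshots/2/snapshot \
    | btrfs receive /backup/data/.snapshots/2/
```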
dsnap-sync
The development of snapper has never evolved to implement a backup command that enables transfers to remote targets. This is a pity. A feature request has been hanging around for almost two years. Others have requested that kind of functionality, too (feature-request).
snapper saves the snapshots of a subvolume inside a hardcoded subdirectory (.snapshots). Metadata is written to an XML file next to each snapshot.
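Since that metadata is plain XML, a shell tool can pull the snapshot number out of it with awk. A minimal sketch (the info.xml layout below is a simplified stand-in, not the authoritative format):

```shell
#!/bin/sh
# Extract the <num> field from snapper-style snapshot metadata on stdin.
snap_num() {
    awk -F'[<>]' '/<num>/ { print $3; exit }'
}

# Simplified stand-in for a snapper info.xml file:
info='<?xml version="1.0"?>
<snapshot>
  <type>single</type>
  <num>42</num>
  <date>2023-01-01 12:00:00</date>
</snapshot>'

printf '%s\n' "$info" | snap_num
# -> 42
```

dsnap-sync does essentially this kind of scraping to decide which snapshots exist on each side.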
Sadly, dsnap-sync is needed to fill the gap of transferring snapper-managed snapshots to a remote target. It codes the procedures needed to implement disk2disk2tape.
A typical dsnap-sync disk2disk run will scan for the last snapshot on the source and transfer the latest delta to the target (using btrfs-send and btrfs-receive, combined with ssh). It finds the relevant snapshots using the snapper-managed metadata (awk). If the target, which obviously needs to offer a valid btrfs filesystem/subvolume, is missing the snapper-like structure, dsnap-sync will create it on the fly. Then the data transfer itself is issued.
The next disk2disk job will only transfer the delta between the latest source snapshot and the latest target snapshot. The goal: minimize data volume, since we can calculate the differences between the snapshots on source and on target.
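The parent-selection logic behind that delta fits in a few lines of POSIX shell. This is an assumed reconstruction of the idea, not dsnap-sync's actual code:

```shell
#!/bin/sh
# Given the snapshot numbers present on source and target, print the
# newest common snapshot (the send parent) and the newest source
# snapshot (the one to send).
pick_delta() {
    src_list=$1   # e.g. "40 41 42"
    tgt_list=$2   # e.g. "40 41"
    parent='' latest=''
    for s in $src_list; do
        if [ -z "$latest" ] || [ "$s" -gt "$latest" ]; then
            latest=$s
        fi
        for t in $tgt_list; do
            if [ "$s" -eq "$t" ]; then
                if [ -z "$parent" ] || [ "$s" -gt "$parent" ]; then
                    parent=$s
                fi
            fi
        done
    done
    echo "parent=$parent latest=$latest"
}

pick_delta "40 41 42" "40 41"
# -> parent=41 latest=42
```

With parent and latest in hand, the transfer itself is a `btrfs send -p <parent> <latest> | ssh target btrfs receive <dir>` pipeline; an empty parent means a full send is needed.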
Let us assume the infrastructure is laid out correctly and you are able to use snapper functionality on the target as well, enabling the admin to implement config rules for the backup.
And since we have ro-snapshots on the target, we can use dsnap-sync again to implement disk2tape: the source is the latest snapshot of the target config, and its data is saved to an LTO/LTFS tape.
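Since LTFS presents the tape as a mountable filesystem, the disk2tape leg can be as plain as archiving the newest ro-snapshot onto the tape mount (paths and the snapshot number are invented for illustration):

```
# /mnt/ltfs is an already-mounted LTFS tape; 42 is the newest target snapshot
tar -cf /mnt/ltfs/data-42.tar -C /backup/data/.snapshots/42/snapshot .
```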
bcachefs implementation
dsnap-sync is just a proof-of-concept project. Besides the fact that I do use it (lacking a better solution), it has its shortcomings. The key aspect (at least from my point of view) is the need to implement that kind of snapshot management inside the filesystem tools themselves. That would render any kind of wrapper obsolete and make error handling quite a bit more reliable.
Where we are now
bcachefs is a rolling stone. But wouldn't it make sense to take the lessons learned from btrfs and realize a better solution, overcoming these shortcomings? bcachefs doesn't have a conceptual design problem with large numbers of subvolume/snapshot combinations. And a chain of ro-snapshots will let you sleep well, knowing there is a working backup strategy.
It would be a strong argument for switching from btrfs to bcachefs if bcachefs implemented a backup command as a first-class citizen of bcachefs-tools. As far as I understood the branch, all the needed groundwork is in place. No doubt, stability is a must for production use. But hey, you have to start somewhere.
Where we should go
Implement btrfs-send and btrfs-receive equivalents that are used by a backup command offering attributed snapshot listings.
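To make the proposal concrete, here is a purely hypothetical command surface. None of these subcommands or flags exist in bcachefs-tools today; the names are invented for this sketch:

```
# hypothetical: stream a snapshot, incrementally against a shared parent
bcachefs send -p /data/.snapshots/41 /data/.snapshots/42 \
    | ssh backuphost bcachefs receive /pool/data/.snapshots

# hypothetical: attributed snapshot listing a backup command could consume
bcachefs subvolume list --format=json /data
```

The send/receive pair would cover the disk2disk and disk2tape legs; the attributed listing would replace the metadata scraping that wrappers like dsnap-sync do today.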
Conclusion
This feature request is written as a draft. It should start a discussion targeting the realization of a practical solution to the issue.
I use btrbk on my home server to remote-backup the complete rootfs incrementally. I opened a ticket for bcachefs there, but it seems to be difficult to implement: https://github.com/digint/btrbk/issues/537