I'm using restic or borg to store my backup on Synology instead of sending snapshots. It compresses and deduplicates better, and is not affected by copied files which would not deduplicate with btrbk. IOW, bees is not needed. For the backup repository, I'm using btrfs snapshot retention so a software bug won't make the repository unusable.
Hi, I know your preference for borg or restic. Before btrfs I was using dar, via highly complicated scripts and catalogs. Even if borg does a lot better than dar, like sending to external SSH repos, it will always only deduplicate its own backup chain. With the high compression and differential catalogs (similar to CoW) it will never be as transparent as btrfs snapshots. AFAIK you always need to extract/restore the parts you need. That's why borg, restic or dar are real backups, while btrfs with snapshots actually is not, and is never as safe as a real backup. For this reason I have many different btrfs mirrors which I hope won't break all together.
With btrfs snapshots, everything stays as transparent as a filesystem can be. For instance I often need to run dirdiff on whole config trees across several snapshot eras, and I often need to grep over the histories of all machines:
grep foobar /mnt/usb/mobiledata/snapshots/*/home/*/mb/.history-*
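And for the config trees, it is just a directory diff between two snapshot eras (paths and dates here are only illustrative; plain diff -ru works as well as dirdiff):
# compare /etc as it looked in two different snapshot generations
diff -ru /mnt/usb/mobiledata/snapshots/2024-01-01/etc \
         /mnt/usb/mobiledata/snapshots/2024-06-01/etc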
I pay for this comfort with high compression load (compress-force=zstd:15 on the archive btrfs, compress-force=zstd:3 on the working btrfs) and with bees load.
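The mount options are just the usual btrfs compression knobs, roughly like this in fstab (device identifiers and mount points are made up):
# archive btrfs: maximum zstd level; working btrfs: light zstd
UUID=aaaaaaaa-archive  /mnt/archive  btrfs  compress-force=zstd:15,noatime  0 0
UUID=bbbbbbbb-work     /mnt/work     btrfs  compress-force=zstd:3,noatime   0 0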
I could use borg to back up my machine to the planned NAS. But I'm archiving backups of many machines to a central archive (currently a 4TB USB 3.2 NVMe; next will be the NAS with a single ~16TB HDD for low power consumption, mirrored to another 16TB at a different location). All the machines have large identical parts which currently can only be deduplicated with bees.
Next, you say you snapshot the borg repo on the NAS with retention. I guess snapshotting the compressed binary borg repo is not optimal with CoW, the same issue as with virtual machine disks. OK, if borg adds diff files for every change, it could work with btrfs snapshots. But for me it's still uncomfortable, as I can't access the archived trees without extracting them from the backups.
I'm completely new to Synology; I don't have a device yet. I'm not sure if they have a build toolchain to build bees for it. To run bees, it would need at least a big memory upgrade over the default setup. For 16TB the hash file would also take some GBs.
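As a rough sketch of what I'd expect to configure (the UUID and the size are made up; the roughly 1 GB of hash table per TB of unique data is only my assumption based on the hash table sizing guidance in the bees docs):
# /etc/bees/<uuid>.conf sketch for the beesd wrapper
UUID=01234567-89ab-cdef-0123-456789abcdef
DB_SIZE=$((16 * 1024 * 1024 * 1024))   # ~16 GiB hash file, guessing mostly unique data on a 16TB filesystem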
Btw., is there any chance bees could run on my desktop machine while operating on the remote btrfs block device via SSH or something? That would probably have bad performance.
Borg and restic use one file per checksum, where each checksum covers a dynamically sized block (content-based slicing with a rolling hash). It stores around 287 TB worth of historic backup snapshots deduplicated, compressed and encrypted to 6.6 TB. That means btrfs itself cannot do anything here: it can neither dedupe nor compress. Snapshotting this repository is just a protection against accidental or unintended breakage of the repository.
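As a sketch of what that looks like from the borg side (repo path and compression settings are only examples), borg itself reports the deduplicated size, so btrfs compression or dedupe on top would find nothing left to do:
borg init --encryption=repokey /volume1/backup/borg-repo
borg create --compression zstd,9 /volume1/backup/borg-repo::'{hostname}-{now}' /home
borg info /volume1/backup/borg-repo    # shows original vs. compressed vs. deduplicated sizes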
For my whole system, I'm using snapper for fast access to older versions of files. Bees takes care of deduplicating files across these snapshots which may have been copied or re-appeared after some time.
So borg is actually really a cold data bunker for me.
For Synology, I think they may have a built-in deduplicator by now - at least there's a job called "Space Reclamation Schedule". There's no description of what it actually does except "reclaiming storage space", so I'm not sure if it just deletes old snapshots or if it actually deduplicates snapshots on a schedule. From the description, it reads like there's more to it than just deleting snapshots.
As you say, using borg is outside the scope of btrfs: it doesn't make use of btrfs CoW, snapshots, compression or deduplication. Actually, borg could just as well use its own fs or database. For me the approach is different: I like to make full use of btrfs features. I like to deduplicate between different snapshot lines coming from different machines and also against the NAS shared area, as some parts are duplicated between my machines' $HOME snapshots and the NAS share. Actually, everything that bees offers.
Snapper - are you on OpenSuse? When I compared snapper to btrbk I went for btrbk as it was more powerful, even though it's only a one-man project. btrbk can send snapshots, which snapper wasn't able to do. I have some OpenSuse machines using snapper by default, but I additionally added btrbk to them for sending these snapshots.
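For reference, the btrbk side of that looks roughly like this (hostnames, paths and retention values are placeholders; check the btrbk docs for the exact syntax):
# /etc/btrbk/btrbk.conf sketch
snapshot_preserve_min   2d
snapshot_preserve       14d
target_preserve         20d 10w *m
ssh_identity            /etc/btrbk/ssh/id_ed25519

volume /mnt/btr_pool
  snapshot_dir btrbk_snapshots
  subvolume home
    target send-receive ssh://nas.example.com/mnt/backup/home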
On OpenSuse I also compiled bees as a standalone binary. I'm still interested whether someone has got bees compiled on Synology, and whether Synology kernels are supported by bees at all.
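Building it was basically just (package names for the build dependencies differ per distro, so treat this as a sketch):
git clone https://github.com/Zygo/bees.git
cd bees
make
sudo make install    # installs the bees binary plus the beesd wrapper script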
I think Synology uses their own custom btrfs patches and pretty old kernels - so I suppose it's better not to run bees on it: it would exercise very, very untested code paths. Other than that, those machines are extremely stable. They have their own snapshot send/receive client called "snapshot replication" which allows sending btrfs snapshot differences over to a remote site. We are using that to mirror iSCSI LUNs every 15 minutes to a remote location.
Snapper is running on Gentoo for me. I think snapper and btrbk do not contradict but rather complement each other: snapper does the snapshots, btrbk can send them. Similar to how Synology implements snapshots separately from replication (both using btrfs features). Since I am not sending snapshots, snapper is doing exactly what I need: reverting an "oops, I didn't want to do that" while borg does the heavy lifting of keeping daily backups for almost 4 years now. That kind of retention probably becomes hard for btrfs, and one single bug can destroy the whole retention. In borg, I could simply add a backup from another machine and there's a chance it auto-heals itself. So in that sense: btrfs snapshots aren't a real backup for me - at least if the source is btrfs, too: bugs or hardware faults that can lead to FS crashes later are likely to be mirrored over by sending snapshots. I've seen that kind of corruption and don't need it again. But snapper is a really cool local service, and snapshots are great for replication. But it's exactly that: replication, not a backup.
I haven't done any testing on Synology and I don't know of anyone who has done testing with bees and reported the results.
While bees can run on some very old kernels, I wouldn't attempt to use a kernel earlier than 5.4 with btrfs due to the impact on system availability and data corruption issues (the kernel bug tracking table will give you an idea what to expect). Earlier kernels got us through the years from 2014 to 2019, but there were some losses in that era that I would not care to repeat.
Some fixes have been backported to the older LTS kernels 4.4 and 4.9, which are of the era that Synology uses, but I wouldn't trust that the fixes have been merged into the kernels Synology is shipping. It's essential to verify that the fixes are present in the kernel sources.
In theory there could be bugs introduced from the Synology side as well, but I have no data either way on that. I don't know of any reason Synology would need to make changes to btrfs that would affect TREE_SEARCH or LOGICAL_INO, so I wouldn't expect any new bugs to be introduced this way, but that simply means any bugs that do exist are unexpected. LOGICAL_INO is fragile at the best of times, and particularly sensitive to changes that appear innocuous and isolated. Some of the listed kernel bugs can take months or years to show up under "normal" workloads, but show up several times a day when running LOGICAL_INO.
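A cheap way to poke at LOGICAL_INO on a candidate kernel before trusting it with bees is btrfs inspect-internal logical-resolve, which uses that ioctl (the logical address below is made up; a real one can be taken from dump-tree output or from scrub/dmesg messages). It won't reproduce the load bees generates, but it confirms the ioctl works at all:
sudo btrfs inspect-internal logical-resolve 123456789012 /volume1/data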
I'd recommend setting up a pilot Synology instance and running a stress workload on it for a few months (including disaster recovery and hardware failure scenarios) before considering it for production use with bees. That kind of canary setup is the source of most of the data in bees's kernel bugs list.
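The stress workload doesn't need to be elaborate; even a crude loop like the following (paths invented, /volume1/work assumed to be a subvolume), run alongside bees and periodic scrubs, generates the kind of snapshot and extent churn that tends to expose these bugs:
while true; do
  # create non-reflinked duplicate data for bees to find
  cp -a --reflink=never /volume1/testdata "/volume1/work/copy.$(date +%s)"
  # snapshot the working subvolume, then expire old copies
  btrfs subvolume snapshot -r /volume1/work "/volume1/snapshots/work.$(date +%s)"
  find /volume1/work -maxdepth 1 -name 'copy.*' -mmin +60 -exec rm -rf {} +
  sleep 300
done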
An up-to-date 4.19 kernel or 4.14 kernel might work with some loss of performance or availability, i.e. it will be slow and crash more often, but it won't corrupt the data.
That all sounds to me like abusing a Synology... and only the stronger models actually support btrfs, which gets expensive. I did not get any response to this specific question in the Synology forums. All this is now redirecting me toward a custom-made NAS on a power-saving platform with some familiar Linux distribution, either based on an Intel® Core™ i3-N300 or some older AMD G-Series GX-415GA... which might still be sufficient for btrfs and bees. But this is another story called "Minimum hardware requirements for bees" and might get its own thread...
So in the context of the original question:
Yeah, the Synology NAS is designed around the software features it implements. Btrfs should be supported by most current models, but only the stronger models support virtual machines etc. Those models might support bees from a performance/resource perspective. But I still think that the kernel is not tested for this, and thus bees should be avoided on these machines.
The minimum hardware requirement for bees is pretty small. bees can run one thread on a 512M Raspberry Pi and can save space on the SD card after a dist-upgrade or when run overnight. Even a 32-bit CPU can get some deduplication on a large filesystem.
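Throttling it down that far is just a matter of command-line options; --thread-count and --loadavg-target are real bees options, while the values and the UUID here are only examples:
beesd --thread-count 1 --loadavg-target 2.0 01234567-89ab-cdef-0123-456789abcdef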
For that matter, the "minimum hardware requirements" for a file server are pretty small too. You could make a tiny, slow, but working file server out of an Arduino with TCP, WiFi, and storage modules, but you won't be able to use it for video editing. You could maybe play mp3 files from it.
To figure out the minimum hardware requirement, we need to know the workload requirements, and that's different for every user.
A given CPU+RAM+storage stack will support scanning N GiB per day and can dedupe M GiB per day. bees mostly does those two things, so if you double the CPU and IO speeds, you double the bees capacity; however, if you let bees run twice as long on the slower hardware, it will produce the same result as the faster hardware (this is different from some other deduplicating systems, where if you run out of capacity, you get no dedupe at all). If you have a fileserver that's mostly idle, you'll need less CPU and RAM for bees than a fileserver that's ingesting a TiB of new data every day.
If you have a TiB of new data every day, and you need to have the entire day's data ingest completely deduplicated in time for the next morning, you might have trouble finding hardware fast enough (or bees doesn't use it efficiently enough). If you just need to slow down the rate of storage growth by 10%, then you can run on 10x slower hardware (i.e. after deduping 100 GiB per day, your total data size grows by only 900 GiB per day), or you can run bees for only 2.4 hours per day on the fast hardware. Maybe you have fast hardware with a 3-hour daily maintenance window, so you accept 10% dedupe because the server was idle during the window anyway, and 10% is better than 0%.
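To make that arithmetic explicit (the throughput figure is invented):
# hardware that can dedupe ~1000 GiB/day flat out, with a target of 100 GiB/day:
echo 'scale=1; 100 / 1000 * 24' | bc    # => 2.4 hours of bees runtime per day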
Most users don't want bees to occupy 100% of their file server's capacity, so there has to be leftover capacity for the other things the fileserver has to do. e.g. you might want no more than 25% of the server dedicated to dedupe--or no more than 5%. That will affect the calculation, resulting in either lower dedupe target rates, or a bigger server hardware requirement.
So the question is really how much hardware do you need to run bees against some reference workloads? Ideally we'd have a standard performance benchmark that users could run and post their results with a hardware description, and we'd collate the results in a table so users can pick the row of the table that's closest to their situation.
@kakra already mentioned the idea of Bees on Synology NAS: https://github.com/Zygo/bees/issues/153#issuecomment-703674688
As of today, is there anything new about that? I don't own a Synology yet and don't know about their kernel versions and patches. Would it be possible to run bees on such a platform?
I'm thinking about getting some big btrfs-based NAS. I'm still comparing running a self-maintained Linux machine against some commercial solution, and among the commercial solutions only Synology actually goes the Linux-based btrfs way.
One requirement will be that I would also like to send btrfs snapshots to that big btrfs, which seems to be possible on Synology via key-based SSH login. Then I'd also like to have bees and maybe btrbk.
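In the simplest case that would be plain incremental send/receive over the key-based SSH login, which btrbk would then automate (hostname, user, key and paths are made up; the remote user needs the rights to run btrfs receive):
btrfs send -p /snapshots/home.20240101 /snapshots/home.20240102 \
  | ssh -i ~/.ssh/id_nas backup@nas.example.com btrfs receive /volume1/backup/home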