Open Thesola10 opened 2 weeks ago
Sure, if it ever comes up. As far as I know, bees has never been the root cause of any data loss event. Do you have one to report?
btrfs has occasionally had kernel releases with data-losing bugs. Running any software which modifies the filesystem on such a kernel can cause data loss. Some of these bugs are documented on the kernel bugs table but they can affect many applications, not just bees. For example, one bug on the bugs list carries a small but non-zero risk of total filesystem data loss for every write operation involving btrfs--it's the one with the big data corruption warnings at the top of the page.
It would be difficult to maintain the accuracy and integrity of the label on a bees github issue. Even when there is a kernel bug that is triggered by one of the fixed set of operations that bees does (tree search, extent backref lookup, inode name lookup, file open, file stat, data read, data write, data write with compression, and deduplicate), those operations are fundamental to what bees does. The data loss risk assessment would apply only to the combination of bees with specific kernel versions, and no change in bees would add or remove the data loss risk from the combination of bees with a bad kernel version. Only changes to the kernel, not bees, can fix a kernel bug.
I don't think it's feasible or reasonable for the bees project to take on the responsibility of tracking all data-losing kernel bugs in btrfs--and especially not if the scope is expanded to cover related subsystems like sata disks or lvm which btrfs may depend on for data integrity. I make a nominal effort to test all mainline kernel releases with current bees versions (mostly to protect data directly in my care), and I update the published pages when I find new issues. I'm willing to republish bugs others have identified if I can confirm the issue. I'm never going to be a replacement for proper and timely kernel QA.
bees has some low-level knowledge of btrfs filesystem structure, but it only ever reads these structures. It is up to the kernel to perform all modifications recommended by bees--and the kernel can (and often does) reject these recommendations if they would alter data. There are a number of risk mitigation design features as well:
- `O_RDONLY`, so that a bug causing a write to a wrong FD in the bees process will fail instead of overwriting user data
- `O_NOFOLLOW` and `O_TMPFILE`, to further reduce symlink and TOCTTOU attack surface (bees is a bit behind here: since kernel 5.6 there is `openat2`, which has stronger cross-device and symlink controls)
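To illustrate the effect of the read-only and no-follow open flags, here is a minimal Python sketch. It is not bees code, and the helper names (`open_readonly_nofollow`, `try_write`, `demo`) are made up for this example. It assumes a Linux system: it opens a scratch file with `O_RDONLY | O_NOFOLLOW`, attempts a stray write on that descriptor, and then tries to open a symlink with the same flags.

```python
import errno
import os
import tempfile


def open_readonly_nofollow(path):
    """Open path read-only, refusing to follow a symlink in the final component.

    Hypothetical helper for illustration only; not taken from bees.
    """
    return os.open(path, os.O_RDONLY | os.O_NOFOLLOW)


def try_write(fd, data=b"oops"):
    """Attempt a write on fd; return the errno name on failure, None on success."""
    try:
        os.write(fd, data)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return None


def demo():
    """Run both checks in a scratch directory and return what happened."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "data.bin")
        with open(target, "wb") as f:
            f.write(b"user data")

        # A write aimed at a read-only descriptor is rejected by the kernel.
        fd = open_readonly_nofollow(target)
        try:
            results["stray_write"] = try_write(fd)
        finally:
            os.close(fd)
        with open(target, "rb") as f:
            results["contents_intact"] = f.read() == b"user data"

        # Opening a symlink with O_NOFOLLOW fails instead of following it.
        link = os.path.join(tmp, "link")
        os.symlink(target, link)
        try:
            os.close(open_readonly_nofollow(link))
            results["symlink_open"] = None
        except OSError as exc:
            results["symlink_open"] = errno.errorcode.get(exc.errno, str(exc.errno))
    return results


if __name__ == "__main__":
    print(demo())
```

On Linux the stray write should be refused with `EBADF` and the symlink open with `ELOOP`, leaving the file contents untouched. Other platforms may report different error codes.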
I am currently managing a NixOS NAS at home and am looking to gauge the risk factor for `bees`. So far I haven't had data loss using it on my laptop, but while I do have backups, I can't afford to risk server data for space savings.

In the interest of trust and accountability, would it be possible to create a dedicated GitHub bug tracking label for data loss incidents related to `bees`? I believe this is warranted for a low-level filesystem tool.

Even if it ends up unused -- in fact, I sincerely wish for this label to go unused :p