koverstreet / bcachefs

Feature request - being able to produce an instant `diff` between 2 snapshots #742

Open mbideau opened 2 months ago

mbideau commented 2 months ago

Hi,

I have been following your work almost since the beginning, though not much lately for lack of time, and I want to thank you for all the good work. I have never used bcachefs, but I have used BTRFS heavily for a very long time and I develop tools around it. One of them, btrfs-diff-go, is a utility that produces diff-like reports. With BTRFS I have an "unsolvable" issue that causes false changes to be reported, rendering the utility almost useless. These false changes occur when an extent has changed but the user content associated with it has not.

Example: the content of the file /path/to/myfile hasn't changed between two BTRFS subvolumes, but an extent "holding" the file has changed (e.g. its size). The send/receive commands, which I would use to "sync" those two subvolumes, then have to report the extent change for that path. Because I (hackishly) use those two programs and their protocol to produce the diff, without actually sending or receiving anything, my diff-like report contains a line saying that the file has changed when really its content has not (only the "shape" of the extent "holding" it).
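One possible workaround for the false positives described above is a second pass that re-checks only the flagged paths by content. The following is a minimal sketch, not how btrfs-diff-go works: the function name `confirm_changes` and the `candidates` list (relative paths that an extent-level comparison reported as changed) are assumptions made for illustration.

```python
import filecmp
from pathlib import Path


def confirm_changes(snap_old, snap_new, candidates):
    """Split flagged paths into real content changes and false positives.

    `candidates` is a hypothetical list of relative paths that an
    extent-level comparison reported as changed.
    """
    changed, unchanged = [], []
    for rel in candidates:
        old = Path(snap_old) / rel
        new = Path(snap_new) / rel
        # shallow=False forces a byte-by-byte comparison instead of
        # trusting size and mtime alone.
        if old.is_file() and new.is_file() and filecmp.cmp(old, new, shallow=False):
            unchanged.append(rel)  # extent changed, content did not
        else:
            changed.append(rel)  # content differs, or file missing on one side
    return changed, unchanged
```

This only reads the files that were flagged, so its cost depends on the number of reported changes rather than on the total size of the snapshots. It still has to read every flagged file in full on both sides.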

I have an issue describing all the logic required to understand how BTRFS handles this, but I have more or less given up on it, because I don't have enough time to digest the thousands of lines of comments in the code.

I might be wrong about the BTRFS details, though; I am just a hobbyist, not a filesystem expert.

I can't tell whether bcachefs will have the same issue, because I have not delved into it enough and I don't know how the concept of an extent is implemented, but I suspect it might. Hence this "feature request" issue, to warn about it.

I can see on the roadmap that send/receive is planned, and a diff utility alongside it would be very interesting to me. I use one to prune my old backups in a smarter way, by analyzing the content that has changed and applying some rules to it.

Thanks again. Best regards, PS: sorry if my English is bad, I am a Frenchy doing my best, late at night :sweat_smile:

mbideau commented 2 months ago

I am sorry, I thought I was on the bcachefs-tools issue tracker, where this should have been posted, right? Do you want me to re-post it there?

presto8 commented 1 month ago

`diff -qr /path/to/snap1 /path/to/snap2` may work as a substitute until this feature is added?
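For anyone who wants the same kind of brief recursive comparison from a script, here is a rough Python sketch built on the standard `filecmp` module. The function name `brief_recursive_diff` and the result labels are made up for illustration, and the output is not meant to match `diff -qr` exactly.

```python
import filecmp
import os


def brief_recursive_diff(dir_a, dir_b):
    """Return sorted (kind, relative_path) pairs for every difference found."""
    results = []
    pending = [""]  # relative directories still to visit
    while pending:
        rel = pending.pop()
        a = os.path.join(dir_a, rel)
        b = os.path.join(dir_b, rel)
        # ignore=[] disables dircmp's default ignore list (RCS, CVS, ...).
        listing = filecmp.dircmp(a, b, ignore=[])
        for name in listing.left_only:
            results.append(("only_in_a", os.path.join(rel, name)))
        for name in listing.right_only:
            results.append(("only_in_b", os.path.join(rel, name)))
        # Byte-by-byte comparison of every file present on both sides.
        _, mismatch, errors = filecmp.cmpfiles(
            a, b, listing.common_files, shallow=False
        )
        for name in mismatch + errors:
            results.append(("differ", os.path.join(rel, name)))
        pending.extend(os.path.join(rel, name) for name in listing.common_dirs)
    return sorted(results)
```

Like `diff -qr`, this walks both trees in full and reads every common file, so the running time grows with the total amount of data, not with the amount that changed.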

mbideau commented 3 weeks ago

@presto8 Thanks for your feedback, but this is not practical in my case, because comparing my data would take hours. I want a "near" instant diff, not an hours-long diff.