bit-team / backintime

Back In Time - An easy-to-use backup tool for GNU Linux using rsync in the back
https://backintime.readthedocs.io
GNU General Public License v2.0

New permissions handling in 1.2.0 causes re-backup of files that haven't changed #988

Open kosal75 opened 5 years ago

kosal75 commented 5 years ago

I have just updated from 1.1.24 to 1.2.0 (common, gnome, qt4), and making a snapshot takes hours. When downgrading to 1.1.24, all is normal (5-6 minutes, as always).

Ubuntu 18.10 with kernel 4.18.0-18.

AlexSchr commented 3 years ago

As others have mentioned,

  1. the problem comes from version 1.2 handling permissions differently from older versions.
  2. a workaround, in order to make the new backup in 1.2 incremental, is to change the permissions in the last <1.2 backup.

Some scripts have been suggested to accomplish this.

A similar solution is to just use the commands getfacl and setfacl; see e.g. https://serverfault.com/a/117149. Detailed instructions:

  1. Make sure that the last backup is a <1.2 backup. If necessary, remove newer ones: Either from Back In Time or by removing directories and correcting the last_backup link.
  2. For each directory in the Include setting, run:

       getfacl -R <include_path> > <temp file>
       cd <backup location>
       setfacl --restore=<temp file>

Remarks:

  1. Be careful to choose the folder <backup location> so that it matches the relative paths in the <temp file>.
  2. For files which did not exist at the time of the backup or which are in excluded directories, setfacl will print an error but still continue without a problem.
  3. Run the commands as root if necessary.

ghost commented 3 years ago

@AlexSchr The problem with the 1.2 permission handling is not so much that the older backups are not recognized. This could be easily corrected on the last backup (which has to be a pre-1.2 backup), as you suggested, or even more simply with

cd "$backupdir"
find . -exec chmod --reference="$dir/{}" "{}" \;

where $backupdir is the last backup of $dir.
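
For anyone who prefers to see the logic spelled out, here is a rough Python sketch of the same idea. It is only an illustration, not part of BiT: the names copy_modes and _sync_mode are made up, source_root plays the role of $dir and backup_root the role of $backupdir. Unlike the find command it skips symlinks and silently ignores paths that no longer exist in the source.

```python
import os
import stat


def _sync_mode(source_path, backup_path):
    """Apply source_path's permission bits to backup_path; return True if changed."""
    # Symlinks are skipped (chmod would follow them), and so are paths that
    # are missing from the source, e.g. files deleted since the backup.
    if os.path.islink(source_path) or os.path.islink(backup_path):
        return False
    if not os.path.exists(source_path):
        return False
    wanted = stat.S_IMODE(os.stat(source_path).st_mode)
    if stat.S_IMODE(os.stat(backup_path).st_mode) == wanted:
        return False
    os.chmod(backup_path, wanted)
    return True


def copy_modes(source_root, backup_root):
    """Copy permission bits from source_root onto matching paths in backup_root.

    Returns the number of backup entries whose mode was changed.
    """
    changed = 0
    # Bottom-up walk: every directory has been listed before its mode is touched.
    for dirpath, dirnames, filenames in os.walk(backup_root, topdown=False):
        for name in dirnames + filenames:
            backup_path = os.path.join(dirpath, name)
            rel = os.path.relpath(backup_path, backup_root)
            changed += _sync_mode(os.path.join(source_root, rel), backup_path)
    # Finally the top-level directory itself.
    changed += _sync_mode(source_root, backup_root)
    return changed
```

Calling copy_modes("/home/user", "/path/to/last/snapshot/home/user") (both paths are placeholders) would return how many entries had their mode adjusted. Only the mode bits are handled here, not owner or group.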

The real problem is that the new permission handling is not what most people would want, since it leads to backing up files that have not changed if only their permissions have changed.

That is something I can't think of anyone wanting. If only the permissions of a file have changed, the file shouldn't be backed up again. The way backintime handled permissions before 1.2 was the way a backup tool should work. Therefore I think that PR #1086 should be accepted after further review. More than that, I think that the new way of handling permissions should be completely removed from BiT.

radiantone commented 3 years ago

Yeah, this tool is completely broke. It took 5 days to backup my home directory and lots of bizarre errors along the way. From SSD to SSD. 300G

gmk57 commented 2 years ago

I wonder if the transition is smoother for backups that were already using "Full rsync mode"?

Well, I used "Full rsync mode" in BiT 1.1 since the beginning and hit no issues after upgrade to 1.2.1. Backup of 2.3 TB data took less than a minute (~100 files were really changed).

But I agree that dropping support for the previously-default mode is too drastic.

alexanderdd commented 2 years ago

Hey,

My backup (~150 GB) now also takes hours; before the upgrade it took less than 20 minutes. (I upgraded Linux Mint 19.3 -> 20.3; backintime is now 1.2.1, not sure what it was before.) It's not just the first one that took so long; all of them take long.

Any ideas? Should I just downgrade as suggested in https://github.com/bit-team/backintime/issues/988#issuecomment-487402125

(and great to see @emtiu working on backintime!)

alexanderdd commented 2 years ago

Gnaa. I can't get it to work at normal speed. On Linux Mint 20.3 I added the backintime PPA. Now (only) these versions are available to me, and neither of them works at normal speed:

backintime-qt | 1.3.2~focal | http://ppa.launchpad.net/bit-team/stable/ubuntu focal/main amd64 Packages
backintime-qt |    1.2.1-2 | http://archive.ubuntu.com/ubuntu focal/universe amd64 Packages

I tried adding the rsync custom options "--no-perms --no-group --no-owner" in the GUI, but it did not help in either version.

And I cannot get version 1.1.24 nor version 1.1.12-2 installed since they are not available to me via the ppa. I know how to click on a .deb but I don't know how to make a package. I tried downloading the tar from https://launchpad.net/backintime/1.1/1.1.24 and there is a makedeb.sh inside, but it showed a dependency error.

Any ideas!?! @b3nmore @AlexSchr or others?

AlexSchr commented 2 years ago

The additional options "--no-perms --no-group --no-owner" should solve the problem addressed in this thread. You could check if you still get the problem with unchanged files repeatedly getting backed up. From a command line, you can use the ls command with the -i option to see the inode number of a file. Check on some unchanged random files in a new backup whether they get a new inode number or the same inode number as in an earlier backup.
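
If checking files one by one with ls -i gets tedious, the same comparison can be scripted. The following is a minimal sketch, not a BiT tool: the function names are invented, and the two arguments are assumed to be the directories of an older and a newer snapshot that contain the same relative paths. Note that it also lists files whose content really changed, so the result has to be compared against what you expect to have changed.

```python
import os


def shares_inode(path_a, path_b):
    """Return True if both paths are hard links to the same file."""
    a, b = os.lstat(path_a), os.lstat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def recopied_files(old_snapshot, new_snapshot):
    """List relative paths present in both snapshots that are NOT hard-linked."""
    result = []
    for dirpath, _dirnames, filenames in os.walk(new_snapshot):
        for name in filenames:
            new_path = os.path.join(dirpath, name)
            rel = os.path.relpath(new_path, new_snapshot)
            old_path = os.path.join(old_snapshot, rel)
            # Only regular files that exist in both snapshots are compared.
            if os.path.islink(new_path) or os.path.islink(old_path):
                continue
            if not os.path.isfile(old_path):
                continue
            if not shares_inode(old_path, new_path):
                result.append(rel)
    return sorted(result)
```

An empty list from recopied_files(old, new) means every file that exists in both snapshots shares its inode, i.e. nothing was copied again.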

TimGS commented 2 years ago

The additional options "--no-perms --no-group --no-owner" should solve the problem addressed in this thread.

This works for BiT 1.2.1 on Debian 11 Bullseye. Thanks.

For users like myself who have upgraded from Debian 10 to Debian 11, the earlier suggestion of downgrading to BiT 1.1.24 has problems due to dependencies on other packages that are not in the Bullseye repos.

Enjymon commented 1 year ago

A question: what will happen to the content of fileinfo.bz2 from the previous backups if one sets the "--no-perms --no-group --no-owner" options and then runs a new backup? Will the changes regarding those data just be ignored or will the corresponding data be deleted altogether and replaced by some "default" values? (Currently on BIT 1.3.1).

And I currently have 1.3.1 installed. According to the information in Synaptic I could either upgrade to 1.3.2 (I guess from the repository, correct?) or downgrade (return?) to 1.2.1-2 (I guess from previously downloaded package still present on my system, correct?). With all the hiccups that happened with the recent versions, does anyone have an overview of which version is currently the most reliable at the moment?

aryoda commented 1 year ago

@buhtz I think rsync and file permissions is more your area of expertise than mine ;-)

We should add this really good question + our answer to our FAQ section in the wiki (or even into the README.md).

buhtz commented 1 year ago

Currently there is no decision about the next steps regarding this issue. The question is good, but I don't have an answer to it yet. We need to dive deeper into the code and rsync behaviour. For me it is currently unclear what happens there. This issue is at the top of our ToDo list.

aryoda commented 1 year ago

We need to dive deeper into the code and rsync behaviour.

OK, I can at least look into the BiT code today and publish a first indication here...

aryoda commented 1 year ago

I am trying to break down the questions into smaller tasks here:

what will happen to the content of fileinfo.bz2 from the previous backups if one sets the "--no-perms --no-group --no-owner" options and then runs a new backup?

Will the changes regarding those data [file permissions] just be ignored or will the corresponding data be deleted altogether and replaced by some "default" values?

@Enjymon What do you mean by "those data"? Backup files (in existing snapshots)? Or the fileinfo.bz2 file (in existing snapshots)?

Related code

Class FileInfoDict with https://github.com/bit-team/backintime/blob/110d82aa1cf6717424c39b72366326b5977e3088/common/snapshots.py#L2024-L2034

The tuple elements of a dict item consist of:

fileInfo property to load and save fileinfo.bz2 https://github.com/bit-team/backintime/blob/110d82aa1cf6717424c39b72366326b5977e3088/common/snapshots.py#L2448-L2458 https://github.com/bit-team/backintime/blob/110d82aa1cf6717424c39b72366326b5977e3088/common/snapshots.py#L2485-L2486

backupPermissions(): takeSnapshot() unconditionally calls this method to back up all file and folder permissions into fileinfo.bz2:

https://github.com/bit-team/backintime/blob/110d82aa1cf6717424c39b72366326b5977e3088/common/snapshots.py#L1190 https://github.com/bit-team/backintime/blob/110d82aa1cf6717424c39b72366326b5977e3088/common/snapshots.py#L978-L984

RestorePermissions(): https://github.com/bit-team/backintime/blob/110d82aa1cf6717424c39b72366326b5977e3088/common/snapshots.py#L329-L336

restore() always restores the permissions if a FileInfoDict exists: https://github.com/bit-team/backintime/blob/110d82aa1cf6717424c39b72366326b5977e3088/common/snapshots.py#L518-L533

aryoda commented 1 year ago

does anyone have an overview of which version is currently the most reliable at the moment?

@Enjymon I dare say that the current dev version is the most reliable at the moment, since we are currently mainly fixing bugs to prepare a stabilization release. So if you can wait for the next release (scheduled for January 2023), that would be perfect; otherwise use the dev version by installing from source.

You can watch our list of changes.

Enjymon commented 1 year ago

@aryoda By "those data" I mean the permissions (of the files that have been backed up), i.e. the information contained in the fileinfo.bz2 file.

buhtz commented 1 year ago

Hello folks, I tried to reproduce that problem but was not able to.

I'm using latest BIT from main branch and Debian (12) testing. I created two files with rights 666 and 444. Between multiple snapshots I also added other files and modified them.

No problem occurred. The snapshots happen as expected. Also the inode numbers of the files are as expected.

I vaguely remember experiencing this behavior (re-backup of files that haven't changed) myself, too.

I would be glad if someone could provide a step-by-step setup to reproduce this. Then I would be able to dive deeper into it.

Germar commented 1 year ago

@buhtz this only happens if you have snapshots made with a BiT version prior to 1.2.0 without Full rsync mode activated and then create the very first new snapshot with BiT version 1.2.0 or later. That's why I didn't notice it when I introduced those changes.

emtiu commented 1 year ago

@buhtz this only happens if you have snapshots made with a BiT version prior to 1.2.0 without Full rsync mode activated and then create the very first new snapshot with BiT version 1.2.0 or later. That's why I didn't notice it when I introduced those changes.

I'm not sure that's the whole story. I do remember encountering this problem with new profiles created by 1.2.0. The workaround I employed was to manually introduce the --no-perms --no-group --no-owner rsync options.

We really need to reproduce this one. I don't currently have a good VM/testing setup at hand, unfortunately.

buhtz commented 1 year ago

I can confirm that this also happens with newer versions. I experienced it myself from time to time with 1.3.* versions. But I have no clue how to reproduce it.

Germar commented 1 year ago

Maybe after adding --no-perms --no-group --no-owner or removing it again.

buhtz commented 1 year ago

Maybe after adding --no-perms --no-group --no-owner or removing it again.

In my environment I never modified the rsync arguments and always used the defaults.

emtiu commented 1 year ago

We might have a better shot at isolating the problem if we focused on #994 first. I have a feeling that the root cause is the same, I've seen #994 in the wild myself, and it has more deterministic triggering conditions.

buhtz commented 1 year ago

Just a quick n dirty note: I realized that there is a --chmod=Du+wx in our rsync call (debug output from a SSH snapshot profile). Not sure but the D indicates it affects directories only. I don't know why it is there and when it was put in there. Should investigate further.
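
For reference, the rsync man page documents the D prefix in --chmod as "applies to directories only". The bit arithmetic of Du+wx can be sketched as follows; this only illustrates what the mode string means and says nothing about where or why BiT adds it. The function name is made up.

```python
import stat


def apply_du_plus_wx(mode):
    """Return the permission bits after a `Du+wx`-style adjustment.

    Directories gain the owner write and owner execute (search) bits;
    every other file type keeps its permission bits unchanged.
    """
    perms = stat.S_IMODE(mode)
    if stat.S_ISDIR(mode):
        perms |= stat.S_IWUSR | stat.S_IXUSR
    return perms
```

So a read-only directory (r-xr-xr-x) would come out as rwxr-xr-x, while a read-only regular file stays as it is.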

buhtz commented 1 year ago

I did read the whole issue thread and wonder if I got this right.

We have a lot of comments here and there are multiple issues and problems addressed. Other relevant tickets are linked. But the problem described here happens only when migrating from <1.2 to >=1.2 BIT. Am I right so far?

emtiu commented 1 year ago

We have a lot of comments here and there are multiple issues and problems addressed. Other relevant tickets are linked. But the problem described here happens only when migrating from <1.2 to >=1.2 BIT. Am I right so far?

To my knowledge, yes, that's correct.

I can't remember if new installations of >=1.2 are also affected, so we might treat that as a "maybe" for the moment.

danielaixer commented 1 year ago

I did read the whole issue thread and wonder if I got this right.

We have a lot of comments here and there are multiple issues and problems addressed. Other relevant tickets are linked. But the problem described here happens only when migrating from <1.2 to >=1.2 BIT. Am I right so far?

That might be the case. I've recently started using Ubuntu 20.04 with BIT 1.2.1-2 and I have this issue. Adding the PPA and upgrading to BIT 1.3.3-3 hasn't helped. I detected the issue because the same backup on the very same machine was taking wayyy longer, and then I noticed that the target drive was getting filled up way faster than it should.

My old setup (that I can still boot into) is Ubuntu 14.04 with BIT 1.0.34 where the same backup profile still works fine, even with more included paths. I don't think this matters, but one of the source paths is an NTFS drive. However, the target is EXT4, so there should not be permission issues specific to my case.

As @emtiu, I can also confirm that the workaround of adding --no-perms --no-group --no-owner is effective. I was going nuts, so thank you a lot.

aryoda commented 12 months ago

And what happens on a drive not supporting Linux file permissions?

BiT still stores permissions inside the fileinfo.bz2 like in previous versions.

But is --perms still used then (which may cause a full recopy of the source files if the target file system has a different "umask")?

I think we should test this to see the behavior (full recopy or not).

aryoda commented 11 months ago

Summary:

I vote to undo the new permission handling (--perms --group --owner --executability options) introduced in BiT v1.2.0 to get rid of major issues related to this change:

Reasons

The intentions of the new permission handling were

  1. to let rsync instead of BiT handle the backup of permissions
  2. perhaps also to get rid of BiT's own handling of permission backups in the fileinfo.bz2 file to allow restores even without BiT (just by using rsync or cp). This objective is not achievable since we need fileinfo.bz2 for target file systems that do not support the same permissions as the source folders (or have different users and groups).
  3. perhaps also to protect the access to files in the backup with the same permissions as in the source (also not achievable, see the prev. point)

Effectively the new permissions handling led to problems like

  1. a full backup of all files (= duplicated) in the first snapshot after updating to BiT v1.2.0 or later if older snapshots from before v1.2.0 were taken without the old "Full rsync mode" setting. This takes quite long and wastes disk space on the backup target.
  2. every change of file permissions leads to a new copy of the file in the next snapshot (not hardlinked!) even though the file itself is unmodified (permissions are metadata)
  3. being affected by a bug in rsync (open since 2017): Deleting hardlinks while deleting an old snapshot resets the file permissions of the same hardlinked file in other snapshots (causing a full copy of the file in the next snapshot due to "changed permissions"). The "smart remove" feature of BiT triggers this unwanted behavior whenever an old snapshot is deleted. See https://github.com/bit-team/backintime/issues/994#issuecomment-1709265390
  4. Edit: Mount options of the backup target may interfere with rsync's permissions transfer (e.g. for SMB and NTFS-3G it is possible to specify user=...,group=...,umask=...,dmask=... so the permissions are almost always different between source and target, causing a full copy in every snapshot instead of using hardlinks). See e.g. #1164.
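
Points 2 and 3 both rest on one file system fact: all hard links to a file share a single inode and therefore a single set of permission bits. Two snapshots cannot hard-link the same data and still show different permissions, which matches the rsync man page's statement that --link-dest only links files that are identical in all preserved attributes. A small sketch to demonstrate the shared metadata (the function and file names are made up, and the temporary directory must be on a file system that supports hard links):

```python
import os
import stat
import tempfile


def chmod_affects_all_links(new_mode=0o600):
    """Create a file plus a hard link, chmod one name, check the other name."""
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "snapshot1_file")
        second = os.path.join(tmp, "snapshot2_file")
        with open(first, "w") as fh:
            fh.write("unchanged content\n")
        os.chmod(first, 0o644)
        os.link(first, second)       # second name for the same inode
        os.chmod(second, new_mode)   # change permissions through one name only
        # The first name reports the new mode as well.
        return stat.S_IMODE(os.stat(first).st_mode) == new_mode
```

The function returns True whenever the permission change made through one name is visible through the other.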

Alternatives

  1. Introducing the new permission handling did unintentionally break the backup semantics of BiT so it would be good to

    • make the old permission handling semantics the default again
    • also let the user decide if and when to use the new permission handling

We have PR #1086 for this (thanks to @b3nmore for preparing this PR!).

  2. Fix only #994 by using rm -rf instead of rsync --delete (as a workaround for the rsync bug).

This is only a partial fix, requires a lot of scenario testing, and would not solve the other issues (slow first snapshot; full file copy if permissions are changed).

Impact of a non-fix

Next steps

@Germar @buhtz @emtiu I think it is time now to take a decision here -> RfC :smile:

emtiu commented 11 months ago

Thank you for the deep analysis, @aryoda, I agree with every point of it.

Your proposed solution also minimizes the necessary testing of the handling of existing backups, because there will only be a few cases:

  1. existing backups from <1.2.0: no change
  2. existing backups from >=1.2.0 with the popular --no-perms --no-group --no-owner workaround: almost no change (settings handling only)
  3. existing backups from >=1.2.0 default handling: testing needed

Since this is an "existential" issue for BiT, I think @Germar's input is especially important.

buhtz commented 11 months ago

Awesome work! Thanks a lot for diving into this. ❤️ As a disclaimer I have to say I do not understand all details. But based on your summary I would support your proposal. 🚀

One question: It seems that the rsync upstream bug (https://bugzilla.samba.org/show_bug.cgi?id=12806) has not been acknowledged by rsync's upstream maintainer Wayne Davison.

Jürgen, did you contact Wayne about our issues? And did you point him to his own upstream bug?

Second question: Would it solve our problems if the upstream bug would be fixed?

emtiu commented 11 months ago

Would it solve our problems if the upstream bug would be fixed?

It would definitely solve #994. We don't understand #988 and #1437 well enough yet to say. Maybe the upstream fix would resolve those, maybe not.

In any case, it would potentially take a long time for the fixed version of rsync to appear in all distros. Changing/fixing the behavior of BiT is much more under our control.

The only point that makes me a little nervous is the handling of existing backups. We need to be very thorough in testing that. But I think that's within our capabilities.

aryoda commented 11 months ago

did you contact Wayne about our issues? And did you point him to his own upstream bug?

Not yet. I have only sent a public request to the rsync mailing list (no developer has responded so far) and then bumped the issue by adding my script to reproduce the bug (also no response so far).

Second question: Would it solve our problems if the upstream bug would be fixed?

As @emtiu wrote: Not reliably. Furthermore it does not fix the things I have just added to my analysis above: permission mappings in the mount options cause permanent re-backups (very non-transparent to the end user!).

The only point that makes me a little nervous is the handling of existing backups. We need to be very thorough in testing that. But I think that's within our capabilities.

Yes, it needs testing, but it is a direct "downgrade" (= permission changes are no longer treated as a change), so it seems much less risky than keeping the new permissions handling.

emtiu commented 11 months ago

I vote to undo the new permission handling (--perms --group --owner --executability options) introduced in BiT v1.2.0 to get rid of major issues related to this change:

How about someone create an experimental branch that implements this fix/revert? It would be very useful for testing, and we're going to need a lot of testing :)

aryoda commented 11 months ago

How about someone create an experimental branch that implements this fix/revert?

I think I can do this but I need some time (I guess until end of October - I am in the middle of another major roll-out)

It would be very useful for testing

We need a test plan for that with a matrix of backup and restore scenarios:

We then should automate these tests

E.g. my MRE bash script test.sh does this just to reproduce the rsync bug and could be used as a basis for test automation (in bash or any other scripting language).

emtiu commented 11 months ago

  • With and without existing snapshots
  • Existing snapshots with old or new permission handling
  • New snapshots with old or new permission handling
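
The three bullets above could be enumerated mechanically. Here is a minimal sketch; the dictionary keys and the "old"/"new" labels are invented for illustration and are not BiT settings.

```python
from itertools import product

HANDLING = ("old", "new")


def build_test_matrix():
    """Return the scenario combinations as a list of dicts."""
    scenarios = []
    # Without existing snapshots only the handling of the new snapshot varies.
    for new in HANDLING:
        scenarios.append({"existing": None, "new": new})
    # With existing snapshots both sides vary independently.
    for existing, new in product(HANDLING, HANDLING):
        scenarios.append({"existing": existing, "new": new})
    return scenarios
```

That gives six combinations before backup targets, file systems, or restore scenarios are added as further dimensions.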

I would even go so far as to do some tests with real, large datasets on real hard-drives. Some problems you only notice in "real world" scenarios. I have plenty of large external drives lying around to do that :)

buhtz commented 11 months ago

I fully agree that we should do heavy real-world testing here. Of course I'll support it when the time comes.

capybara-overdose commented 8 months ago

STILL broken, I just tried this mess again - "--no-perms --no-group --no-owner" doesn't work around anything, it filled a 2TB HDD with duplicate snapshots in like a week.

Sort it out already, it's been literal years, good grief

aryoda commented 6 months ago

I think we could automate testing of the snapshot source and target folders quite "easily" (not only for this issue) similar to how the rsync test suites work:

https://github.com/WayneD/rsync/blob/2f9b963abaa52e44891180fe6c0d1c2219f6686d/testsuite/rsync.fns#L247

It basically uses diff to compare
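
A comparable check for BiT could look roughly like the sketch below. It is not taken from the rsync test suite. It is a hypothetical helper that compares a source tree with a snapshot tree by file content and, optionally, by permission bits. The compare_modes switch is there because with the old permission handling the modes in the snapshot are not expected to match the source.

```python
import filecmp
import os
import stat


def _mode(path):
    """Return only the permission bits of path."""
    return stat.S_IMODE(os.stat(path).st_mode)


def tree_differences(source_root, snapshot_root, compare_modes=True):
    """Return sorted (relative_path, reason) tuples for files that differ."""
    differences = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            if os.path.islink(src):
                continue  # symlinks are ignored in this sketch
            rel = os.path.relpath(src, source_root)
            dst = os.path.join(snapshot_root, rel)
            if not os.path.isfile(dst):
                differences.append((rel, "missing"))
            elif not filecmp.cmp(src, dst, shallow=False):
                differences.append((rel, "content"))
            elif compare_modes and _mode(src) != _mode(dst):
                differences.append((rel, "mode"))
    return sorted(differences)
```

An empty result means the snapshot matches the source for every regular file found in the source. Files that exist only in the snapshot are not reported.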

buhtz commented 6 months ago

F***ing awesome! :partying_face: :tada: :pinata: Never realized that rsync itself could have a test suite. This is a very good "documentation" of its behavior. I see light at the end of the tunnel... :sun_with_face:

ACAwebbuilder commented 3 months ago

Hi! I found this bug while trying to find a solution to an issue I am having with BiT 1.2.1. I upgraded my system from Ubuntu 18.04 to Ubuntu Server 22.04. I then installed BiT 1.2.1 (which was an upgrade for me). I ran the initial scan and it went well. Since then, it is running a full scan frequently (not every time, but most times). The previous version just updated what had changed.

Is this happening because the bug wasn't fixed before this update? Or is there possibly something else going on?

Thank you for any thoughts/suggestions.

buhtz commented 2 months ago

No one is assigned to this Issue. So before acting on it I vote to move it into "2nd release from now" milestone.

Ronkn commented 2 months ago

So what's the deal with this? Is it an issue or is it intentional? The read me says the first backup takes a while but from what I can tell this "issue" is intentional?

Just trying to figure out if action will be taken on resolving this, or if I should hold back package updates on this to keep it below 1.2.0.

emtiu commented 2 months ago

This behavior is not intentional, it's an unfortunate consequence of a complex set of circumstances that involve backintime, rsync, and filesystem permissions.

We would love to fix it, but the possible solutions outlined here all have serious drawbacks. Above all, there are lots of users with different configurations and existing backups out there, and we need to make sure we don't break their stuff.

For the moment, with its limited resources, the team has been undecided on how to proceed. Meanwhile, a workaround exists by manually adding --no-perms --no-group --no-owner to the rsync options.