borgbackup / borg

Deduplicating archiver with compression and authenticated encryption.
https://www.borgbackup.org/

Error "file inode changed" on ZFS snapshot #6652

Open benurb opened 2 years ago

benurb commented 2 years ago

Have you checked borgbackup docs, FAQ, and open Github issues?

Yes. #6650 is about the same problem, but I'm not backing up a network filesystem.

Is this a BUG / ISSUE report or a QUESTION?

~~BUG~~ QUESTION

System information. For client/server mode post info for both machines.

Your borg version (borg -V).

borg 1.2.0

Operating system (distribution) and version.

Ubuntu 20.04.4 LTS

Hardware / network configuration, and filesystems used.

Core i3 7320, 2× 16 GB Kingston DDR4-2400 ECC RAM. Filesystem for backup source: ZFS (snapshot). Filesystem for backup target: ext4. The backup is not sent over a network.

How much data is handled by borg?

Around 2 TB per archive, around 152 TB in repository (2.5 TB deduplicated)

Full borg commandline that led to the problem (leave out excludes and passwords)

BORG_RELOCATED_REPO_ACCESS_IS_OK=yes borgbackup create --compression lz4 --verbose --stats ${PATH_BACKUP}::$(/bin/date '+%Y-%m-%d_%H-%M') \
        /root/ \
        /storage/iobroker/.zfs/snapshot/borg/
# ... + a few more ZFS snapshot locations

Describe the problem you're observing.

UPDATE: As it turns out, this is probably intentional behaviour by ZFS. While investigating, I found that when snapshots are accessed via the hidden .zfs directory, a temporary mount is created. That leads to the error described below, because the inode changes between the unmounted stub directory and the mounted snapshot.

My fix is not to rely on the automount feature, but to mount the snapshots manually. I changed my backup script to mount all snapshots I want to back up into /mnt/zfs/<snapshot name>, and the problem is fixed.

Maybe this info helps someone; that's why I didn't delete the issue but changed it to a question.
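The manual-mount workaround could be sketched roughly like this. The dataset names, the /mnt/zfs target, and the DRY_RUN guard are my own illustration, not taken from the original script; by default it only prints the commands it would run:

```shell
#!/bin/sh
# Sketch: mount ZFS snapshots explicitly instead of relying on the
# .zfs automount, so borg sees stable inodes for the whole run.
# Dataset names and target paths below are made-up examples.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # default: only print the commands
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

SNAP_NAME="borg"
DATASETS="storage/iobroker storage/music"   # hypothetical dataset list

MOUNTED=""
for ds in $DATASETS; do
    target="/mnt/zfs/$(basename "$ds")"
    run mkdir -p "$target"
    # A ZFS snapshot can be mounted read-only like an ordinary filesystem:
    run mount -t zfs -o ro "${ds}@${SNAP_NAME}" "$target"
    MOUNTED="$MOUNTED $target"
done

# borg create would then back up the stable paths in $MOUNTED,
# and the script would umount them again afterwards.
echo "would back up:$MOUNTED"
```

Because the mounts are created up front and torn down only after borg finishes, they cannot expire mid-backup the way .zfs automounts can.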

Original text: As far as I can tell, starting with borgbackup 1.2.0 the backup keeps failing with "file inode changed" errors:

Creating archive at "/mnt/backup_borg/files::2022-04-25_10-12"
/storage/backup/.zfs/snapshot/borg: file inode changed (race condition), skipping file
/storage/iobroker/.zfs/snapshot/borg: file inode changed (race condition), skipping file
/storage/minecraft/.zfs/snapshot/borg: file inode changed (race condition), skipping file
/storage/music/.zfs/snapshot/borg: file inode changed (race condition), skipping file
/storage/scans/.zfs/snapshot/borg: file inode changed (race condition), skipping file
------------------------------------------------------------------------------
Repository: /mnt/backup_borg/files
Archive name: 2022-04-25_10-12
Archive fingerprint: 9e3e209c74de5a120eeccb34d301da677f2d6eb3c1fb95c5c930910fe8ddb7b5
Time (start): Mon, 2022-04-25 10:12:57
Time (end):   Mon, 2022-04-25 10:15:47
Duration: 2 minutes 49.51 seconds
Number of files: 1016125
Utilization of max. archive size: 0%

Can you reproduce the problem? If so, describe how. If not, describe troubleshooting steps you took before opening the issue.

~~Yes. It doesn't happen for all snapshots on every invocation, as ZFS seems to recycle inodes. But on each invocation it happens for 2 to 6 of the 10 snapshots I'm backing up.~~

Include any warning/errors/backtraces from the system logs

ThomasWaldmann commented 2 years ago

Thanks for the interesting report. Yeah, I guess just mounting the stuff beforehand is the best way to fix this.

It would be bad if we had to add an additional stat for each directory just to work around this. And then we might have even more timing issues if the mount takes some time and the next stat still wouldn't yield the final/stable values.

ThomasWaldmann commented 2 years ago

Considering this likely happens with any "automounter", I guess we should add this to the docs / FAQ.

mwalliczek commented 2 years ago

I got the same error when migrating from borgbackup 1.1.7 to 1.2.1 with a cifs mount via autofs. After changing back to version 1.1.7 the error does not occur any more.

ThomasWaldmann commented 2 years ago

@mwalliczek that's because 1.1.x works mostly on filenames and doesn't check much (so it's quite open to all sorts of race conditions, but also tolerant of autofs / automounters, I guess).

1.2.x works with fds (file descriptors) and makes pretty sure that it only opens what it intended to open, not something different suddenly appearing in the same place. Good for avoiding race conditions, bad for automounters.

Can you just mount before running borg?
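The check that produces the error message amounts to comparing the device:inode identity of a path seen at scan time with the identity seen at open time. A minimal shell approximation of that comparison, run here on a temporary directory (borg itself does this in Python on file descriptors; on an automounted .zfs path the second stat would see a different inode because the kernel mounts the real snapshot over the stub directory in between):

```shell
# Mimic the dev:inode stability check on a temp directory.
tmp=$(mktemp -d)
before=$(stat -c '%d:%i' "$tmp")   # identity seen when the tree was scanned
# ...an automount (or an attacker) could replace the path right here...
after=$(stat -c '%d:%i' "$tmp")    # identity seen again at open time
if [ "$before" = "$after" ]; then
    echo "stable"
else
    echo "file inode changed (race condition), skipping file"
fi
rmdir "$tmp"
```

On a plain directory the two values match and the file would be backed up; on a snapshot stub that gets automounted between the two calls, they differ and borg skips the entry.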

iansmirlis commented 1 year ago

Would it be too much to allow disabling this behaviour, i.e. using filenames instead of inodes, with a per-filesystem option during borg create?

ThomasWaldmann commented 1 year ago

@iansmirlis the problem is the magic behaviour of the mountpoint and the solution was already found, see above posts.

iansmirlis commented 1 year ago

@ThomasWaldmann, sure, thanks for the clarification. See, my issue is that automount is there for a reason, and it behaves like this, magic or not, for a reason too.

borg also has good reasons to work on fds, but in this case I have to take care of mounting and unmounting snapshots manually, without any actual gain; i.e., I do not see how a race condition could occur on a read-only zfs snapshot.

Imho, it would be more convenient to have an option to disable this behavior instead of fiddling with mounts.

Having said that, I will not insist. You are far more experienced and can judge whether this is clean behavior.

logitab commented 1 year ago

I propose a command line switch to select the behavior of stat_update_check(st_old, st_curr). Automounting is a common technique; there shouldn't be a need for a workaround just to back up a filesystem that uses it. I got the impression that this is all meant to increase security, but there are environments where security is not the first concern.

Couldn't the whole issue be solved by adding an option like --nofdcheck and an if statement within stat_update_check(), or are there deeper implications?

fbicknel commented 1 year ago

> My fix is to not rely on the automount feature, but instead mount the snapshot manually. I changed my backup script to mount all snapshots I want to backup into /mnt/zfs/<snapshot name> and the problem is fixed.

I'm so glad you posted this. Until I found this, I had no clue as to what was causing this.

I thought about it a little bit, and instead of going to the trouble of mounting to some other location, I tried the following -- and it worked:

            info "($TARGET) Mounting snapshot"
            cd ${TARGET} && cd -

All it took was a chdir to the target location (e.g., /.zfs/snapshot/today/var) and ZFS mounted it for me.

laziness. that's the stuff.

ThomasWaldmann commented 1 year ago

@fbicknel thanks for adding that here.

maybe even a ls -d ${TARGET} would work (just one command and not changing the cwd)?

fbicknel commented 1 year ago

I did try an ls ${TARGET}, but that didn't work. ~~Not sure if I did something wrong there or what.~~ EDIT: See posts below. A trailing / will allow this to work.

And if I had known GH was ignoring my markdown because I replied by email, I never would have done that. :)

jdchristensen commented 1 year ago

@fbicknel What about ls ${TARGET}/, with a trailing slash?

fbicknel commented 1 year ago

I tried it. It works too, so take your pick, I guess.

gzagatti commented 1 year ago

I'd like to back up directly from the snapshot location ~/.zfs/snapshot/, because as a normal user I'm not allowed to mount filesystems or clone snapshots.

Is it indeed the case that file inodes will be unstable when backing up from, say, ~/.zfs/snapshot/my-snap1 and ~/.zfs/snapshot/my-snap2, even when doing a relative backup?

Is the only alternative then to use --files-cache ctime,size?

I'm particularly concerned about the answers to these two FAQ entries: "I am seeing 'A' (added) status for an unchanged file!?" and "It always chunks all my files, even unchanged ones!".
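For reference, borg's --files-cache=ctime,size mode compares only ctime and size, ignoring inode numbers, so unchanged files whose inodes differ between snapshot mounts are not treated as new. A hypothetical invocation (repo path, snapshot path, and timestamp are examples; the command is printed here rather than executed):

```shell
# Hypothetical borg invocation using a files cache without inode checks,
# so unchanged files in a freshly automounted snapshot stay cached.
# Repo and snapshot paths are examples; printed instead of executed.
STAMP="2024-01-01_00-00"   # stand-in for $(date '+%Y-%m-%d_%H-%M')
CMD="borg create --files-cache=ctime,size /mnt/backup_borg/files::${STAMP} ~/.zfs/snapshot/my-snap1"
echo "$CMD"
```

Note that this only affects the files cache (whether a file is re-read and re-chunked); it does not change the fd-based open check that produces the "file inode changed" message.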

ThomasWaldmann commented 1 year ago

@gzagatti

gzagatti commented 1 year ago

@ThomasWaldmann

Perfect! I checked the inode of a single file in two different snapshots and they are the same. However, it is difficult to tell whether that will always be the case.

In any case, point 2 means that I should back up from the same path, as I was doing, even when using relative backups.

benurb commented 1 year ago

> @fbicknel What about ls ${TARGET}/, with a trailing slash?

Just to add here why I didn't use that approach (or cd into the directory): I'm backing up snapshots of 10 zfs file systems. When I'm cd'ing into the snapshots before starting the backup, a few of them are already unmounted again when the backup process reaches them. Compared to the hassle of writing a script that constantly touches the mountpoints while borgbackup is running, mounting the snapshots manually seems like to more reasonable solution 😀