borgbackup / borg

Deduplicating archiver with compression and authenticated encryption.
https://www.borgbackup.org/

borg create: try to speed up unchanged file processing #8552

Open ThomasWaldmann opened 6 days ago

ThomasWaldmann commented 6 days ago

Early borg versions handled source files by path for every operation (stat, open, read, xattrs, acls, fsflags).

That was problematic due to race conditions and thus was changed in borg 1.2 to use open() to get a file descriptor and then work with the fd everywhere possible, so we can be sure to always deal with the same fs object independently of its path.
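To illustrate the fd-based approach described above, here is a minimal sketch, not borg's actual code. The function name `read_metadata_fd_based` is hypothetical, and the xattr part assumes a platform where `os.listxattr` and `os.getxattr` exist and accept a file descriptor (Linux).

```python
import os


def read_metadata_fd_based(path):
    """Open the file once, then query all metadata through the fd.

    Because every call goes through the same descriptor, all results
    describe the same filesystem object, even if the path is renamed
    or replaced while we are working.
    """
    # O_NOFOLLOW is not available on every platform, so fall back to 0.
    flags = os.O_RDONLY | getattr(os, "O_NOFOLLOW", 0)
    fd = os.open(path, flags)
    try:
        st = os.fstat(fd)  # stat of the opened object, not of the path
        xattrs = {}
        if hasattr(os, "listxattr"):  # assumption: Linux-style xattr API
            try:
                for name in os.listxattr(fd):
                    xattrs[name] = os.getxattr(fd, name)
            except OSError:
                # Filesystem without xattr support, or attribute vanished.
                pass
        return st, xattrs
    finally:
        os.close(fd)
```

The cost of this pattern is at least one `open()` plus one `fstat()` per file, which is what this issue is about.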

But: fs API calls, and especially open(), can be rather slow on some filesystems, such as network filesystems.

So, for an unchanged file (files cache hit), it currently does:

Review whether the code can be modified for the unchanged file case so that the open() and fstat() calls are not needed, without causing issues like re-introducing races.

ThomasWaldmann commented 6 days ago

Related:

ThomasWaldmann commented 5 days ago

With default borg options, it cannot be sped up while retaining its consistency properties, because even with a files cache hit (== no need to read the file's content), borg still needs to read:

We can only be sure they all refer to the same fs object if we open the file and do all these syscalls based on the fd.

So, I guess the only mode in which it could be accelerated is with --noxattrs --noacls --noflags .... Then we could just use the name-based stat values and the cached chunkids from the files cache hit and not open the file, because there would be no need for further fd-based file operations.

Check patch there: https://github.com/borgbackup/borg/issues/4498#issuecomment-1221432167
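The decision logic discussed in this comment could be sketched roughly as follows. This is an illustration only, not the linked patch and not borg's implementation: the cache layout (a plain dict keyed by path), the compared stat fields, and the names `remember` and `lookup_unchanged` are all assumptions made for the example.

```python
import os


def _stat_key(st):
    # Assumption: inode, size and mtime are enough to detect a change.
    return (st.st_ino, st.st_size, st.st_mtime_ns)


def remember(path, files_cache, chunk_ids):
    """Store a hypothetical files cache entry for path."""
    st = os.stat(path, follow_symlinks=False)
    files_cache[path] = {"key": _stat_key(st), "chunk_ids": chunk_ids}


def lookup_unchanged(path, files_cache, *, read_xattrs=True,
                     read_acls=True, read_flags=True):
    """Return ("fast", chunk_ids) if the file can be handled without open().

    The fast path is taken only when the name-based stat matches the
    cached entry AND no fd-based metadata (xattrs, acls, flags) is wanted.
    Otherwise ("slow", None) tells the caller to open the file as usual.
    """
    st = os.stat(path, follow_symlinks=False)  # name-based, no open()
    entry = files_cache.get(path)
    cache_hit = entry is not None and entry["key"] == _stat_key(st)
    needs_fd = read_xattrs or read_acls or read_flags
    if cache_hit and not needs_fd:
        return "fast", entry["chunk_ids"]
    return "slow", None
```

With the defaults (all three metadata kinds enabled) the function always answers "slow", which mirrors the point above that the shortcut only applies when xattrs, acls and flags are all disabled.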