Open ghost opened 10 years ago
An option to disable xattr and acl support could also potentially help with performance. The gain may be an unnoticeable fraction even over a large backup, but when checking the file cache it would remove the need for 3 additional system calls on top of the lstat. Tools like rdiff-backup (which I was using before; I am currently trialling attic) support this.
I might consider that if the performance impact can be shown to be large enough. Are you able to provide some initial benchmarks? (Manually disabling acl and xattr should be fairly easy.)
I ran some synthetic benchmarks that traverse a directory with and without the xattr/acl_get calls, and I admit it makes little difference on this machine (which has relatively fast CPU and I/O), although I still need to test on a slower one with a larger data set. It might also make more of a difference when the backup runs as a low-priority background task on a server where other processes are doing I/O.
Walking through a structure of ~360k files with just lstat:
# echo 3 > /proc/sys/vm/drop_caches; python3 test.py
files= 363291 time= 73.48496890068054
And with xattr.get_all and acl_get as well:
# echo 3 > /proc/sys/vm/drop_caches; python3 test.py
files= 363291 time= 80.05171489715576
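For scale, here is a quick calculation using only the two timings posted above. It is plain arithmetic on a single pair of runs, so treat the result as a rough indication rather than a measurement of the true cost:

```python
# Timings copied from the two runs above (one run each, cold cache).
files = 363291
lstat_only = 73.48496890068054
with_xattr_acl = 80.05171489715576

extra = with_xattr_acl - lstat_only          # seconds added by xattr + ACL lookups
overhead_pct = 100 * extra / lstat_only      # relative to the lstat-only walk
per_file_us = 1e6 * extra / files            # microseconds added per file

print(f"extra time: {extra:.2f} s")
print(f"relative overhead: {overhead_pct:.1f} %")
print(f"per file: {per_file_us:.1f} us")
```

That works out to roughly 9% on top of the lstat-only walk, or about 18 microseconds per file on this machine.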
test.py:

import xattr, os, time
from attic.platform import acl_get

start = time.time()
path = '/something/'

def dostat(file):
    # One lstat plus the xattr and ACL lookups whose cost is being measured.
    item = {}
    st = os.lstat(file)
    xattrs = xattr.get_all(file, follow_symlinks=False)
    acl = acl_get(file, item, 0)

c = 0
for dirname, dirnames, filenames in os.walk(path):
    for filename in filenames:
        file = os.path.join(dirname, filename)
        dostat(file)
        c += 1

now = time.time() - start
print("files=", c, "time= ", now)
I in no way consider this a good/worthy benchmark ;-)
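For anyone who wants to repeat the comparison without attic installed, a rough standard-library sketch follows. It is an assumption-laden stand-in, not the script above: it uses `os.listxattr` and `os.getxattr` (available on Linux only) instead of attic's `xattr.get_all` and `acl_get`, and the function name `walk_and_stat` and its `read_xattrs` parameter are made up for illustration.

```python
import os
import time


def walk_and_stat(root, read_xattrs=True):
    """Walk root, lstat every file, and optionally read its xattrs.

    Returns (file_count, elapsed_seconds).
    """
    # os.listxattr/os.getxattr only exist on Linux in the standard library.
    can_xattr = read_xattrs and hasattr(os, "listxattr")
    start = time.perf_counter()
    count = 0
    for dirname, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirname, filename)
            os.lstat(path)
            if can_xattr:
                try:
                    for name in os.listxattr(path, follow_symlinks=False):
                        os.getxattr(path, name, follow_symlinks=False)
                except OSError:
                    # No xattr support on this filesystem, or the file
                    # or attribute vanished while we were walking.
                    pass
            count += 1
    return count, time.perf_counter() - start
```

Calling `walk_and_stat(path, read_xattrs=False)` and then `walk_and_stat(path, read_xattrs=True)`, dropping the page cache before each run as in the commands above, gives the two numbers to compare.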
And what were your thoughts regarding the inode option?
That option might be useful in some situations, I guess. Could you give some more details on why you would need this?
I don't actually need it currently, but I wanted to mention that it is offered by other solutions, and if someone wanted to back up a network filesystem that has non-static inodes, it would be needed for the cache to be of any use.
It would theoretically also allow the cache to be rebuilt from a remote repository if a user did not want to use inode comparison (as you said, the fact that inodes are not stored in the repository is why this could not be implemented otherwise).
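To make the inode option concrete, here is a minimal sketch of a file-cache check where the inode comparison can be switched off. The `CacheEntry` layout and the names `entry_for`, `is_unchanged` and `check_inode` are hypothetical; this is not attic's actual cache format or API.

```python
import os
from collections import namedtuple

# Hypothetical cache entry; not attic's real cache layout.
CacheEntry = namedtuple("CacheEntry", "inode size mtime_ns")


def entry_for(path):
    """Build a cache entry from the file's current lstat result."""
    st = os.lstat(path)
    return CacheEntry(st.st_ino, st.st_size, st.st_mtime_ns)


def is_unchanged(cached, path, check_inode=True):
    """Return True if path still matches the cached entry.

    With check_inode=False only size and mtime are compared, so a
    filesystem that hands out different inode numbers on every mount
    can still produce cache hits.
    """
    st = os.lstat(path)
    if check_inode and cached.inode != st.st_ino:
        return False
    return cached.size == st.st_size and cached.mtime_ns == st.st_mtime_ns
```

The trade-off is that dropping the inode from the comparison makes the check slightly weaker: a file replaced by another one with the same size and mtime would be treated as unchanged.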
Regarding xattr/acl_get, I just think it's always a good idea to reduce per-file overhead where possible (and if the user doesn't need them, why make an additional 3 million calls, etc.).
In PR #235 there is a dummy xattr and acl implementation now (used for unsupported platforms right now).
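As an illustration of how a dummy implementation could double as a user-facing switch, here is a small sketch. It is not the code from that PR: the function names, the `read_xattrs`/`read_acls` flags, and the parameter name `numeric_owner` are all assumptions, with the signatures only shaped after the `xattr.get_all(file, follow_symlinks=False)` and `acl_get(file, item, 0)` calls in the benchmark script above.

```python
def xattr_get_all_noop(path, follow_symlinks=True):
    """Stand-in that reports no extended attributes and makes no system call."""
    return {}


def acl_get_noop(path, item, numeric_owner=False):
    """Stand-in that leaves item untouched, so no ACL data is recorded."""
    return None


def select_metadata_readers(real_xattr_get_all, real_acl_get,
                            read_xattrs=True, read_acls=True):
    """Pick the real readers or the no-op stand-ins.

    read_xattrs and read_acls are hypothetical options; a command-line
    flag could set them to False to skip the extra per-file calls.
    """
    get_xattrs = real_xattr_get_all if read_xattrs else xattr_get_all_noop
    get_acl = real_acl_get if read_acls else acl_get_noop
    return get_xattrs, get_acl
```

The calling code would stay the same either way; only the functions it was handed would differ.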
I was thinking about this the other day when looking at the code - and the comment at end of #88 reminded me:
I think it would be worthwhile to have options to disable inode checking, as well as acl and xattr support.
regarding inodes (from rdiff-backup docs)