backuppc / rsync-bpc

Rsync-bpc is a customized version of rsync that is used as part of BackupPC
GNU General Public License v3.0

refCount needFsck close(fd) + bpc_poolWrite_addToPool fixes #32

Closed ovidiustanila closed 11 months ago

ovidiustanila commented 2 years ago

We had some problems with a prior backup (out of disk space) and then we started hitting a lot of errors on the next trigger, like:

    G bpc_attribCache_dirWrite: failed to write attributes for dir f%2f/[path]/f239589/fthumb/attrib
    G bpc_attrib_dirWrite: rename from /data/BackupPC/pool/46/e8/46e80cf0e3b683b5f82d951a37ae7037 to /data/BackupPC/pc/[host]/368/f%2f/[path]/f239589/attrib_46e80cf0e3b683b5f82d951a37ae7037 failed
    G bpc_attribCache_dirWrite: failed to write attributes for dir f%2f/[path]/f239589/attrib
    G bpc_attrib_dirWrite: can't open/create raw /data/BackupPC/pool/f4/cc/f5cdcfe2b35f182d10d3b371335c9880 for writing
    G bpc_attribCache_dirWrite: failed to write attributes for dir f%2f/[path]/f239689/attrib

After some digging around, we found that those were caused by our open file limits:

    $ prlimit -p28051 -n
    RESOURCE DESCRIPTION              SOFT HARD UNITS
    NOFILE   max number of open files 1024 4096

rsync_bpc had a lot of open files, which caused all kinds of different errors when pooling:

    $ lsof -p 28051 | grep -Po '/.*needFsck[0-9]' | sort | uniq -c
        333 /data/BackupPC/pc/[host]/367/refCnt/needFsck1
        940 /data/BackupPC/pc/[host]/368/refCnt/needFsck1

We've increased those limits to get things going and patched rsync_bpc to close the file, to avoid this in the future. We'll run a complete fsck to clean up whatever errors were introduced during this problem and re-trigger a full backup for all hosts; hopefully that will get rid of most of the errors.
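For readers following along, the descriptor leak described above comes down to opening a marker file and never closing it. The following is a minimal sketch in C of the general pattern, not the actual rsync-bpc patch: `mark_need_fsck` is a hypothetical name, and the sketch assumes the marker only needs to exist and that nothing is written to it.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from rsync-bpc): create an empty marker file
 * such as .../refCnt/needFsck1 and release the descriptor immediately.
 * Returns 0 on success, -1 on error (errno is set by open or close).
 */
int mark_need_fsck(const char *path)
{
    /* O_CREAT without O_EXCL: succeeds whether or not the file exists. */
    int fd = open(path, O_WRONLY | O_CREAT, 0660);
    if (fd < 0)
        return -1;

    /*
     * Assumption: only the file's existence matters, so nothing is written.
     * Without this close(), every call would leave one more descriptor
     * open until the process hits its NOFILE limit.
     */
    return close(fd);
}
```

Because each call closes what it opened, calling the helper repeatedly for the same path does not increase the number of open descriptors held by the process.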

Cheers, Ovidiu

ovidiustanila commented 2 years ago

We also found that a bunch of missing attribs were related to a failure to move attrib files across filesystems (we split the pool across two disks due to a disk size limitation). rename didn't fall back to a regular file copy, which resulted in temporary files left in BackupPC/pool and missing attrib files.

    bpc_poolWrite_addToPool: replacing empty pool file

These changes worked for us and got rid of this scenario.
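As background on the cross-filesystem case: rename() fails with EXDEV when the source and destination are on different mounted filesystems, so the caller has to copy the data and remove the source itself. The following is a minimal sketch in C of that fallback, not the code from this PR; `copy_file` and `move_file` are hypothetical names, and the sketch ignores ownership, timestamps, and fsync.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: copy src to dst. Returns 0 on success, -1 on error. */
int copy_file(const char *src, const char *dst)
{
    char buf[65536];
    ssize_t n = 0;
    int ok = -1;

    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0660);
    if (out < 0) {
        close(in);
        return -1;
    }

    while ((n = read(in, buf, sizeof(buf))) > 0) {
        ssize_t off = 0;
        while (off < n) {              /* write() may be partial */
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;          /* interrupted: retry the write */
                goto done;
            }
            off += w;
        }
    }
    if (n == 0)
        ok = 0;                        /* reached end of file cleanly */

done:
    close(in);
    if (close(out) < 0)
        ok = -1;
    if (ok < 0)
        unlink(dst);                   /* do not leave a partial file behind */
    return ok;
}

/* Hypothetical helper: rename, falling back to copy + unlink on EXDEV. */
int move_file(const char *src, const char *dst)
{
    if (rename(src, dst) == 0)
        return 0;
    if (errno != EXDEV)
        return -1;                     /* some other failure: report it */
    if (copy_file(src, dst) != 0)
        return -1;
    return unlink(src);
}
```

Unlike rename() within one filesystem, the copy path is not atomic, which is why the sketch removes a partially written destination on failure.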

craigbarratt commented 11 months ago

Thanks for the PR!