Closed cmarks-hivelocity closed 6 years ago
The -r/--rsync option takes the rsync options that will be used by the spawned rsync processes:
rsync options:
-r, --rsync ... MUST be last option. rsync options as a quoted string ["-aS --numeric-ids"]. The "--from0 --files-from=... --quiet --verbose --stats --log-file=..." options will ALWAYS be added, no matter what. Be aware that this will affect
all rsync *from/filter files if you want to use them. See rsync(1) manpage for details.
Did you already try rsync's exclude/include options? Did they work incorrectly?
--exclude=PATTERN exclude files matching PATTERN
--exclude-from=FILE read exclude patterns from FILE
--include=PATTERN don't exclude files matching PATTERN
--include-from=FILE read include patterns from FILE
I'm doing a simple test: ./msrsync test1/ test2/ -r"--exclude=b"
but the folder b is still included and synced to the folder test2
This command works for me:
$ mkdir -p test1/b test2
$ ./msrsync -r "--exclude b --exclude b/**" test1/ test2/
$ find test2
test2
Have a look at this answer on Stack Overflow: https://stackoverflow.com/a/41876294.
msrsync should provide its own exclude option, or dynamically translate what the user intended to exclude into an exclude that works with files-from.
The point is that whatever works in a plain rsync invocation should also work in msrsync. I also tried excluding something/** and I'm still getting a crash.
I understand that it is not convenient, but that's how exclude and files-from work together. It is exactly the same behavior in plain rsync (as the Stack Overflow link shows): you have to exclude something AND something/**.
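To make the "exclude something AND something/**" rule less error-prone, the option string passed to -r could be generated rather than typed by hand. The following is only a minimal sketch: `rsync_exclude_options` is a hypothetical helper, not part of msrsync, and it assumes names without spaces or shell metacharacters.

```python
def rsync_exclude_options(names):
    """Build an rsync option string that excludes each name and its contents.

    Hypothetical helper (not part of msrsync). Assumes simple names
    without spaces, since no quoting is applied.
    """
    options = []
    for name in names:
        name = name.rstrip("/")
        # With --files-from in play, the entry itself and everything
        # below it both need their own exclude pattern.
        options.append("--exclude " + name)
        options.append("--exclude " + name + "/**")
    return " ".join(options)


if __name__ == "__main__":
    # The resulting string is what would go inside the quotes after -r.
    print(rsync_exclude_options(["b"]))
```

The generated string would then be passed as the quoted -r value, in the same way as the command shown earlier in this thread.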
Regarding the crash you mentioned, I guess you are referring to your issue https://github.com/jbd/msrsync/issues/12? It should be fixed in https://github.com/jbd/msrsync/commit/e2368315eee08df6d55a86978b411e7b97798f90.
An msrsync exclude option is a good idea. It's not logical to walk part of a tree that the spawned rsync will exclude anyway. But I'm afraid that as soon as I try to implement this, I will need to add regular expression support and whatnot. For the moment, I'll stick with the rsync exclude option.
@jbd I decided to take a shot at implementing exclude at the msrsync level because we had a few crawls going over backups/snapshots, which inflated the file list with entries that rsync would exclude anyway (but that take a long time to go over). So here's a gist of my os.walk wrapper.
My solution is more basic than rsync's exclude, but it meets our needs and uses only built-in libraries (Python 2.6 compatible): fnmatch and re. fnmatch uses Unix shell-style wildcards. The big caveat is that exclusions apply only to the file name or directory name and do not consider the full path.
It uses fnmatch.translate to generate regular expressions and compiles two lists of them (one for files and one for directories) into one pattern each, and each pattern is applied once per os.walk iteration.
This relies on the built-in ability to modify the os.walk dirnames list in place, which causes the walk to skip anything excluded. It probably doesn't need to modify the files list in place and could just return it, but I only just now thought of that :).
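For readers who want the idea without opening the gist, the approach described above can be sketched roughly as follows. This is an independent illustration written for Python 3, not the gist's actual code: `compile_patterns` and `walk_excluding` are hypothetical names, and as noted above, matching is on bare file and directory names only.

```python
import fnmatch
import os
import re


def compile_patterns(patterns):
    """Combine shell-style patterns into one compiled regex (None if empty)."""
    if not patterns:
        return None
    # fnmatch.translate turns a shell wildcard into a regular expression.
    return re.compile("|".join("(?:%s)" % fnmatch.translate(p) for p in patterns))


def walk_excluding(top, exclude_dirs=(), exclude_files=()):
    """Like os.walk, but skip names matching the given shell-style patterns.

    Hypothetical sketch: patterns are matched against the bare name,
    never against the full path.
    """
    dir_re = compile_patterns(exclude_dirs)
    file_re = compile_patterns(exclude_files)
    for dirpath, dirnames, filenames in os.walk(top):
        if dir_re is not None:
            # Slice assignment edits the list os.walk holds, so excluded
            # directories are never descended into.
            dirnames[:] = [d for d in dirnames if not dir_re.match(d)]
        if file_re is not None:
            # The file list only needs filtering, not in-place editing.
            filenames = [f for f in filenames if not file_re.match(f)]
        yield dirpath, dirnames, filenames
```

For example, `walk_excluding("/data", exclude_dirs=[".snapshot"], exclude_files=["*.tmp"])` would never enter any directory named .snapshot, which is where the time saving over an rsync-level exclude comes from.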
https://gist.github.com/agates/51db90f77ea1a8f658906a94f9161d4a
Lastly, thank you very much for this tool. We have used it to synchronize at least a petabyte.
Hello,
thank you for taking the time to report and write this up. I understand the problem and the inconvenience of the rsync exclude mechanisms.
Right now, I don't have time to integrate this into the project, but I implemented something very naive some time ago in the "exclude" branch (Oct. 2018):
https://github.com/jbd/msrsync/blob/exclude/msrsync#L512
It's very basic and not as flexible as your proposal but maybe it could help remove .snapshot directories and whatnot right now.
I'll try to have a look at this and other things in the future.
late and prob a dumb Q:
i run this
rsync -av --progress --delete --exclude 'snapraid.content' --exclude 'snapraid.content.lock' --exclude 'aquota.group' --exclude 'aquota.user' --exclude 'lost+found/' --exclude 'homedir/Malcolm/Borgbackup/' --exclude 'homedir/Bo/iTunes/' /srv/mergerfs/Data/ root@100.126.215.89:/srv/mergerfs/Data/homedir/bo/Borgbackup/Backup-Bo-OMV_NAP/
works fine, but what would that look like with msrsync?
Is it possible to use excludes to avoid syncing certain subfolders as part of the -r arguments?