martymac / fpart

Sort files and pack them into partitions
https://www.fpart.org/
BSD 2-Clause "Simplified" License

fpsync: add -T option to specify absolute path for the copy tool. #42

Closed jbd closed 2 years ago

jbd commented 2 years ago

This patch adds a -T option to specify the absolute path of the copy tool.

It lets you run a specially compiled copy tool, a wrapper (for example, to submit jobs on an HPC cluster) or simply a copy tool that is not in the PATH.

I'm using it to wrap "rsync" with a container that bind-mounts a randomly chosen mountpoint of the same server:/export.

For example, I have 16 mountpoints of the same server:/mnt/src directory (the same applies to a destination directory) that I want to synchronize to some destination, which could also use the same pattern:

/mnt/src00
/mnt/src01
...
/mnt/src15
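The naming scheme above can be enumerated with a short loop, for instance to check that every mountpoint is present before starting a sync. This is a minimal sketch and not part of the patch; `list_mounts` is a hypothetical helper name, and the /mnt/src prefix and the count of 16 are taken from the example above.

```shell
#!/bin/sh
# list_mounts PREFIX COUNT: print PREFIX00 .. PREFIX(COUNT-1), one per line.
# Hypothetical helper; the body runs in a subshell so its variables stay local.
list_mounts() (
    prefix=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s%02d\n' "$prefix" "$i"
        i=$((i + 1))
    done
)

# Report any of the 16 expected source mountpoints that is not a directory.
list_mounts /mnt/src 16 | while IFS= read -r mnt; do
    [ -d "$mnt" ] || echo "missing: $mnt" >&2
done
```

The same call with `/mnt/dst` and the matching count would cover the destination side.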

And I've got an rsync wrapper that uses apptainer (http://apptainer.org/) as an indirection to select a random srcXX directory as the source. Something like:

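# Pick one of the 16 source mountpoints (/mnt/src00 .. /mnt/src15) at random.
NUM_SRC=$(printf "%02d" $((RANDOM % 16)))
SRC_MNT=/mnt/src${NUM_SRC}
# Pick one of the 8 destination mountpoints (/mnt/dst00 .. /mnt/dst07) at random.
NUM_DST=$(printf "%02d" $((RANDOM % 8)))
DST_MNT=/mnt/dst${NUM_DST}
# APPTAINER_BIN, RSYNC_CONTAINER, SRC and DST are expected to be set elsewhere.
exec "$APPTAINER_BIN" exec -e -B "$SRC_MNT:$SRC" -B "$DST_MNT:$DST" "$RSYNC_CONTAINER" /bin/rsync "$@"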

And I run:

fpsync -n 64 -m rsync -T rsync_apptainer.sh -d fpsync_shared -t fpsync_temp /mnt/src /mnt/dst/
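Before pointing fpsync at such a wrapper, it can help to check the random selection on its own. The following is a rough sketch under assumptions, not the wrapper actually used: `pick_mount` is a hypothetical helper, it relies on bash's `$RANDOM`, and it only prints which mountpoints would be bound instead of running apptainer or rsync.

```shell
#!/usr/bin/env bash
# pick_mount PREFIX COUNT: print one of PREFIX00 .. PREFIX(COUNT-1) at random.
# Hypothetical helper; the body runs in a subshell so its variables stay local.
pick_mount() (
    prefix=$1
    count=$2
    printf '%s%02d\n' "$prefix" $((RANDOM % count))
)

# Dry run: show the mountpoints one invocation would bind, without copying anything.
echo "would bind $(pick_mount /mnt/src 16) and $(pick_mount /mnt/dst 8)"
```

Running it a few times should show the selection moving across the srcXX and dstXX directories, which is what spreads the fpsync jobs over the mountpoints.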

Some storage architectures (hardware, filer or NAS) with multiple NFS head nodes in front of a parallel filesystem can benefit from this sort of parallelism. Of course, your mileage may vary.

martymac commented 2 years ago

Committed, with minor changes. Thanks a lot for your contribution!