Closed jemrobinson closed 5 years ago
dd if=/dev/urandom of=10M.txt bs=1M count=10
dd if=/dev/urandom of=100M.txt bs=1M count=100
dd if=/dev/urandom of=1G.txt bs=1M count=1024
dd if=/dev/urandom of=10G.txt bs=1M count=10240
time rsync -prtlzv --delete --progress /datadrive/mirrordaemon/transfer_tests/10M.txt mirrordaemon@10.1.0.20:/datadrive/mirrordaemon/transfer_tests
sending incremental file list
10M.txt
10,485,760 100% 26.10MB/s 0:00:00 (xfr#1, to-chk=0/1)
sent 10,491,816 bytes received 35 bytes 1,907,609.27 bytes/sec
total size is 10,485,760 speedup is 1.00
real 0m5.234s
user 0m0.354s
sys 0m0.027s
Size | Time |
---|---|
10MB | 5.2s |
100MB | 6.5s |
1GB | 1m |
10GB | 10m |
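
To make the table easier to compare, here is a rough sketch that turns each row into an effective throughput. It assumes sizes follow the dd commands above (so 1GB = 1024MB) and that "1m" and "10m" mean about 60s and 600s; `throughput_mbs` is a hypothetical helper name, not something used in the tests.

```shell
# Effective throughput in MB/s from a size in MB and a wall-clock time in seconds.
throughput_mbs() {
    awk -v size="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", size / secs }'
}

# Rows of the table above; the times for 1GB and 10GB are approximate.
throughput_mbs 10 5.2       # 10MB file
throughput_mbs 100 6.5      # 100MB file
throughput_mbs 1024 60      # 1GB file
throughput_mbs 10240 600    # 10GB file
```

On these assumptions the small files appear dominated by a fixed start-up cost of around 5s, while the 1GB and 10GB transfers level off at roughly 17MB/s.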
for i in $(seq 1 1000); do
dd if=/dev/urandom of=$i.txt bs=1M count=1
done
time rsync -prtlzv --delete --progress /datadrive/mirrordaemon/transfer_tests/* mirrordaemon@10.1.0.20:/datadrive/mirrordaemon/transfer_tests
sent 1,049,224,326 bytes received 633,452 bytes 29,573,458.54 bytes/sec
total size is 1,048,576,000 speedup is 1.00
real 0m35.480s
user 0m35.187s
sys 0m3.733s
top while transferring a 10GB file:
Tasks: 140 total, 3 running, 70 sleeping, 0 stopped, 0 zombie
%Cpu(s): 21.7 us, 0.8 sy, 0.0 ni, 77.3 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
KiB Mem : 8166660 total, 179232 free, 367624 used, 7619804 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 7484420 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
866 mirrord+ 20 0 14428 2848 2132 R 76.1 0.0 0:18.98 rsync
867 mirrord+ 20 0 47060 5460 4788 R 15.3 0.1 0:03.03 ssh
top while transferring two 10GB files:
Tasks: 141 total, 2 running, 71 sleeping, 0 stopped, 0 zombie
%Cpu(s): 28.5 us, 1.1 sy, 0.0 ni, 69.8 id, 0.0 wa, 0.0 hi, 0.7 si, 0.0 st
KiB Mem : 8166660 total, 166624 free, 371784 used, 7628252 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 7480260 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
756 mirrord+ 20 0 14428 2936 2236 R 100.0 0.0 0:10.04 rsync
757 mirrord+ 20 0 49320 8260 5196 S 18.1 0.1 0:01.92 ssh
762 mirrord+ 20 0 44532 3968 3356 R 2.9 0.0 0:00.19 top
Resized MirrorVMExternalPyPI to F72s_v2, with 72 vCPUs and 144GB RAM.
top while transferring two 10GB files:
Tasks: 697 total, 1 running, 346 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.6 us, 0.0 sy, 0.0 ni, 99.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 14852920+total, 69836416 free, 2755740 used, 75937040 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 14461584+avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3499 mirrord+ 20 0 14428 3020 2312 S 37.4 0.0 0:10.77 rsync
3500 mirrord+ 20 0 47200 6184 5288 S 5.0 0.0 0:01.46 ssh
Remove the z option from rsync:
> time rsync -prtlv --delete --progress /datadrive/mirrordaemon/transfer_tests/10G*.txt mirrordaemon@10.1.0.20:/datadrive/mirrordaemon/transfer_tests
sending incremental file list
10G.txt
702,447,616 6% 223.44MB/s 0:00:43
Also disable SSH compression:
> time rsync -prtlv --delete --progress /datadrive/mirrordaemon/transfer_tests/10G*.txt mirrordaemon@10.1.0.20:/datadrive/mirrordaemon/transfer_tests
sending incremental file list
10G.txt
1,198,456,832 11% 145.75MB/s 0:01:03
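
For reference, one way SSH compression could be switched off explicitly is to pass the option through rsync's `-e` flag. This is only a sketch: `Compression` is a standard OpenSSH client option, but the function name is hypothetical, and the function prints the command rather than running it. It is not necessarily how compression was disabled for the run above.

```shell
# Sketch only: print (do not run) an rsync command that forces SSH compression off.
# $1 is the source path, $2 is the destination (for example user@host:/path).
rsync_without_ssh_compression() {
    printf 'rsync -prtlv --delete --progress -e "ssh -o Compression=no" %s %s\n' "$1" "$2"
}

rsync_without_ssh_compression /datadrive/mirrordaemon/transfer_tests/10G.txt \
    mirrordaemon@10.1.0.20:/datadrive/mirrordaemon/transfer_tests
```

OpenSSH's documented default for `Compression` is already `no`, so this should only change anything if a client config file turns it on.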
From here: rsync will only copy one chunk of data at a time.
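
If a single rsync process is the bottleneck, one possible workaround is to run several of them side by side. A minimal sketch, assuming `find` and an `xargs` that supports `-P`; `parallel_push` and the `TRANSFER_CMD` override are hypothetical names, and nothing in this thread confirms this approach was adopted. `--delete` is left out on purpose, since per-entry invocations would not prune files removed from the source.

```shell
# Hypothetical helper: start one transfer process per top-level entry of the
# source directory ($1), sending to the destination ($2) and running up to
# $3 transfers (default 4) at the same time.
# TRANSFER_CMD is an assumed override hook; by default each entry is sent with rsync.
parallel_push() {
    find "$1" -mindepth 1 -maxdepth 1 -print0 |
        xargs -0 -P "${3:-4}" -I{} ${TRANSFER_CMD:-rsync -prtl} {} "$2"
}

# Example call (not executed here), reusing the paths from the tests above:
# parallel_push /datadrive/mirrordaemon/transfer_tests \
#     mirrordaemon@10.1.0.20:/datadrive/mirrordaemon/transfer_tests 4
```

Each process still copies one chunk at a time, so any gain would come from using more than one core and more than one SSH connection.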
@jemrobinson How confident are we that the bandersnatch synchronisation from the public PyPI site to the external mirror is complete?
If it is more reliable than rsync, then could we also use it for our external->internal mirror sync? I note that bandersnatch also supports a blacklist and whitelist, which could be great for our Tier 3 mirrors.
However, bandersnatch looks like it is designed to be run from the internal mirror in our arrangement, which we would not want.
I think one way to incorporate bandersnatch would be to populate the Tier 3 external mirror from the Tier 2 mirror using bandersnatch's whitelist, while keeping the push to the internal mirror as it is. This would require a Tier 2 mirror (or perhaps we could call this a "full mirror") to be available whenever there's Tier 3 data, which might be overkill.
I like this idea a lot 👍
I think it works for the likely majority use case where one SHM supports a range of DSGs.
rsync failed twice when pushing updates from ExternalPyPI to InternalPyPI, with errors on InternalPyPI that we could not diagnose.
Is it possible to catch and recover from these types of error?
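
One simple way to catch and recover would be to wrap the push in a retry loop that checks rsync's exit status. A minimal sketch, assuming the failures are transient; `retry` is a hypothetical helper, and the attempt count and delay in the example call are placeholders.

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds between
# attempts. Returns 0 as soon as one attempt succeeds, 1 if every attempt fails.
retry() {
    local max_attempts="$1" delay="$2" attempt=1
    shift 2
    while ! "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
    return 0
}

# Example call (not executed here):
# retry 3 60 rsync -prtlv --delete /datadrive/mirrordaemon/transfer_tests/ \
#     mirrordaemon@10.1.0.20:/datadrive/mirrordaemon/transfer_tests
```

Because rsync skips files whose size and modification time already match on the destination, a retried run should mostly pick up where the failed one stopped rather than resending everything.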
NB. The PyPI mirror is much bigger than the CRAN one, so errors are more likely to affect it.