RsyncProject / rsync

An open source utility that provides fast incremental file transfer. It also has useful features for backup and restore operations among many other use cases.
https://rsync.samba.org

rsync -u --inplace --partial -a can't resume transfer #236

Open nahuel opened 3 years ago

nahuel commented 3 years ago

Suppose you have a file in hostA:

  hostA$ ls -l /tmp/files
  -rw-rw-r-- 2 root root 563016 Jan 10 15:01 test.txt

You download it from hostB using:

  hostB$ rsync -u --inplace --partial -a hostA::files/* .

If the transfer is aborted, hostB will get only a partial file:

  hostB$ ls -l /tmp/files
  -rw-rw-r-- 2 root root 2024 Jan 11 18:00 test.txt

BUT the ctime/mtime of test.txt on hostB is now NEWER than that of test.txt on hostA (and mtime == ctime). So, if you run the same rsync -u command again:

  hostB$ rsync -u --inplace --partial -a hostA::files/* .

Rsync will SKIP THE FILE, because test.txt on hostB is "newer" than test.txt on hostA. So you CAN'T resume with the same rsync -u command, and the output will make you think there are no differences.
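The skip decision can be reproduced locally without rsync. The following is a minimal sketch, assuming a shell whose `test` supports `-nt` (bash and dash do); the file names `src.txt` and `dst.txt` are hypothetical stand-ins for the copies on hostA and hostB, and the comparison only approximates the "newer on the receiver" rule that --update applies.

```shell
#!/bin/sh
# Simulate the state left behind by an aborted in-place transfer.
set -eu
work=$(mktemp -d)
src="$work/src.txt"   # hypothetical stand-in for test.txt on hostA
dst="$work/dst.txt"   # hypothetical stand-in for the partial copy on hostB

head -c 563016 /dev/zero > "$src"
touch -t 202001101501.00 "$src"   # give the source an old mtime

head -c 2024 "$src" > "$dst"      # partial data; its mtime is "now"

# --update skips files whose mtime is newer on the receiving side.
if [ "$dst" -nt "$src" ]; then
    result="skip"
else
    result="transfer"
fi
echo "$result"
rm -rf "$work"
```

The partial file is shorter than the source, but the size is never consulted because the mtime comparison already decides the outcome.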

Note this is really needed, because there are scenarios where checksum comparison can't be used and only comparison by time works, for example to avoid overwriting changes made to test.txt on hostB. I also need to use --inplace.

A reproducible test:

  ~$ cd /tmp/
  /tmp$ mkdir a b
  /tmp$ cd a
  /tmp/a$ head -c 100000 /dev/urandom > test
  /tmp/a$ ls -l test
  -rw-r--r-- 1 nahuel nahuel 100000 Jan 27 18:27 test
  /tmp/a$ cd ../b
  /tmp/b$ timeout 3 rsync -u --inplace --partial --bwlimit=2k --progress -va ../a/test .
  sending incremental file list
  test
           32,768 32% 0.00kB/s 0:00:00
  rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(632) [sender=3.1.1]
  rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at io.c(504) [generator=3.1.1]
  /tmp/b$ ls -l test
  -rw------- 1 nahuel nahuel 0 Jan 27 18:28 test
  /tmp/b$ timeout 3 rsync -u --inplace --partial --bwlimit=2k --progress -va ../a/test .
  sending incremental file list

  sent 59 bytes received 12 bytes 142.00 bytes/sec
  total size is 100,000 speedup is 1,408.45
  /tmp/b$ ls -l test
  -rw------- 1 nahuel nahuel 0 Jan 27 18:28 test
  /tmp/b$

To avoid this bug, rsync must create the file with ctime=mtime=0. And if the file already exists before the transfer, rsync -u must not change its current ctime/mtime. The ctime/mtime values must be updated ONLY after the transfer has completed successfully. But after researching more, I see POSIX has NO way to disable mtime updates during write() calls, so there is no way to atomically leave a partial file with an mtime=0 mark while using --inplace.
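The point about write() can be checked directly. This is a small sketch, assuming GNU coreutils (`stat -c %Y` is GNU-specific): a file is marked with a very old mtime, and a single append moves the mtime back to the current time, so any marker set before the writes is lost.

```shell
#!/bin/sh
# Show that writing to a file replaces a previously set old mtime.
set -eu
f=$(mktemp)

touch -t 197001020000.00 "$f"   # mark the file with a very old mtime
before=$(stat -c %Y "$f")       # GNU stat: mtime in seconds since the epoch

printf 'data' >> "$f"           # any write updates mtime to the current time
after=$(stat -c %Y "$f")

echo "before=$before after=$after"
rm -f "$f"
```

A process that is killed between the last write and a later timestamp reset therefore always leaves a "new" mtime behind.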

rsync --update --partial (no --inplace flag) can do it because it first transfers to a temporary file, then updates its mtime (to 0 if unsuccessful, or to the original file's mtime if the transfer completed), and then rename()s the file.
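The temporary-file pattern described above can be sketched in plain shell. This is an illustration of the general write, set mtime, rename sequence for the successful case only, not rsync's actual implementation; the names `src`, `dst`, and the `.dst.XXXXXX` temporary name are hypothetical, and `cp` stands in for the network transfer.

```shell
#!/bin/sh
# Write to a temporary file, fix its mtime, then rename it into place.
set -eu
work=$(mktemp -d)
src="$work/src"
dst="$work/dst"
printf 'complete contents\n' > "$src"
touch -t 202001101501.00 "$src"

tmp=$(mktemp "$work/.dst.XXXXXX")   # same directory, so mv is a rename
cp "$src" "$tmp"                    # stands in for the transfer
touch -r "$src" "$tmp"              # copy the source mtime before publishing
mv "$tmp" "$dst"                    # rename keeps the mtime just set

# Neither file is newer than the other once the mtimes match.
if [ "$dst" -nt "$src" ] || [ "$src" -nt "$dst" ]; then
    same="no"
else
    same="yes"
fi
echo "mtimes equal: $same"
rm -rf "$work"
```

Because the destination name only ever appears after the timestamp has been set, an interruption can never expose a file that has both partial data and a fresh mtime.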

So I think this can't be solved, but a warning about this unexpected -u --inplace --partial behavior should be added to the rsync manpage/CLI.

WayneD commented 2 years ago

I'm adding a warning about this to the man page for now, as suggested. In the future I will consider having the receiving side back-date an interrupted in-place file to an mtime slightly less than the source file's mtime, and probably only if --update was used.
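Until something like that exists in rsync itself, the back-dating idea can be approximated by hand after an interrupted transfer. The following is only a sketch of a manual workaround, assuming GNU coreutils (`stat -c %Y` and `touch -d @SECONDS` are GNU-specific) and assuming the source mtime is readable from where the command runs; the one-second offset is an arbitrary choice for "slightly less", and the file names are hypothetical.

```shell
#!/bin/sh
# Back-date an interrupted in-place copy so a later "rsync -u" retries it.
set -eu
work=$(mktemp -d)
src="$work/src"
dst="$work/dst"
head -c 100000 /dev/zero > "$src"
touch -t 202001271827.00 "$src"
head -c 32768 "$src" > "$dst"            # interrupted copy with a fresh mtime

src_mtime=$(stat -c %Y "$src")           # GNU stat: seconds since the epoch
touch -d "@$((src_mtime - 1))" "$dst"    # GNU touch: one second older than source

# The source now looks newer, so an mtime-based update check would not skip it.
if [ "$src" -nt "$dst" ]; then
    decision="transfer"
else
    decision="skip"
fi
echo "$decision"
rm -rf "$work"
```

This only helps when the partial file has no local changes worth protecting, since back-dating it removes the protection that --update would otherwise give it.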