RsyncProject / rsync

An open source utility that provides fast incremental file transfer. It also has useful features for backup and restore operations among many other use cases.
https://rsync.samba.org

Does rsync provide atomic/ACID guarantees? #519

Open safinaskar opened 1 year ago

safinaskar commented 1 year ago

Does rsync provide atomic/ACID guarantees? Let me describe what I mean. Suppose I start rsync with rsync --partial --timeout=... ..., and rsync is then terminated for any reason (for example, Ctrl-C, SIGKILL, the internet connection going down, an IP address change, or network problems after a suspend). I then start rsync again with the same options, and hopefully it continues where it left off and eventually runs to completion. I want a guarantee that:

In other words, I want a guarantee that a repeated rsync --partial ... is consistent/atomic/ACID/whatever. Such guarantees should also be properly documented.

It seems that these guarantees do not currently hold. For example, in one particular situation rsync starts the second transfer from scratch: https://github.com/WayneD/rsync/issues/330 (note that this bug is always reproducible). Also, I once noticed that after multiple rsync --partial runs, some files with names like .file.Z5oY3j were left behind.
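A minimal sketch of how leftover files like .file.Z5oY3j could be spotted after a series of interrupted runs. It assumes rsync's usual temporary-file naming (a leading dot, the original name, then a dot and six random characters); the function name find_leftover_temp_files is hypothetical, and an ordinary dotfile can match the same pattern, so the results are candidates to inspect, not files to delete blindly.

```python
import os
import re

# Heuristic (an assumption, not a documented contract): rsync's in-progress
# temporary files are usually named ".<original name>.<6 random characters>",
# for example ".file.Z5oY3j".
TEMP_NAME = re.compile(r"^\..+\.[A-Za-z0-9]{6}$")


def find_leftover_temp_files(root):
    """Return sorted paths under root whose names look like rsync temp files."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if TEMP_NAME.match(name):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Running this on the destination directory after the final successful run would show whether any interrupted attempt left such files behind.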

realsimix commented 1 year ago

I think for what you want you have to add the --partial-dir option.

Regards, Simon

safinaskar commented 1 year ago

@realsimix, the bug is still reproducible (though not always) with --partial-dir, following the exact instructions at https://github.com/WayneD/rsync/issues/330, with rsync 3.2.7 on both sides.

After a restart, rsync starts from scratch, writing its output to .ur.J7rnqK and ignoring the partially written data already in the partial-dir.

In fact, https://github.com/WayneD/rsync/issues/330 is not that important. What I actually need is a guarantee that the file has been transferred correctly once the loop ends (even if the weird behavior described in https://github.com/WayneD/rsync/issues/330 is present).

What I want is a checklist of things to keep in mind to make sure the transfer is always correct, similar to https://www.sqlite.org/howtocorrupt.html
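One item such a checklist might contain is an independent verification pass after the loop ends. The sketch below is an assumption about how that could look, not a documented rsync guarantee: it runs a dry-run comparison by checksum and treats any itemized output as "something still differs". The function names are hypothetical; the options used (--recursive, --checksum, --dry-run, --itemize-changes) are standard rsync options, but exactly what gets reported depends on which other options are in play.

```python
import subprocess


def build_verify_command(src, dest):
    """Dry-run comparison by full-file checksum; nothing is modified."""
    return ["rsync", "--recursive", "--checksum", "--dry-run",
            "--itemize-changes", src, dest]


def reported_differences(rsync_output):
    """Return the non-empty lines rsync printed (one per item it would change)."""
    return [line for line in rsync_output.splitlines() if line.strip()]


def transfer_is_verified(src, dest, run=subprocess.run):
    """True when rsync exits 0 and reports nothing left to change.

    `run` is injectable so the logic can be exercised without a real rsync.
    """
    result = run(build_verify_command(src, dest),
                 capture_output=True, text=True)
    return result.returncode == 0 and not reported_differences(result.stdout)
```

A checksum pass reads every file on both sides, so it is slow on large trees; it trades time for a check that does not rely on sizes and modification times.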

realsimix commented 1 year ago

It seems to work for me. Running with --partial --partial-dir=./tmp behaves differently in the two versions I tested, rsync-3.0.6 and rsync-3.3.0-pre1, but the result is correct in both cases: the transfer resumes from the already-transferred state.

I think your problem is something else: if your rsync transfer stops for whatever reason, how does the remote side know? If you terminate the local rsync and restart it one second later, it is quite possible that the remote instance has not terminated yet, and you run into trouble when you fire up a new instance too soon.
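The concern above suggests leaving a pause between attempts. Here is a minimal sketch of a retry loop that does so, using only the options named in this thread (--partial, --partial-dir, --timeout). The function names, the .rsync-partial directory name, and the cooldown value are all assumptions for illustration; in particular, the cooldown is a guess at how long a stale remote rsync needs to give up and exit, not a documented bound.

```python
import subprocess
import time


def build_transfer_command(src, dest, partial_dir=".rsync-partial", timeout_s=60):
    """Assemble an rsync command line from the options named in this thread."""
    return ["rsync", "--partial", f"--partial-dir={partial_dir}",
            f"--timeout={timeout_s}", src, dest]


def run_until_success(cmd, cooldown_s=120, max_attempts=10,
                      run=subprocess.call, sleep=time.sleep):
    """Re-run cmd until it exits 0; return the number of attempts used.

    `run` and `sleep` are injectable so the loop can be tested without rsync.
    """
    for attempt in range(1, max_attempts + 1):
        if run(cmd) == 0:  # rsync exit status 0 means success
            return attempt
        if attempt < max_attempts:
            # Assumed safeguard: wait so a stale remote rsync can exit first.
            sleep(cooldown_s)
    raise RuntimeError(f"rsync still failing after {max_attempts} attempts")
```

Usage would be along the lines of run_until_success(build_transfer_command("big.iso", "host:/data/")), where the paths are placeholders. A pause reduces the chance of two remote instances overlapping but does not prove that the old one is gone.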

safinaskar commented 1 year ago

@realsimix, thanks for the quick answer! Keep in mind that I do not terminate rsync with Ctrl-C or by any other manual means. I disconnect the internet (intentionally, to reproduce the bug, i.e. to simulate conditions that could actually happen in production), then wait for rsync to terminate on timeout. Sometimes this reproduces the bug.

> If you terminate the local rsync and after one second restart it again, it's quite possible that the remote instance did not terminate yet and you end up in troubles when you fire up a new instance too fast

Yes, it is quite possible that this is exactly what happens. I disconnect the internet, so it is quite possible that the remote side doesn't know the local rsync has terminated.

But this problem should be documented! That is what I'm trying to say. The docs should say something like: "don't do so and so or you will be in trouble".