cjnaz / rclonesync-V2

A Bidirectional Cloud Sync Utility using rclone
MIT License

Sync fails when files are being naively written to #24

Open mjg0 opened 5 years ago

mjg0 commented 5 years ago

Background: I'm trying to set up rclonesync.py for an HPC environment, meaning that good I/O practices can't be guaranteed. Users who don't know how to avoid hammering files, or who of necessity use programs that aren't designed for HPC and don't handle I/O properly, could be writing to files many times per second at sync time.

When a file is being constantly written to, though, running rclonesync.py fails hard enough that it must be rerun with --first-sync before it can be run normally again. I can understand why it would fail (a file that's getting hammered isn't a good candidate for syncing), but is there anything preventing the failure from being graceful, i.e. allowing the next run to succeed without requiring --first-sync? I know that bidirectional syncing is a hard problem, so I may be missing something obvious that means one simply has to be careful not to engage in foolishness like for i in {1..99999}; do echo $i >> myfile; done while syncing. However, if there is a way to handle such cases more gracefully, it would mean a more robust rclonesync.py.

The following case illustrates my point. In essence, I run rclonesync.py while a local file is being written to 100 times per second, causing a critical error abort:

$ rclone mkdir dropbox:rclonesync_testdir
$ mkdir rclonesync_testdir
$ cd rclonesync_testdir/
$ echo "file 1" > file1.txt
$ rclone copy file1.txt dropbox:rclonesync_testdir
$ rclonesync.py . dropbox:rclonesync_testdir --first-sync # works
2019-05-03 16:22:24,816:  ***** BiDirectional Sync for Cloud Services using rclone *****
2019-05-03 16:22:24,845:  Lock file created: </tmp/rclonesync_LOCK_._dropbox__rclonesync_testdir_>
2019-05-03 16:22:24,846:  Synching Path1  <./>  with Path2  <dropbox:/rclonesync_testdir/>
2019-05-03 16:22:24,846:  Command line:  <Namespace(Path1='.', Path2='dropbox:rclonesync_testdir', check_access=False, check_filename='RCLONE_TEST', config=None, dry_run=False, filters_file=None, first_sync=True, force=False, max_deletes=50, no_datetime_log=False, rc_verbose=None, rclone='rclone', rclone_args=None, remove_empty_directories=False, verbose=False, workdir='/fslhome/micgre93/.rclonesyncwd')>
2019-05-03 16:22:24,846:  >>>>> --first-sync copying any unique Path2 files to Path1
2019-05-03 16:22:26,280:  >>>>> Path1 Checking for Diffs
2019-05-03 16:22:26,280:  >>>>> Path2 Checking for Diffs
2019-05-03 16:22:26,280:  >>>>> No changes on Path2 - Skipping ahead
2019-05-03 16:22:26,280:  >>>>> Synching Path1 to Path2
2019-05-03 16:22:26,876:  >>>>> Refreshing Path1 and Path2 lsl files
2019-05-03 16:22:27,494:  Lock file removed: </tmp/rclonesync_LOCK_._dropbox__rclonesync_testdir_>
2019-05-03 16:22:27,494:  >>>>> Successful run.  All done.
$
$
$ for i in {1..1000}; do sleep 0.01; echo $i >> file2.txt; done & rclonesync.py . dropbox:rclonesync_testdir
[1] 84345
2019-05-03 16:23:31,004:  ***** BiDirectional Sync for Cloud Services using rclone *****
2019-05-03 16:23:31,037:  Synching Path1  <./>  with Path2  <dropbox:/rclonesync_testdir/>
2019-05-03 16:23:32,648:       1 file change(s) on Path1:    1 new,    0 newer,    0 older,    0 deleted
2019/05/03 16:23:33 ERROR : file2.txt: Failed to copy: upload failed: Post https://content.dropboxapi.com/2/files/upload: can't copy - source file is being updated (size changed from 288 to 380)
2019/05/03 16:23:33 ERROR : Dropbox root 'rclonesync_testdir': not deleting files as there were IO errors
2019/05/03 16:23:33 ERROR : Dropbox root 'rclonesync_testdir': not deleting directories as there were IO errors
2019/05/03 16:23:33 ERROR : Attempt 1/3 failed with 2 errors and: upload failed: Post https://content.dropboxapi.com/2/files/upload: can't copy - source file is being updated (size changed from 288 to 380)
...
[truncated for brevity]
...
2019/05/03 16:23:36 Failed to sync: upload failed: Post https://content.dropboxapi.com/2/files/upload: can't copy - source file is being updated (size changed from 1288 to 1364)
2019-05-03 16:23:36,513:    ERROR    rclone sync failed.  (Line 473)     - ./
2019-05-03 16:23:36,513:  ***** Critical Error Abort - Must run --first-sync to recover.  See README.md *****

Is this fixable? If not, do you have any suggestions on how to handle cases like this? Since adding --first-sync does make the issue at least superficially go away (once the file hammering has finished), what would be the implications of using --first-sync every time rclonesync.py is run?
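In case it helps frame the question, this is roughly the wrapper I'd otherwise reach for (a minimal sketch, assuming rclonesync.py exits nonzero on the critical-error abort; paths are from the test case above):

#!/usr/bin/env bash
# Run the normal sync first; fall back to --first-sync only when it fails,
# rather than passing --first-sync on every run.
path1="."
path2="dropbox:rclonesync_testdir"

if ! rclonesync.py "$path1" "$path2"; then
    echo "rclonesync aborted; recovering with --first-sync" >&2
    rclonesync.py "$path1" "$path2" --first-sync
fi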

cjnaz commented 5 years ago

Interesting problem. I'll think about whether it's safe to simply skip a file that fails the rclone transfer. For the individual file copy from Path2 to Path1 I know which file failed, but for the final rclone sync from Path1 to Path2 it seems unsafe. Hmm... Also see issue #8.

Note that there is a risk of data loss with --first-sync. Newer versions of files on Path2 will be lost. Be careful.
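To illustrate with your test directory (hypothetical commands; rclone rcat writes stdin to the remote):

$ echo "newer content" | rclone rcat dropbox:rclonesync_testdir/file1.txt
$ rclonesync.py . dropbox:rclonesync_testdir --first-sync
# file1.txt is not unique to Path2, so the "copying any unique Path2 files
# to Path1" step skips it, and the subsequent Path1-to-Path2 sync replaces
# the newer remote copy with the older local one.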

I'm out of town for a bit. Perhaps next week I'll get back to this. Regards

cjnaz commented 5 years ago

Perhaps you could add such files to the filters file and ignore them entirely?
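Something along these lines, for example (an untested sketch; --filters-file takes rclone filter rules, and the patterns here are placeholders for whatever files get hammered on your system):

$ cat > ~/rclonesync_filters.flt <<'EOF'
- /file2.txt
- *.log
EOF
$ rclonesync.py . dropbox:rclonesync_testdir --filters-file ~/rclonesync_filters.flt

Files matched by a "- " rule should be invisible to both the diff checks and the final sync, so a constantly-appended file would never be considered. Note that changing the filters file may itself require a --first-sync run.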