Jwink3101 / syncrclone

Python-based bi-directional sync tool for rclone
MIT License
150 stars · 13 forks

scheduling syncrclone #13

Closed janvanveldhuizen closed 2 years ago

janvanveldhuizen commented 2 years ago

Hi Justin,

I want to have a continuous sync by adding a cron job to run the sync at a regular interval, say every 30 minutes or so. What will happen if a previous sync is still running? Will it wait, or just skip?

Two other questions:

1. Do I need to clean up the backup folder myself on a regular basis?
2. Are the logs rotating? Or do I need to do a cleanup now and then?

Thanks Jan

Jwink3101 commented 2 years ago

Hi,

Main Question

syncrclone has the built-in ability to do its own locking, but it is off by default (see the config). When this is set to True, it will write a lock file on both remotes. This means that if you run syncrclone again before the last run has finished, it will raise a LockedRemoteError which, in turn, will cause a non-zero exit.

This has the advantage of stopping any other machine using syncrclone from working at the same time on the same repo. But note that this is just a feature of syncrclone and won't actually stop other tools from writing to it.

As for whether it will retry: no, it will just fail. You could write your own logic to retry.
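If you do want retries, a small wrapper can check the exit code and try again. A minimal sketch, assuming the retry count and delay below are your choice (the function name and values are mine, not part of syncrclone):

```shell
#!/bin/sh
# Hypothetical retry helper: run a command, retrying on a non-zero
# exit (e.g. a LockedRemoteError) up to MAX_TRIES times, waiting
# DELAY seconds between attempts.
MAX_TRIES=3
DELAY=60

retry_sync() {
    tries=0
    until "$@"; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$MAX_TRIES" ]; then
            echo "giving up after $tries failed attempts" >&2
            return 1
        fi
        sleep "$DELAY"
    done
    return 0
}

# Example use:
# retry_sync syncrclone /path/to/config.py
```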

May I make another suggestion? Rather than using cron to run every thirty minutes, why not write a small daemon-like script that runs it and then waits 30 minutes?

while true; do
    syncrclone /path/to/config.py > /dev/null 2>&1  # use the built-in logging
    sleep 1800  # 30 min
done

Then call that with

$ nohup myscript.sh &

This will wait 30 minutes between runs, with each interval starting when the previous run finishes.
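If you would rather stick with cron, another common option is to wrap the job in flock so an overlapping run is skipped rather than queued. This is a general Linux technique (it needs the util-linux flock command), not a syncrclone feature; `flock -n` exits non-zero immediately if the lock is already held:

```shell
# Example crontab entry: run every 30 minutes, skipping a run entirely
# if the previous one still holds the lock file.
*/30 * * * * flock -n /tmp/syncrclone.lock syncrclone /path/to/config.py
```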

Additional Questions

Do I need to clean up the backup folder myself on a regular basis?

I am not sure what you mean by need. You can keep the backups as long as you want, but there is no built-in pruning mechanism. I've made some one-off tools to do that, but they are far from production-ready. I may consider putting them in the repo as another tool you can use, but I don't plan to build it into syncrclone.
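If you want to roll your own pruning, a find-based sketch like the following would work for a local (or mounted) backup directory. The function name, path, and 90-day cutoff are all illustrative assumptions, not anything syncrclone provides:

```shell
#!/bin/sh
# Hypothetical pruning helper (not a built-in syncrclone feature):
# delete files under a backup directory older than a given age in days.
prune_backups() {
    # $1 = backup directory, $2 = maximum age in days
    find "$1" -type f -mtime +"$2" -delete
}

# Example (adjust the path to your config's actual backup location):
# prune_backups /path/to/backups 90
```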

Are the logs rotating? Or do I need to do a cleanup now and then?

Rotating logs are more for an ever-running process; syncrclone makes a new log for every run. So the answer is the same as above: you don't need to do anything, but if you run it every 30 minutes, you will have 48 of them per day. You can clean them up as you see fit.

Personally, I either delete them or, once a month (or so), I will run the following (either directly if it's local or on a mount if it's not):

$ zip -rm9 2021-10.zip name_2021-10*.log

but again, you don't need to do anything.

What you didn't ask: Modifications while sync is happening

You didn't ask this but it could come up if you're syncing in the background.

What happens if you modify, add, or delete files while a sync is happening?

The general answer is that it depends on when and what. If it happens while files are being listed, the change may get missed on either or both sides. If it happens while rclone is transferring, it could cause the transfer to break. If using avoid_relist, it can cause the stored file info to not match.

With that said, I think that the majority of the time, it will be recovered on the next sync. I chose a sync algorithm that is more robust to what is happening currently than what it looked like before. That makes it more likely to handle this kind of thing.

But I suggest trying to make sure you're not actively working while it syncs. This is not like the Dropbox or OneDrive clients that run all of the time. I tend to sync when I start on a machine, again before I walk away, and occasionally throughout the day as another backup.


I hope this helps.

janvanveldhuizen commented 2 years ago

Thanks a lot! I have a big (2 TB) external disk connected to a Raspberry Pi 4. Using Samba, I use this as a shared drive on my home network, and it works perfectly. I had the idea to sync this with my Dropbox account, but then I discovered that Dropbox is not available for ARM-based computers. That's why I am looking into an alternative.

While I am typing this, I am thinking... why not move this disk to a system where I can run Dropbox...

Jwink3101 commented 2 years ago

You have to do what works. I use my own tool over the native ones because my home computer is criminally underpowered and the OneDrive client doesn't work well. And for Dropbox, I don't want the client running for one small directory.

janvanveldhuizen commented 2 years ago

In my case, it is not a small directory. I have tens of thousands of files, totaling up to 1.2 TB. Anyway, thanks for replying. It's food for thought.

Jwink3101 commented 2 years ago

For what it's worth, I use it for 200,000 files but only 300 GB. The difficulty is the number of files, not how big they are, at least from a code-efficiency point of view.