Closed Ar2000jp closed 9 years ago
I just remembered that there's a cron.d folder. I totally forgot about that. I'll modify the cron_datasync role to use that soon. Sorry for the inconvenience.
I haven't tried running this yet, but I have reviewed the code and this looks like very nice work! There are a couple of changes I think you will need, though:
- syncscript.sh should check for a connection, and exit silently if it doesn't find one. You can check for a connection by running scripts/has_internet, for example.
- The datasync_rsync__data_dir value in defaults/main.yml is just an example, but wouldn't this cause the cron job to try to rsync from a non-existent machine every time it runs (unless the default is overridden)? If so, it might be better to have these examples as comments rather than actual vars.
Are you working on setting up an rsync server in Jordan to host the wikipedia, kalite, etc. content? If not, let me know and I'll put that on my TODO list.
Thanks for taking the initiative to write this!
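For illustration, the connection check suggested for syncscript.sh could look roughly like the sketch below. It assumes that scripts/has_internet exits with status 0 when a connection is available, which is not confirmed anywhere in this thread; the HAS_INTERNET variable and the run_if_online helper are hypothetical names introduced only for this example.

```shell
#!/bin/sh
# Sketch only: skip the sync quietly when there is no connection.
# ASSUMPTION: the check command exits 0 when online and non-zero otherwise.
# HAS_INTERNET and run_if_online are hypothetical names, not part of the role.
HAS_INTERNET="${HAS_INTERNET:-scripts/has_internet}"

run_if_online() {
    # "$@" is the command to run when the check succeeds.
    if "$HAS_INTERNET" >/dev/null 2>&1; then
        "$@"
    else
        # No connection: print nothing and report success,
        # so cron has no output to mail.
        return 0
    fi
}
```

A script run from cron could then wrap its sync step as `run_if_online some_sync_command`, so an offline machine produces no output and no error status.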
Good point about the default variables. I just wanted to give an example, since it's a bit complicated. I'll fix it right away.
Rsync can handle connection problems fine by itself. I tested it with the rsync server being down, and with the connection being down. As for the chmod and chown statements, I think they should be executed every time, since rsync might have been cut off halfway, and of course we don't want the wrong permissions for our data dirs.
I forgot to mention this in the original comment. I tuned rsync's parameters to make the sync as atomic as possible. I also used the rsync protocol to lower the overhead and to avoid the complexities (and fragility) of distributing a host key and a read-only private key. Also, I used chown and chmod instead of the internal rsync arguments because those can be tricky to handle, and they're only available in more recent versions. E.g. they're not available on my test machine, which is Ubuntu 12.04.
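As a rough sketch of the approach described above (an rsync-protocol pull followed by an unconditional chown and chmod), something like the following could work. The flags shown are standard rsync options chosen for illustration and are not necessarily the ones used in this pull request; the server URL, destination path, owner, and the RSYNC, SRC, DEST, and OWNER variable names are all hypothetical.

```shell
#!/bin/sh
# Sketch only. Every value below is a placeholder, not taken from the role.
RSYNC="${RSYNC:-rsync}"
SRC="${SRC:-rsync://rsync.example.org/content/}"   # hypothetical rsync daemon module
DEST="${DEST:-/srv/content/}"                      # hypothetical data dir
OWNER="${OWNER:-www-data:www-data}"                # hypothetical owner:group

sync_content() {
    # -r recurse, -t keep modification times.
    # --delay-updates stages changed files and renames them into place
    #   at the end of the transfer.
    # --delete-after removes stale files only after the transfer.
    # --timeout aborts when no data moves for the given number of seconds.
    "$RSYNC" -rt --delay-updates --delete-after --timeout=60 "$SRC" "$DEST"
}

fix_permissions() {
    # Meant to run after every sync attempt, including interrupted ones,
    # so the data dir never keeps wrong ownership or modes.
    chown -R "$OWNER" "$DEST"
    chmod -R u=rwX,go=rX "$DEST"
}

# Intended call order in a cron-driven script (not executed here):
#   sync_content; fix_permissions
```

Separating the two steps with `;` rather than `&&` reflects the point made above: the permission fix runs even when rsync exits with an error.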
Ok, that all sounds good. What about setting up the rsync server? Does QRF have one we can use? If so, get me access to it and I can start putting the wikipedia, kalite, etc. content there, unless you've already got that data.
Also, I suggest adding allow_duplicates: yes to datasync_rsync/meta/main.yml. This way, when other roles need to install an rsync script, they can call datasync_rsync as a dependency (without this setting, only the first call will actually do anything).
Although my testing shows that it's working without allow_duplicates, I went ahead and added it anyway. Maybe allow_duplicates is for when the input parameters are the same?
About the server, I'll talk to Ali about it.
Implementation for issue #35 using rsync and cron.