vsoch / watchme

Reproducible watchers for research
https://vsoch.github.io/watchme/
Mozilla Public License 2.0
851 stars 32 forks

Reducing risk of data loss when using `watchme` on a server #63

Closed abitrolly closed 4 years ago

abitrolly commented 4 years ago

What is your question?

I want to grab the coronapocalypse stats from my country's site every day. Because my notebook can be offline on certain days, I am looking for hosting such as Heroku. The problem is that Heroku instances are restarted from time to time, and I am afraid of losing the data.

It may be worth mentioning in README.md that watchme runs in a user session and won't survive a reboot: cron jobs will start again only when the user is logged in. For a reliable server-side service, more steps are needed.

I am not ready to provide a solution, but I can list common problems.

  1. After a reboot, the scheduler (which is cron) won't start.
  2. The server was shut down and the scheduler missed the time period, without warning about it.
  3. The server rebooted after the task was started, but before it completed. No data is recorded and no notification is sent.
  4. The task executed, but did not record any data. No notification is sent.
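To make problems 2 to 4 concrete, here is a minimal Python sketch of two generic safeguards: writing each result atomically, so a reboot mid-task cannot leave a half-written file, and checking whether the last recorded result is missing, empty, or too old. This is not part of watchme; `record_atomically`, `is_stale`, and the file layout are hypothetical names chosen for illustration.

```python
import json
import os
import tempfile
import time


def record_atomically(path, payload):
    """Write payload as JSON to a temp file, then rename it into place.

    Hypothetical helper, not a watchme function.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(payload, handle)
            handle.flush()
            os.fsync(handle.fileno())
        # The rename replaces the target in one step when both paths
        # are on the same filesystem, so readers never see a partial file.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def is_stale(path, max_age_seconds, now=None):
    """Return True if the file is missing, empty, or older than max_age_seconds."""
    now = time.time() if now is None else now
    try:
        info = os.stat(path)
    except FileNotFoundError:
        return True
    if info.st_size == 0:
        return True
    return (now - info.st_mtime) > max_age_seconds
```

A separate scheduled check could call `is_stale` on the latest result and send a notification when it returns True, which would cover both a missed run and a run that recorded nothing.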

It would be helpful to know how watchme can help reduce these risks.

If the documentation doesn't answer your question, where do you think it would be appropriate to add (i.e., where did you go looking for it?)

A short notice in README.md about watchme's limitations would be enough.

vsoch commented 4 years ago

The use of cron is outside the scope of watchme, but if it would be helpful to you, I'd be happy to add a few notes to the README. Do you have other ideas for what could trigger watchme, aside from cron, that you'd like to look into?

vsoch commented 4 years ago

I added a "limitations" section to the README so this is known off the bat. If you think of another way to schedule tasks aside from cron, definitely let's chat about whether we can support it.

abitrolly commented 4 years ago

cron is not as bad as I thought. On modern Linux it is always available and works even when the user is logged out. I confused its behaviour with systemd timers. Timers won't work outside a user session without special configuration. I edited my post to cross out cron reboot recovery from the common problems.

https://unix.stackexchange.com/questions/292913/run-users-systemd-timer-while-they-dont-have-any-open-session

I've heard that anacron can recover missed jobs. Airflow can too, but it is too complicated.
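The catch-up idea behind anacron can be sketched in a few lines of Python: keep a timestamp file with the date of the last successful run, and on every invocation (for example at boot or hourly) run the job only if that date is at least one period old. This is a simplified illustration, not anacron's actual implementation and not a watchme feature; `run_if_due` and the stamp file are hypothetical.

```python
import datetime
import os


def run_if_due(stamp_path, job, period_days=1, today=None):
    """Run job() if the last recorded run is at least period_days old.

    Returns True when the job ran, False when it was skipped.
    Hypothetical helper for illustration only.
    """
    today = today or datetime.date.today()
    last_run = None
    if os.path.exists(stamp_path):
        with open(stamp_path) as handle:
            text = handle.read().strip()
        if text:
            last_run = datetime.date.fromisoformat(text)
    if last_run is not None and (today - last_run).days < period_days:
        return False
    # If job() raises, the stamp is not updated, so the next call retries.
    job()
    with open(stamp_path, "w") as handle:
        handle.write(today.isoformat())
    return True
```

Because the stamp is written only after the job returns, a reboot in the middle of a run leaves the old date in place and the next invocation tries again.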