nickrsan / icefish_backend

Need alert when realtime data collection fails #8

Open nickrsan opened 6 years ago

nickrsan commented 6 years ago

Original report by Nicholas Santos (Bitbucket: nickrsan, GitHub: nickrsan).


It'd be good to have a status monitor script of some sort that periodically checks that a new record has been written to the database. It should be separate from the script that actually checks for CTD data, so that if that one fails, we still get an alert.

This script should also be run by the MOOTasks user. If it detects that more than X time (3x sampling interval??) has passed since a record was last written to the database, it:

  1. Checks whether the failure flag is already set. If it isn't set, or if it's been more than a day since it was set, sends an email alerting that data collection isn't occurring.
  2. Sets a failure flag - use a datetime object as the flag to make the check in step 1 easy
  3. Restarts the CTD monitoring service
  4. Sleeps and checks again later
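
A rough sketch of the decision logic for one pass of the monitor, to make the steps above concrete. Everything named here (`check_status`, `Decision`, `realert_after`) is hypothetical, and the database query, email sending, and service restart are deliberately left out. It assumes the 3x sampling interval guess for X, that the flag is reset each time an email goes out (so alerts repeat about daily), and that the flag is cleared once records show up again; none of that is decided yet.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Decision:
    send_email: bool
    restart_service: bool
    failure_flag: Optional[datetime]  # None means "not currently failing"


def check_status(now: datetime,
                 last_record: Optional[datetime],
                 failure_flag: Optional[datetime],
                 sampling_interval: timedelta,
                 realert_after: timedelta = timedelta(days=1)) -> Decision:
    """Decide what one pass of the monitor should do (hypothetical helper)."""
    threshold = 3 * sampling_interval  # the "X" above; 3x is only a guess

    if last_record is not None and now - last_record <= threshold:
        # Records are arriving: nothing to do. Clearing the flag is an assumption.
        return Decision(send_email=False, restart_service=False, failure_flag=None)

    # Step 1: email if no flag is set yet, or the flag is more than a day old.
    send_email = failure_flag is None or now - failure_flag > realert_after

    # Step 2: the flag is a datetime. Resetting it whenever an email is sent
    # (an assumption) keeps re-alerts to roughly one per day.
    new_flag = now if send_email else failure_flag

    # Step 3: request a restart. Step 4 (sleep, check again) belongs to the caller.
    return Decision(send_email=send_email, restart_service=True, failure_flag=new_flag)
```

The caller would persist `failure_flag` between passes (file, database row, whatever is convenient) and act on the two booleans.
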
nickrsan commented 6 years ago

Original comment by Nicholas Santos (Bitbucket: nickrsan, GitHub: nickrsan).


Just a note that it should only restart the service once, maybe twice, or hourly while failing, or something like that. We don't want it restarting the service over and over if the restarts aren't fixing the problem.
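
One possible way to express that limit, as a sketch only. `should_restart`, `min_gap`, and `max_attempts` are made-up names; the defaults encode the "hourly while failing" option, and passing `max_attempts=2` would give the "once, maybe twice" option instead. It assumes the monitor keeps a list of the times it has already tried a restart during the current failure.

```python
from datetime import datetime, timedelta
from typing import List, Optional


def should_restart(now: datetime,
                   restart_attempts: List[datetime],
                   min_gap: timedelta = timedelta(hours=1),
                   max_attempts: Optional[int] = None) -> bool:
    """Return True if another restart is allowed (hypothetical helper)."""
    # Optional hard cap on restarts for the current failure.
    if max_attempts is not None and len(restart_attempts) >= max_attempts:
        return False
    # The first restart of a failure is always allowed.
    if not restart_attempts:
        return True
    # Otherwise wait at least min_gap since the most recent attempt.
    return now - max(restart_attempts) >= min_gap
```

The attempt list would be emptied once data collection recovers, so the next failure starts with a fresh budget.
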