Without re-raising the error in the main loop, it is possible for a watcher to
stop without the process exiting, for example:
E, [2014-01-17T20:30:37.340407 #26185] ERROR -- Nerve::ServiceWatcher: nerve: error in service watcher xweb: response for meth: :exists, args: [0, "/nerve/services/xweb", nil, nil], not received within 30 seconds
I, [2014-01-17T20:30:37.340516 #26185] INFO -- Nerve::ServiceWatcher: nerve: ending service watch xweb
<...and then 48 hours of radio silence until I happen to notice and restart nerve.rb...>
I, [2014-01-19T21:15:09.840493 #17382] INFO -- Nerve::Nerve: nerve: starting up!
I, [2014-01-19T21:15:09.840664 #17382] INFO -- Nerve::Nerve: nerve: starting run
Ideally we'd just restart the watcher itself, but this change at least ensures that nerve
never hangs around with a stopped watcher. As long as you run nerve under a reasonable
process supervisor (upstart, daemontools, systemd), it will be restarted and try
again a few moments later.
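
To illustrate the idea, here is a minimal sketch of a main loop that re-raises when a watcher thread stops. It is not nerve's actual code: `run_watchers`, `WatcherDied`, the `poll_interval` parameter, and the hash of watcher callables are all hypothetical names chosen for the example.

```ruby
# Hypothetical sketch; names are illustrative and not taken from nerve itself.
class WatcherDied < StandardError; end

# Runs each watcher callable in its own thread and polls the threads from a
# main loop. If any watcher has stopped, for whatever reason, an error is
# raised out of the loop instead of leaving the process running without it.
def run_watchers(watchers, poll_interval: 0.05)
  threads = watchers.map do |name, job|
    thread = Thread.new do
      # The main loop reports failures itself, so silence the default report.
      Thread.current.report_on_exception = false
      job.call
    end
    [name, thread]
  end

  loop do
    threads.each do |name, thread|
      next if thread.alive?

      begin
        thread.join # re-raises the exception that ended the thread, if any
      rescue StandardError => e
        raise WatcherDied, "service watcher #{name} failed: #{e.message}"
      end
      # The thread returned without an error; a stopped watcher is still fatal.
      raise WatcherDied, "service watcher #{name} stopped unexpectedly"
    end
    sleep poll_interval
  end
ensure
  # Stop any watchers that are still running before the error propagates.
  threads&.each { |_, thread| thread.kill }
end
```

If the caller lets `WatcherDied` propagate out of the entry point, Ruby exits with a non-zero status, which is what a process supervisor needs in order to restart the service.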