Closed — hartwork closed this issue 6 years ago
Are the update_service jobs getting queued? I haven't observed this behavior, so I guess it might be something to do with jobs getting stuck or having a big backlog. I would need more information to diagnose.
A restart of all Docker containers fixed the symptom; I'm not sure where it got stuck. The observed behavior showed me two things: manually running checks does not update the related instances and services (I think that's a bug?), and there is no way to "see" that checks are not running without explicit suspicion or a second Cabot instance for mutual monitoring. For noticing that checks are not running, a display saying "most recent check run m:ss minutes ago" somewhere could help. The manual-trigger issue seems more important (and easier to fix) to me, though. What do you think?
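The "most recent check run m:ss minutes ago" idea could be sketched as a small formatter. This is a minimal illustration, not Cabot code; the function name and label text are hypothetical:

```python
from datetime import datetime, timezone

def staleness_label(last_run: datetime, now: datetime) -> str:
    """Format how long ago the most recent check ran, e.g. '3:07 ago'.

    `last_run` and `now` are hypothetical inputs; in Cabot one would
    presumably read the newest check result timestamp from the database.
    """
    total_seconds = max(int((now - last_run).total_seconds()), 0)
    minutes, seconds = divmod(total_seconds, 60)
    return f"most recent check run {minutes}:{seconds:02d} ago"
```

A dashboard could render this label and highlight it when the gap exceeds the expected check frequency, which would make a stuck job queue visible without a second monitoring instance.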
Services update asynchronously, so that would explain the behaviour you saw.
I see. I think I now see why you might have wanted it to be asynchronous: that update action doesn't seem to scale well, so I imagine it could take significantly longer in medium-size setups. Is that the reasoning?
Well, it can potentially update a lot of different services/instances, and there's no real benefit to it being synchronous (except potentially in the scenario where you trigger it manually).
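The trade-off being discussed can be sketched with an in-memory stand-in for a Celery-like job queue. All names here are hypothetical and this is not Cabot's actual code; it only illustrates "enqueue on scheduled runs, update immediately on manual runs":

```python
from collections import deque

# Hypothetical in-memory stand-in for a Celery-like job queue.
job_queue: deque = deque()

def update_service(service: str) -> str:
    # Stub: recompute the service's overall status from its checks.
    return f"{service}: updated"

def run_check(service: str, manual: bool = False) -> None:
    # ... run the check itself, then propagate the result ...
    if manual:
        # Synchronous path: the UI reflects the new status immediately.
        update_service(service)
    else:
        # Asynchronous path: enqueue and let a worker pick it up later.
        job_queue.append(service)

def drain_queue() -> list:
    # Worker loop: process any queued update_service jobs.
    results = []
    while job_queue:
        results.append(update_service(job_queue.popleft()))
    return results
```

If the worker draining the queue stalls, scheduled checks keep running but service statuses never update, which matches the "restart fixed it" symptom described above.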
Alright, thanks.
Hi!
Now that #610 is no longer keeping my HTTP checks red and all my checks have been green for minutes, instances and services are still reported as failing, and no matter whether I acknowledge, pause, re-run, disable or re-enable: they don't go back to "Passing". I did find #9, but that was fixed. How do services and instances normally return to "Passing" in Cabot? Thanks, Sebastian