arachnys / cabot

Self-hosted, easily-deployable monitoring and alerts service - like a lightweight PagerDuty
MIT License

Status of Cabot itself #592

Status: Closed (okev closed 6 years ago)

okev commented 6 years ago

I'd like to run a separate monitoring utility against Cabot to make sure that Cabot itself is running OK. I can poll the API endpoint, or scan the dashboard's page source for certain keywords, which gives me a general picture of the health of the web/API components.

Is there a way to expose / test the overall health of Cabot, meaning web, beat, worker, MQ and database overall?

Thanks

JeanFred commented 6 years ago

> Is there a way to expose / test the overall health of Cabot, meaning web, beat, worker, MQ and database overall?

Currently /status serves as a health-check endpoint, via the checks_run_recently view, which checks whether any check has run in the last 10 minutes.
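To make the "ran in the last 10 minutes" idea concrete, here is a minimal, framework-free sketch of that kind of recency test. It is not Cabot's actual view code: the function name `checks_ran_recently` and its parameters are hypothetical, and a real view would query stored check results rather than take a list.

```python
from datetime import datetime, timedelta, timezone


def checks_ran_recently(completion_times, now=None, window=timedelta(minutes=10)):
    """Return True if any check completed within `window` before `now`.

    `completion_times` is an iterable of timezone-aware datetimes
    (a stand-in for stored check results; hypothetical, not Cabot's model).
    """
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - window
    # One recent completion is enough to consider the workers alive.
    return any(t >= cutoff for t in completion_times)
```

An empty history counts as "not running", which matches the intent described above: no recent result means the worker side is probably stalled.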

If Cabot is completely down (database or web process), then that endpoint is inaccessible; if the web process is up but the workers or the MQ are not, then it reports that checks are not running.
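An external monitor could map those two failure modes onto distinct states. The following is a rough sketch using only the Python standard library. The marker string `"not running"` is an assumption about the response body (check what your instance actually returns), and the function names and the example URL are hypothetical.

```python
import urllib.request

# Assumed substring of the /status body when checks are stalled;
# verify against the actual response of your Cabot instance.
NOT_RUNNING_MARKER = "not running"


def fetch_status(url, timeout=5.0):
    """Return the response body as text, or None if the endpoint is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return None


def classify(body):
    """Map a /status body (or None) onto a coarse health state."""
    if body is None:
        return "down"            # web process or database unavailable
    if NOT_RUNNING_MARKER in body.lower():
        return "checks_stalled"  # web is up, workers or MQ are not
    return "ok"
```

Usage would look like `classify(fetch_status("http://cabot.example.com/status"))`, with the hostname replaced by your own. Keeping `classify` separate from the network call makes the decision logic easy to test without a running instance.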

(At Arachnys, we historically used Jenkins to monitor that endpoint; now we also have a secondary Cabot, with each instance monitoring the other.)

Would that fit your needs?

frankh commented 6 years ago

I believe the /status endpoint should work. Please re-open if you think there should be another kind of health check.