Open stanhu opened 3 years ago
Could this be accomplished if we added something that responded to `SIGINFO` or something like that?
I think `SIGINFO` is a BSD construct, so this would only be supported on macOS or FreeBSD. Using signals in general isn't the most cross-platform-friendly way of doing monitoring.
It doesn't need to be `SIGINFO` specifically. Linux supports signals generally. And I think there are a lot of other issues with mailroom that would prevent running on Windows anyway.
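For reference, a signal-based status report along the lines discussed above might look roughly like this. It is only a sketch: `on_status_signal` and `DEFAULT_STATUS_SIGNAL` are made-up names, not part of mail_room, and the fallback to `USR1` on systems without `SIGINFO` is an assumption.

```ruby
# Sketch only: these names are illustrative and not mail_room's API.
# SIGINFO exists on BSD-derived systems (macOS, FreeBSD); fall back to
# SIGUSR1 elsewhere (an assumption, any trappable signal would do).
DEFAULT_STATUS_SIGNAL = Signal.list.key?("INFO") ? "INFO" : "USR1"

# Runs the given block in a background thread each time `signal` arrives.
def on_status_signal(signal = DEFAULT_STATUS_SIGNAL, &report)
  requests = Queue.new

  # Keep the trap handler tiny: it only enqueues a marker.
  Signal.trap(signal) { requests << :report }

  # The real work happens outside the trap context.
  Thread.new do
    loop do
      requests.pop
      report.call
    end
  end
end

# Example usage (hypothetical status line):
#   started_at = Time.now
#   on_status_signal { warn "alive, uptime #{(Time.now - started_at).round}s" }
```

Sending the signal with `kill -USR1 <pid>` (or Ctrl-T on a BSD terminal for `SIGINFO`) would then trigger the report, but something outside the process still has to send it and read the output.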
In general, I'm apprehensive about adding webrick and a web service into the mix.
I'd rather have another repo/project that could provide a web interface of some sort, that was able to query mailroom through some other means 🤔
Would it be preferable instead to have mail_room report out, like a heartbeat? I feel like that would require less code and generally be safer.
In this case, I think that is more complex because now you need to have a separate process that determines whether the process is alive. HTTP liveness probes are a common practice in Kubernetes: https://www.magalix.com/blog/kubernetes-and-containers-best-practices-health-probes
As for push vs pull for metrics, Prometheus has written extensively about why they prefer a pull model for monitoring, particularly for detecting a downed service:
Okay. Re-reviewing this.
If we're going to do it, I'd like to change a few things.
`nil`, but a `NoopHealthCheck`, been trying to keep from using nil configuration and the `&.` pattern
I can leave more specific comments in the code, if that is helpful. Sorry I didn't get back to this for months and months (new baby).
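To illustrate the `NoopHealthCheck` suggestion, here is a rough sketch of the null-object idea. Only the name `NoopHealthCheck` comes from the comment above; `RecordingHealthCheck`, `Runner`, and the `run`/`quit` method names are hypothetical stand-ins, not the actual mail_room classes.

```ruby
# Does nothing, but answers the same messages as a real health check.
class NoopHealthCheck
  def run; end
  def quit; end
end

# Hypothetical stand-in for a real health check; it just records calls.
class RecordingHealthCheck
  attr_reader :events

  def initialize
    @events = []
  end

  def run
    @events << :run
  end

  def quit
    @events << :quit
  end
end

# Hypothetical caller: because the default is a no-op object rather than
# nil, it never needs `@health_check&.run` or an explicit nil check.
class Runner
  def initialize(health_check: NoopHealthCheck.new)
    @health_check = health_check
  end

  def start
    @health_check.run
  end

  def stop
    @health_check.quit
  end
end
```

With this shape, an unconfigured health check and a configured one go through the same code path in the caller.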
@tpitale Congrats on your new arrival! I've updated this pull request; let me know what you think.
When MailRoom is run in Kubernetes, we have found occasions where MailRoom appears to have attempted to stop running, but `Net::IMAP` is stuck waiting for threads (https://github.com/ruby/net-imap/issues/14).
This commit adds an HTTP liveness checker to enable detection of a terminated MailRoom pod.
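As a rough illustration of what an HTTP liveness endpoint involves, here is a minimal sketch built on Ruby's `socket` library. It is not the code in this pull request: the `LivenessServer` class, the `/liveness` path, and the default address are all assumptions made for the example.

```ruby
require "socket"

# Sketch only: class name, path, and defaults are assumptions.
class LivenessServer
  attr_reader :port

  # port 0 asks the OS for any free port; read it back via #port.
  def initialize(address: "127.0.0.1", port: 0)
    @server = TCPServer.new(address, port)
    @port = @server.addr[1]
  end

  # Serve requests on a background thread so the main work is not blocked.
  def run
    @thread = Thread.new do
      loop { handle(@server.accept) }
    rescue IOError, Errno::EBADF
      # The listening socket was closed by #quit; exit the thread quietly.
    end
    self
  end

  def quit
    @server.close
    @thread.join if @thread
  end

  private

  def handle(client)
    request_line = client.gets.to_s
    # Skip the request headers up to the blank line.
    while (line = client.gets) && line != "\r\n"; end

    path = request_line.split[1]
    status, body = path == "/liveness" ? ["200 OK", "OK\n"] : ["404 Not Found", "Not Found\n"]
    client.write("HTTP/1.1 #{status}\r\n" \
                 "Content-Type: text/plain\r\n" \
                 "Content-Length: #{body.bytesize}\r\n" \
                 "Connection: close\r\n\r\n#{body}")
  rescue SystemCallError
    # The client went away mid-request; nothing useful to do.
  ensure
    client.close
  end
end
```

A Kubernetes `httpGet` liveness probe pointed at that port and path would then restart the pod once the process stops answering.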