hubotio / hubot

A customizable life embetterment robot.
https://hubotio.github.io/hubot/
MIT License

Hubot Health/Monitoring #1672

Closed johnseekins-pathccm closed 1 year ago

johnseekins-pathccm commented 1 year ago

From what I can see, a default install of Hubot doesn't have any consistent way to introspect the system (à la Prometheus metrics, statsd, etc.). Is there any standard pattern for monitoring Hubot in containerized scenarios?

joeyguerra commented 1 year ago

Not that I know of. What are your thoughts?

joeyguerra commented 1 year ago

I just remembered. I replaced the logger with Pino, which outputs logs to stdout in JSON for this situation. Hope that helps.

johnseekins-pathccm commented 1 year ago

The logs are nice, yes. But what I'm interested in is a way to introspect the service and know whether or not it's running. A common pattern is to run curl -Ss localhost:80/healthz and get a 200 response.

The reason I'm looking for something like this is that in containerized environments, it's essentially impossible to just run (e.g.) ps aux | grep hubot to tell if a service is running.

joeyguerra commented 1 year ago

Now that you mention it, I always have a script that responds to /healthz.

There's also the hubot-diagnostics module, which responds to messages. These are the only patterns I've seen used thus far.

I can add some additional diagnostics to hubot-diagnostics. Do you have any suggestions?

johnseekins-pathccm commented 1 year ago

Tell me about your "script that responds to /healthz". When I was initially running Hubot, it didn't seem to have an open port at all. I'd be happy to use an external script for this if that's the right pattern!

joeyguerra commented 1 year ago

Hubot starts an Express web service on port 8080 (http://localhost:8080) by default. You can disable the web service by setting the HUBOT_HTTPD environment variable to false.

The port can be defined with the PORT environment variable.

I create a file in the scripts directory called HealthZ.mjs with the following contents:

// Description:
// Listens to /healthz for k8s deployment.
//
// Dependencies:
//
// Configuration:
//
// Notes:
//
// Author:
//   Joey Guerra

export default robot => {
    robot.router.get('/healthz', (req, resp) => {
        resp.writeHead(200, {'Content-Type': 'text/plain'})
        resp.end(`Hello? Is it me you're looking for?`)
    })
    robot.respond(/helo/i, async res => {
        res.reply('Hi!')
    })
}

Hubot's router variable is the Express application instance, which can be used to set route handlers in scripts via robot.router.
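To go with the /healthz route above, a container health check could probe it from inside the container instead of relying on curl. The following is only a sketch under a few assumptions: Node 18 or newer (for the global fetch), Hubot listening on its default port 8080 unless PORT is set, and the /healthz route from the script above being loaded. The name checkHealth and its parameters are hypothetical and not part of Hubot.

```javascript
// Hypothetical helper, not part of Hubot: probes the /healthz route and
// resolves to true only when the endpoint answers with HTTP 200.
// fetchImpl is injectable so the function can be exercised without a server.
async function checkHealth(
  url = `http://localhost:${process.env.PORT || 8080}/healthz`,
  fetchImpl = fetch,
  timeoutMs = 2000
) {
  try {
    // Abort the request if Hubot does not answer within timeoutMs.
    const resp = await fetchImpl(url, { signal: AbortSignal.timeout(timeoutMs) })
    return resp.status === 200
  } catch (err) {
    // Connection refused, timeout, DNS failure, etc. all count as unhealthy.
    return false
  }
}

// Possible use as a health check command: exit 0 when healthy, 1 otherwise.
// checkHealth().then(ok => process.exit(ok ? 0 : 1))
```

Like the curl command, this only shows that the process is up and answering HTTP requests.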

As an example of what can be done, I created a static site generator as a script and built my website with Hubot.

hubot-diagnostics is another technique that I've seen used to introspect the Hubot app.

What do you think?

johnseekins-pathccm commented 1 year ago

That's excellent. I think I'll steal that little script if that's okay.

Are there any docs that indicate Hubot's default server/port config? I didn't see any.

joeyguerra commented 1 year ago

There's documentation for the HTTP Listener. The public website for the Hubot docs hasn't been updated to the latest version. We would need someone at GitHub to update it :(.

johnseekins-pathccm commented 1 year ago

Ah! That would explain it. Thanks for all the help, @joeyguerra.

technicalpickles commented 1 year ago

The only problem with using an HTTP endpoint to monitor is that it only tells you whether the process is up and responding to HTTP requests, not how well (or not) it is handling chat messages. That is the real thing you care about being up.

One approach I've used/seen/thought of is to create a room/channel/whatever where another automated service posts a 'heartbeat' chat message. You can then add a listener to check for it and record the last time it was received. With that in place, you can make an HTTP endpoint that includes the last heartbeat.
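A rough sketch of that heartbeat idea, in the same style as the HealthZ script earlier in the thread. It assumes some other service posts the literal message "heartbeat" into a room Hubot can hear. The names heartbeatScript, HEARTBEAT_PATTERN, MAX_AGE_MS and the /heartbeat route are all made up for illustration, and the 5 minute threshold is an arbitrary choice.

```javascript
// Hypothetical values: adjust the pattern and threshold to your own setup.
const HEARTBEAT_PATTERN = /^heartbeat$/i
const MAX_AGE_MS = 5 * 60 * 1000

// `now` is injectable so the timing logic can be tested without waiting.
function heartbeatScript(robot, now = () => Date.now()) {
  let lastHeartbeat = null // timestamp (ms) of the last heartbeat message seen

  // Record the time whenever the heartbeat message arrives through chat.
  robot.hear(HEARTBEAT_PATTERN, () => {
    lastHeartbeat = now()
  })

  // Report the last heartbeat over HTTP: 200 while it is recent enough,
  // 503 if none has been seen yet or the last one is too old.
  robot.router.get('/heartbeat', (req, resp) => {
    const ageMs = lastHeartbeat === null ? null : now() - lastHeartbeat
    const healthy = ageMs !== null && ageMs <= MAX_AGE_MS
    resp.writeHead(healthy ? 200 : 503, {'Content-Type': 'application/json'})
    resp.end(JSON.stringify({ lastHeartbeat, ageMs, healthy }))
  })
}

// In a script file (e.g. scripts/Heartbeat.mjs) this would be exported as:
// export default robot => heartbeatScript(robot)
```

Returning 503 for a stale heartbeat lets the same probe that checks /healthz also catch a bot that is running but no longer receiving chat messages.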

johnseekins-pathccm commented 1 year ago

@technicalpickles You're absolutely right that an HTTP monitor isn't perfect. It is useful as a general "is this service even responding?" test. A full validation of Hubot being functional it is definitely not. But it does give me some certainty that Hubot is at least running.