Open jamesottaway opened 7 years ago
@jamesottaway oh hai! Just to confirm, you'd like to see how long an agent has been connected for on the Agents List page itself?
I was thinking the host's uptime, but knowing how long the agent has been connected would serve a similar purpose.
Ohhh...uptime, right right. I was thinking agent uptime instead of the actual server's uptime! It's an interesting idea! It probably starts creeping into territory that the agent should stay out of (server monitoring).
The problem I'm trying to solve is to easily identify agents in our elastic cluster which I would've expected to shut down due to lack of jobs overnight, but are still running the next morning.
Perhaps there's something in CloudWatch we could offer instead? /cc @lox @toolmantim
I'd also be interested to see how many seconds an agent has been active vs. idle. This would allow me to make finer-grained capacity decisions.
From a technical perspective, querying each running agent when loading the list is a little odd, but maybe with some caching this could still be feasible.
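For what it's worth, the active vs. idle split could be computed from data the server side likely already has, without querying each agent. A rough sketch, assuming we know when the agent connected and the start/end times of the jobs it ran; `active_idle_seconds` and the sample timestamps are made up for illustration and are not part of any existing API. It also assumes an agent runs one job at a time, so job intervals never overlap.

```python
from datetime import datetime, timezone


def active_idle_seconds(connected_at, now, job_intervals):
    """Split an agent's connected time into (active, idle) seconds.

    job_intervals is a list of (start, end) datetimes for jobs the agent ran.
    Assumes intervals do not overlap (one job at a time per agent).
    """
    total = (now - connected_at).total_seconds()
    active = 0.0
    for start, end in job_intervals:
        # Clamp each job to the window the agent has been connected for.
        start = max(start, connected_at)
        end = min(end, now)
        if end > start:
            active += (end - start).total_seconds()
    return active, total - active


# Hypothetical sample data: connected for 10 hours, two jobs totalling 2.5 hours.
utc = timezone.utc
connected_at = datetime(2024, 1, 2, 0, 0, tzinfo=utc)
now = datetime(2024, 1, 2, 10, 0, tzinfo=utc)
jobs = [
    (datetime(2024, 1, 2, 1, 0, tzinfo=utc), datetime(2024, 1, 2, 2, 30, tzinfo=utc)),
    (datetime(2024, 1, 2, 5, 0, tzinfo=utc), datetime(2024, 1, 2, 6, 0, tzinfo=utc)),
]
active, idle = active_idle_seconds(connected_at, now, jobs)
print(f"active={active:.0f}s idle={idle:.0f}s")
```

Since this only needs timestamps that could be cached per agent, it would sidestep the per-agent query concern.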
The problem I'm trying to solve is to easily identify agents in our elastic cluster which I would've expected to shut down due to lack of jobs overnight, but are still running the next morning.
Most of the time it's due to a build step hanging, but not always, hence the idea of narrowing our investigation down to the oldest agents.
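To make the "oldest agents" idea concrete, here is a minimal sketch of the filtering I have in mind. It assumes each agent record carries a connection timestamp; the `name` and `connected_at` fields, the `oldest_agents` helper, and the 8-hour threshold are all hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone


def oldest_agents(agents, now, min_age=timedelta(hours=8)):
    """Return (name, age) pairs for agents connected at least min_age, oldest first."""
    aged = [(a["name"], now - a["connected_at"]) for a in agents]
    stale = [(name, age) for name, age in aged if age >= min_age]
    return sorted(stale, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data: checked at 09:00 the morning after.
utc = timezone.utc
now = datetime(2024, 1, 2, 9, 0, tzinfo=utc)
agents = [
    {"name": "agent-a", "connected_at": datetime(2024, 1, 1, 18, 0, tzinfo=utc)},
    {"name": "agent-b", "connected_at": datetime(2024, 1, 2, 8, 30, tzinfo=utc)},
    {"name": "agent-c", "connected_at": datetime(2024, 1, 1, 22, 0, tzinfo=utc)},
]
result = oldest_agents(agents, now)
for name, age in result:
    print(name, age)
```

Agents that connected the previous evening end up at the top of the list, which is where we'd start looking for hung build steps.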