The bot manager tells the bot monitor which bots to keep track of. Trackers for bots that are no longer assigned are considered stale and are dropped, which keeps the size of the tracker list under control.
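As a rough illustration of the stale-tracker cleanup, here is a minimal sketch in Python. The function name `sync_trackers`, the `make_tracker` factory and the dict keyed by bot ID are all hypothetical and chosen only for this example; the actual monitor may store its trackers differently.

```python
from typing import Any, Callable, Dict, Iterable


def sync_trackers(
    trackers: Dict[str, Any],
    assigned_bot_ids: Iterable[str],
    make_tracker: Callable[[], Any],
) -> None:
    """Align the tracker map with the list of bots sent by the bot manager.

    Hypothetical helper: names and data layout are assumptions.
    """
    assigned = set(assigned_bot_ids)

    # Drop trackers whose bot is no longer assigned (stale trackers).
    for bot_id in list(trackers):
        if bot_id not in assigned:
            del trackers[bot_id]

    # Create a tracker for every newly assigned bot; keep existing ones.
    for bot_id in assigned:
        if bot_id not in trackers:
            trackers[bot_id] = make_tracker()
```

Existing trackers are kept untouched so that their recorded timestamps survive repeated updates from the bot manager.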
The publisher broadcasts agent.status.active bot metrics for the bots that have any request latency measurement.
The new bot monitor receives these messages and updates the activity timestamps.
Bot activity tracker:
Saves the activity timestamp whenever there is new activity
Determines and returns the status of the bot (active or inactive) from the timestamps it has recorded
Saves the read timestamp and applies a read cooldown so that the same bot is not reported as inactive again too soon, which would lead to hasty decisions
The bot monitor publishes agent.status.inactive metrics for the inactive bots.
The timeouts are:
Read cooldown: An inactive status is detected for a bot at most once every 5m, even though the status may be requested every 15s.
Inactivity threshold: We consider a bot to be inactive if the last activity was more than 15m ago.
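The interaction between the two timeouts can be sketched as follows. This is an illustrative Python sketch, not the actual implementation. The class and method names are hypothetical, and it makes three assumptions: a query that falls inside the cooldown is answered with "active", the read timestamp is saved only when "inactive" is returned, and a new tracker starts with its creation time as the last activity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Values taken from the description above.
READ_COOLDOWN = timedelta(minutes=5)
INACTIVITY_THRESHOLD = timedelta(minutes=15)


@dataclass
class BotActivityTracker:
    """Hypothetical tracker for a single bot."""

    last_activity: datetime
    last_inactive_read: Optional[datetime] = None

    def record_activity(self, now: datetime) -> None:
        # Called whenever an agent.status.active metric arrives for the bot.
        self.last_activity = now

    def status(self, now: datetime) -> str:
        # Activity within the threshold means the bot is active.
        if now - self.last_activity <= INACTIVITY_THRESHOLD:
            return "active"

        # Assumption: during the read cooldown we do not report the
        # same bot as inactive again.
        in_cooldown = (
            self.last_inactive_read is not None
            and now - self.last_inactive_read < READ_COOLDOWN
        )
        if in_cooldown:
            return "active"

        # Save the read timestamp so the cooldown starts now.
        self.last_inactive_read = now
        return "inactive"
```

With this shape, a caller polling `status` every 15s sees "inactive" for a silent bot at most once per 5m window, which leaves a restarted container time to come back before it is flagged again.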
The inactive bot containers are exited. Exited bot containers are restarted, redialed and reinitialized.