When a policy's heartbeat duration is changed while heartbeat monitors are in progress, some machine (and process) heartbeat monitors may become out-of-date, and dead machines (or processes) will eventually be orphaned if no further pings are sent.
For example:
Given a policy with a heartbeat duration of 600 seconds, suppose a machine is activated and a heartbeat monitor is started, and the policy is then changed to a heartbeat duration of 86000 seconds. When the machine's monitor runs 600 seconds after activation, it checks the machine against the new duration, sees that the machine is not dead, sends a pong event, and exits. If no further pings are sent, the machine eventually dies, and since no pings were sent, there is no heartbeat monitor left to handle its death, so it remains unculled.
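To make the timeline concrete, here is a minimal Python sketch of the scenario. It is only a model, not the actual implementation: the `Policy`, `Machine`, `ping`, and `run_monitor` names are hypothetical, and it assumes that a monitor is one-shot, is scheduled when a ping arrives, and reads the policy's current duration when it runs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Policy:
    heartbeat_duration: int  # seconds


@dataclass
class Machine:
    policy: Policy
    last_heartbeat_at: int  # seconds since activation
    monitor_due_at: Optional[int] = None  # None means no monitor is scheduled

    def is_dead(self, now: int) -> bool:
        # Dead once a full heartbeat duration has passed without a ping.
        return now - self.last_heartbeat_at >= self.policy.heartbeat_duration


def ping(machine: Machine, now: int) -> None:
    # Assumption: each ping schedules a monitor one duration in the future.
    machine.last_heartbeat_at = now
    machine.monitor_due_at = now + machine.policy.heartbeat_duration


def run_monitor(machine: Machine, now: int) -> str:
    # Assumption: the monitor is one-shot and uses the policy's current duration.
    machine.monitor_due_at = None
    return "culled" if machine.is_dead(now) else "pong"


policy = Policy(heartbeat_duration=600)
machine = Machine(policy=policy, last_heartbeat_at=0)
ping(machine, now=0)                 # activation: monitor due at t=600

policy.heartbeat_duration = 86000    # policy changed while the monitor is pending

result = run_monitor(machine, now=600)  # not dead under the new duration
orphaned = machine.is_dead(now=86000) and machine.monitor_due_at is None
```

In this model, `result` is `"pong"` and `orphaned` is `True`: by the time the machine is dead under the new duration, nothing is scheduled to cull it.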
The same could happen for processes. I'm sure there are other scenarios like this. So we really need to set up a periodic job that deactivates orphaned dead machines, i.e. those that are dead but no longer have a heartbeat monitor attached.
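A rough sketch of what such a periodic job could look like, in Python for illustration only. Everything here is hypothetical: `MachineRecord`, its fields, and `cull_orphaned_dead` are assumed names, and a real job would query the database and the job queue rather than iterate over an in-memory list.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class MachineRecord:
    id: str
    heartbeat_duration: int  # seconds, taken from the machine's policy
    last_heartbeat_at: int   # seconds, same clock as `now`
    has_monitor: bool        # whether a heartbeat monitor is still attached
    active: bool = True


def cull_orphaned_dead(machines: Iterable[MachineRecord], now: int) -> List[str]:
    """Deactivate machines that are dead and have no monitor; return their ids."""
    culled = []
    for m in machines:
        dead = now - m.last_heartbeat_at >= m.heartbeat_duration
        # Machines that still have a monitor are left for that monitor to handle.
        if m.active and dead and not m.has_monitor:
            m.active = False
            culled.append(m.id)
    return culled


fleet = [
    MachineRecord("orphan", 86000, last_heartbeat_at=0, has_monitor=False),
    MachineRecord("monitored", 86000, last_heartbeat_at=0, has_monitor=True),
    MachineRecord("alive", 86000, last_heartbeat_at=80000, has_monitor=False),
]
culled_ids = cull_orphaned_dead(fleet, now=90000)
```

Only the `"orphan"` record is deactivated here. The same sweep could presumably be applied to processes, and running it on a schedule would cover other scenarios that leave a dead machine without a monitor.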