Cylindric closed this 2 years ago
I agree with your suggestions. I will rewrite the script, but in addition to the timeout (which only covers your situation), I will add an email notification. I am an inexperienced GitHub user, so I do not know how to use pull requests. Therefore, I will rewrite the script myself and make a 0.5.2 release.
It seems to have worked out =)
Sometimes, I have a short-term situation on my cluster where I spin up a lot of VMs for a while, causing high memory utilisation. As this condition goes away once those VMs are destroyed, I don't think a high-RAM condition should permanently kill the load-balancer.
I've modified `cluster_load_verification` slightly to return a status on a recoverable failure. The main loop then sees the low-memory condition as something it can try again later.
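
For anyone skimming the thread, here is a rough sketch of the idea rather than the actual patch. Only the name `cluster_load_verification` comes from the script; its signature here, the `LoadStatus` enum, the 0.9 threshold, the `main_loop` helper and all of its parameters are assumptions made up for illustration.

```python
import enum
import time


class LoadStatus(enum.Enum):
    """Hypothetical status values; the real script may use something else."""
    OK = "ok"
    RETRY_LATER = "retry_later"  # recoverable, e.g. temporarily high RAM use
    FATAL = "fatal"              # unrecoverable, stop the balancer


def cluster_load_verification(mem_load, threshold=0.9):
    """Classify cluster memory load, given as a fraction from 0.0 to 1.0.

    Assumed signature: the real function gathers its own data from the cluster.
    """
    if not 0.0 <= mem_load <= 1.0:
        return LoadStatus.FATAL
    if mem_load >= threshold:
        # Report the condition instead of exiting, so the caller can wait.
        return LoadStatus.RETRY_LATER
    return LoadStatus.OK


def main_loop(read_mem_load, balance, retry_delay=300, max_cycles=None,
              sleep=time.sleep):
    """Run balancing cycles, waiting and retrying on recoverable failures.

    Returns the number of cycles in which balancing actually ran.
    """
    cycles = 0
    balanced = 0
    while max_cycles is None or cycles < max_cycles:
        cycles += 1
        status = cluster_load_verification(read_mem_load())
        if status is LoadStatus.FATAL:
            raise RuntimeError("unrecoverable cluster state")
        if status is LoadStatus.RETRY_LATER:
            sleep(retry_delay)  # high memory use may clear once the VMs are gone
            continue
        balance()
        balanced += 1
    return balanced
```

The point of returning a status rather than exiting is that the decision to wait or give up moves into the main loop, where a timeout or a notification can be added later without touching the check itself.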