Closed jon-weisz closed 11 years ago
Create a monitor that will email or text if the cluster becomes unstable or behaves badly.
This should do something like:

    For server in server_dict:
        Test that the server is not too busy
        Test whether graspit is hung

Email or text someone if either condition is true for any active server.
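A minimal Python sketch of that loop might look like the following. It is only an illustration under several assumptions: `server_dict` is taken to map server names to a dict with an `"active"` flag, the two checks are passed in as hypothetical callables (`is_too_busy`, `is_graspit_hung`) because the issue does not say how busyness or a hung graspit would be detected, and the alert goes out by email through an SMTP server assumed to be reachable at `smtp_host`. A text message could possibly be sent the same way through an email-to-SMS gateway, if one is available.

```python
import smtplib
from email.message import EmailMessage


def find_unhealthy(server_dict, is_too_busy, is_graspit_hung):
    """Return {server_name: [reasons]} for active servers that fail a check.

    Assumption: server_dict maps names to dicts with an "active" flag.
    is_too_busy and is_graspit_hung are hypothetical probes supplied by
    the caller; each takes a server name and returns True on a problem.
    """
    problems = {}
    for name, info in server_dict.items():
        if not info.get("active", True):
            continue  # only active servers are monitored
        reasons = []
        if is_too_busy(name):
            reasons.append("too busy")
        if is_graspit_hung(name):
            reasons.append("graspit hung")
        if reasons:
            problems[name] = reasons
    return problems


def build_alert(problems, sender, recipient):
    """Build one email summarizing every unhealthy server."""
    msg = EmailMessage()
    msg["Subject"] = "Cluster alert: %d server(s) unhealthy" % len(problems)
    msg["From"] = sender
    msg["To"] = recipient
    lines = [
        "%s: %s" % (name, ", ".join(reasons))
        for name, reasons in sorted(problems.items())
    ]
    msg.set_content("\n".join(lines))
    return msg


def send_alert(msg, smtp_host="localhost"):
    """Send the alert; smtp_host is an assumed, reachable SMTP server."""
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)
```

A monitor could call `find_unhealthy` on a timer and call `send_alert(build_alert(...))` only when the result is non-empty. Keeping the checks as injected callables leaves the detection method open and makes the loop easy to test without a real cluster.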