cemason closed this issue 7 years ago
This issue should already be fixed, but we have not yet rolled out the fix as part of a release (https://github.com/rcbops/rpc-maas/commit/eaee034134c53a73b655b4cf56a44e897d040e35). I assume the battery was bad in this case. Which version did it happen in, and what was the exact output of the HP utility?
The problem is that when 'Cache Status' is marked as 'Permanently Disabled', the 'Battery/Capacitor Status' line is omitted from the hpssacli output. This is what causes the script to bomb out. I think the fix Bjoern added should account for this.
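To illustrate the failure mode described above, here is a minimal sketch of a parser that tolerates a missing 'Battery/Capacitor Status' line. It is not the code from hp_monitoring.py or from the linked commit. The function names, the "not reported" fallback, and both sample strings are assumptions made up for illustration; the samples are not the output from the affected host.

```python
import re

# Hypothetical sample text written for illustration only; this is NOT the
# reporter's actual hpssacli output.
SAMPLE_WITHOUT_BATTERY = """\
Smart Array P410i in Slot 0 (Embedded)
   Controller Status: OK
   Cache Status: Permanently Disabled
"""

SAMPLE_WITH_BATTERY = """\
Smart Array P410i in Slot 0 (Embedded)
   Controller Status: OK
   Cache Status: OK
   Battery/Capacitor Status: OK
"""

# Matches indented "Key: Value" lines; lines without a colon are skipped.
STATUS_LINE = re.compile(r"^\s*(?P<key>[^:]+?)\s*:\s*(?P<value>.+?)\s*$")


def parse_controller_status(output):
    """Return a dict of the 'Key: Value' pairs found in the status text."""
    fields = {}
    for line in output.splitlines():
        match = STATUS_LINE.match(line)
        if match:
            fields[match.group("key")] = match.group("value")
    return fields


def battery_status(output):
    """Return the battery status, or a placeholder when the line is absent."""
    fields = parse_controller_status(output)
    # dict.get avoids a KeyError when the line is missing from the output.
    return fields.get("Battery/Capacitor Status", "not reported")


if __name__ == "__main__":
    print(battery_status(SAMPLE_WITH_BATTERY))
    print(battery_status(SAMPLE_WITHOUT_BATTERY))
```

Looking each field up by name, rather than assuming a fixed number or order of lines, means one absent field does not have to affect how the other fields are read.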
I believe this is resolved. We can re-open if any further issues are encountered with the version of hp_monitoring.py from this repo.
Hello! This is regarding:
https://github.com/rcbops/rpc-maas/blob/bfc16f7ef0a9867887966bbe88e789222cb27f95/playbooks/templates/rax-maas/hp-check.yaml.j2
If 'hpssacli ctrl all show status' doesn't report details for the battery, it seems to break the script so that it falsely reports alerts for more than just the battery. I am looking at a case where the output of the command the check appears to run lacks the "Battery/Capacitor Status" line the check seems to expect. Here is the full output:
This seems to interfere with the check so that it can't properly get the status of hp-memory and hp-processors, which it then reports alerts for (they are not actually in a bad state, as confirmed by manual checks).
If you run the monitoring script manually when Battery/Capacitor Status is not showing up as in the output above, it spews an error:
I haven't had a chance to look at the script closely enough to see what in particular is breaking it. I hope I've provided enough info here, but if I can provide more details, please let me know.