lausser / check_hpasm

A plugin (monitoring-plugin, not nagios-plugin, see also http://is.gd/PP1330) which checks the hardware health of HP Proliant Servers. (May also be used for other devices which implement the CPQHLTH mib)
http://labs.consol.de/nagios/check_hpasm/
GNU General Public License v2.0

ProLiant: Show detailed power supply status/condition if not ok #21

Closed matsimon closed 11 months ago

matsimon commented 7 years ago

cpqHeFltTolPowerSupplyCondition can only have the following values: other(1), ok(2), degraded(3), failed(4).

However, if something isn't OK, show more detailed information based on cpqHeFltTolPowerSupplyErrorCondition and cpqHeFltTolPowerSupplyStatus.

This helps a human determine the required action more easily. For example, the error condition powerinputloss doesn't require replacing the power supply, whereas a fan failure usually means that the PSU needs replacing, since fans aren't replaceable parts within most HPE PSUs.
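The idea can be illustrated with a minimal Python sketch. This is not the plugin's actual code (check_hpasm is written in Perl). The function name `describe_power_supply`, the `ACTION_HINTS` table, the output wording, and the key spelling `fanfailure` are assumptions made for illustration only. The condition values are the ones listed above; the error condition and status are passed in as already-decoded names, because their numeric encodings should be taken from the CPQHLTH MIB rather than from this sketch.

```python
# Values of cpqHeFltTolPowerSupplyCondition as listed in this pull request.
CONDITION_NAMES = {1: "other", 2: "ok", 3: "degraded", 4: "failed"}

# Hypothetical operator hints keyed by decoded error-condition name.
# Only "powerinputloss" and a fan failure are mentioned in this thread;
# the key "fanfailure" is an assumed spelling.
ACTION_HINTS = {
    "powerinputloss": "check power feed and cabling; PSU replacement is probably not required",
    "fanfailure": "PSU probably needs replacing; the fan is usually not a replaceable part",
}


def describe_power_supply(index, condition, error_condition=None, status=None):
    """Build a status line; add detail only when the condition is not ok."""
    cond_name = CONDITION_NAMES.get(condition, "unknown")
    message = f"powersupply {index} is {cond_name}"
    if cond_name == "ok":
        # Nothing to explain for a healthy power supply.
        return message

    details = []
    if error_condition:
        details.append(f"error condition: {error_condition}")
    if status:
        details.append(f"status: {status}")
    hint = ACTION_HINTS.get(error_condition)
    if hint:
        details.append(f"hint: {hint}")

    if details:
        message += " (" + ", ".join(details) + ")"
    return message


if __name__ == "__main__":
    print(describe_power_supply(1, 2))
    print(describe_power_supply(2, 3, error_condition="powerinputloss"))
```

Keeping the detail out of the ok case leaves the output for healthy machines unchanged, which matters for hosts with a single power supply.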

Change tested with the following ProLiant servers:

matsimon commented 7 years ago

Hi Gerhard

This is a great plugin so first, let me thank you for the work you have put into it.

So far, this commit has been tested in production against a couple of machines that I have not explicitly mentioned in the commit. It isn't really helpful for machines with a single power supply, but no regressions have been found on such machines, for example the HP ProLiant DL320 Gen8 v2 and HPE ProLiant DL20 Gen9.

The test hardware, and some of the time required to validate the output against real hardware, were provided by my employer, Adfinis SyGroup AG.

Looking forward to your feedback.

Best regards, Mathieu