bb-Ricardo / check_redfish

A monitoring/inventory plugin to check components and health status of systems which support Redfish. It will also create an inventory of all components of a system.
MIT License

unclear why some servers are WARNING/CRITICAL when using "--info" #111

Closed: lgmu closed this issue 1 year ago

lgmu commented 1 year ago

Hi,

Checking an HPE ProLiant BL460c Gen9 with "--info --detailed" results in:

[CRITICAL]: INFO: HPE ProLiant BL460c Gen9 (CPU: 2, MEM: 512GB) - BIOS: I36 v2.80 (10/16/2020) - Serial: *** - Power: Off - Name: NOT SET
[OK]: iLO 4 - FW: 2.82
[OK]: Power Regulator Mode: Max - Power Auto On: PowerOn

But it's not really clear why it's CRITICAL now - when looking in the verbose output I can find:

 'SerialNumber': '***',
 'Status': {'Health': 'Critical', 'State': 'Disabled'},
 'SystemType': 'Physical',

Note: This also happens on servers that are turned on:

[WARNING]: INFO: HPE ProLiant BL460c Gen9 (CPU: 2, MEM: 512GB) - BIOS: I36 v3.02 (07/18/2022) - Serial: ***  - Power: On - Name: ***
 'Status': {'Health': 'Warning', 'State': 'Enabled'},

It would be great to add some additional info to the non-verbose output.

Thanks

bb-Ricardo commented 1 year ago

Hi,

I'm aware of this issue but not really able to solve it. The iLO reports this value, and it is a summary alarm for the overall system status. Any component can set this value to Warning or Critical.

Usually, when you log in to iLO, you can see where the issue is coming from.
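
For anyone who wants to narrow this down without the iLO web UI, here is a minimal sketch of one way to look for the component behind a summary status. It is not part of check_redfish. It assumes you already have a Redfish response decoded into a Python dict, and it walks that dict for nested 'Status' objects whose 'Health' is anything other than 'OK'. The function name find_unhealthy and the sample data are made up for illustration, and the sample layout is simplified; real resources are usually spread over several Redfish endpoints.

```python
from typing import Any, Iterator, Tuple


def find_unhealthy(resource: Any, path: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (path, health) for every nested 'Status' whose Health is not OK."""
    if isinstance(resource, dict):
        status = resource.get("Status")
        if isinstance(status, dict):
            health = status.get("Health")
            # A missing or null Health is treated as "nothing to report".
            if health not in (None, "OK"):
                yield (path or "/", health)
        for key, value in resource.items():
            if key != "Status":
                yield from find_unhealthy(value, f"{path}/{key}")
    elif isinstance(resource, list):
        for index, value in enumerate(resource):
            yield from find_unhealthy(value, f"{path}/{index}")


# Hypothetical, simplified sample data (not real iLO output).
sample = {
    "Model": "ProLiant BL460c Gen9",
    "Status": {"Health": "Warning", "State": "Enabled"},
    "Fans": [
        {"Name": "Fan 1", "Status": {"Health": "OK", "State": "Enabled"}},
        {"Name": "Fan 2", "Status": {"Health": "Warning", "State": "Enabled"}},
    ],
}

unhealthy = list(find_unhealthy(sample))
for location, health in unhealthy:
    print(f"{location}: {health}")
```

With the sample above, the root entry carries the summary status and the second fan is the component that explains it. On a real system the offending component may not appear in the same response at all, in which case the iLO UI or event log is still the place to look.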

lgmu commented 1 year ago

But isn't it possible to add something like:

[WARNING]: INFO: HPE ProLiant BL460c Gen9 (CPU: 2, MEM: 512GB) - BIOS: I36 v3.02 (07/18/2022) - Serial: - Power: On - Name: - OVERALL SYSTEM STATUS: Warning

bb-Ricardo commented 1 year ago

Of course, but that is redundant information. It states the status [WARNING] at the beginning of the line.

lgmu commented 1 year ago

I understand, thanks anyway