Closed gabortakacs78 closed 1 year ago
Hi,
was there any status output other than OK
for the storage with the old plugin if you use --detailed
on the affected servers?
Hi,
sorry for the late answer, I didn't get any notification about your reply :S
Here is the end of the output (exit status) of the OLD version:
[OK]: Chassi 1 : All fans (1) are in good condition
[OK]: Chassi enclosurechassis : All fans (1) are in good condition
[OK]: All memory modules (Total 768GB) are in good condition
[OK]: All processors (2) are in good condition
[OK]: Status of HP SmartArray and all components is: OK
[OK]: INFO: HPE Synergy 480 Gen10 (CPU: 2, MEM: 768GB) - BIOS: I42 v2.60 (01/13/2022) - Serial: CZJxxxxxxx - Power: On - Name: xxxxxxx|'Fan_1.1'=21%;; 'Fan_enclosurechassis.1'=21%;;
And the new PLUGIN:
[UNKNOWN]: No storage controller and disk drive data found in system
[UNKNOWN]: Request error: No array controller data returned for API URL '/redfish/v1/Systems/1//SmartStorage/ArrayControllers?$expand=.'
[OK]: Chassi 1 : All fans (1) are in good condition
[OK]: Chassi enclosurechassis : All fans (1) are in good condition
[OK]: All memory modules (Total 768GB) are in good condition
[OK]: All processors (2) are in good condition|'Fan_1.1'=21%;; 'Fan_enclosurechassis.1'=21%;;
The beginning JSON part (detailed result) looks the same for both versions.
Br,
Does this server have storage components?
To tell the truth, I don't know... Can I check it from the detailed output? (maybe from the old version) I have already raised this question with my hardware colleagues (I am only responsible for monitoring), but I am still waiting for their answer.
Yes, just use --storage --detailed --inventory
then you can see what is actually available. If there are no storage controllers or hard drives, then the server has no storage components.
If a server has no storage, then you need to disable storage monitoring for it.
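The check described above could be scripted roughly like this. This is only a sketch: it assumes the inventory JSON has a top-level "inventory" object whose storage-related keys are named as in the output posted in this thread, and `has_storage` is a hypothetical helper, not part of check_redfish.

```python
import json

# Inventory keys assumed to describe storage components
# (names taken from the inventory output posted in this thread).
STORAGE_KEYS = (
    "storage_controller",
    "storage_enclosure",
    "physical_drive",
    "logical_drive",
)


def has_storage(inventory_json: str) -> bool:
    """Return True if any storage-related inventory list is non-empty."""
    inventory = json.loads(inventory_json).get("inventory", {})
    # A missing key and an empty list are both treated as "nothing reported".
    return any(inventory.get(key) for key in STORAGE_KEYS)
```

Feeding it the saved output of a --storage --detailed --inventory run for each host would give a quick yes/no per server.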
Hi,
it shows no DATA with both versions:
{
  "inventory": {
    "chassi": [],
    "fan": [],
    "firmware": [],
    "logical_drive": [],
    "manager": [],
    "memory": [],
    "network_adapter": [],
    "network_port": [],
    "physical_drive": [],
    "power_supply": [],
    "processor": [],
    "storage_controller": [],
    "storage_enclosure": [],
    "system": [],
    "temperature": []
  },
  "meta": {
    "data_retrieval_issues": {
      "storage_controller": [
        "No array controller data returned for API URL '/redfish/v1/Systems/1//SmartStorage/ArrayControllers?$expand=.'"
      ]
    },
    "duration_of_data_collection_in_seconds": 0.375447,
    "host_that_collected_inventory": "xxxxx",
    "inventory_id": null,
    "inventory_layout_version": "xxx",
    "script_version": "xxx",
    "start_of_data_collection": "2022-09-20T15:45:53+02:00"
  }
}
So it means something changed in check_redfish that altered the return status. As I suspected, version 1.3.1 returned OK when array controllers were missing, while the newest version 1.4.1 returns UNKNOWN. In that case we need to separate servers with and without storage, and define different checks for them.
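Separating the two groups of servers could look something like the sketch below. It assumes we already know per host whether storage exists; `build_check_args` is a hypothetical helper and the host names are made up. Only the --storage and --detailed arguments mentioned in this thread are used.

```python
def build_check_args(common_args, server_has_storage):
    """Return the plugin argument list for one host (hypothetical helper)."""
    args = list(common_args)
    # Only ask the plugin for storage data on servers that really have storage.
    if server_has_storage:
        args.append("--storage")
    return args


# Made-up host names, purely for illustration.
hosts = {"blade-with-disks": True, "diskless-blade": False}
plan = {host: build_check_args(["--detailed"], flag) for host, flag in hosts.items()}
```

Here `plan["diskless-blade"]` ends up without --storage, so that host would no longer report UNKNOWN for missing controllers.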
That was also my proposal to the server team on 6 Sept, but there is still no feedback from them. Maybe the summer holidays will finish soon...
Thanks for your help!
Hi,
Well, the old behavior was not correct, as it reported OK for non-existent components. Now it lets you know that no components to monitor were found.
This is important in case you have a server with storage components that are, for some reason, not reported. Then you would assume everything is OK even though nothing is monitored.
In the current implementation you will get an UNKNOWN if you try to monitor storage but no storage is reported.
It is important to know if a server has storage or power supply components in order to get correct monitoring results.
I've seen quite a few times that an iLO reports components incorrectly, and then your monitoring is pretty much worthless as you don't 'see' the real status of the components.
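The difference between the old and the current behavior can be illustrated with a small sketch. This is not the plugin's actual code, just the reasoning expressed with the standard Nagios-style exit codes; `storage_status` and its `missing_is_ok` parameter are hypothetical names.

```python
# Standard Nagios-style plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def storage_status(component_states, missing_is_ok=False):
    """Combine per-component states into one storage status.

    missing_is_ok=True mimics the old behavior described above (OK when
    nothing was reported); the default mimics the current one (UNKNOWN).
    """
    if not component_states:
        return OK if missing_is_ok else UNKNOWN
    # Report the most severe state that was found.
    for state in (CRITICAL, WARNING, UNKNOWN):
        if state in component_states:
            return state
    return OK
```

With an empty list, the default call yields UNKNOWN, so a server whose storage is silently not reported no longer looks healthy.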
Hi,
OK, I will update the arguments once this is also agreed with the business side (HW guys). Thanks a lot for your support!
Br,
No problem. You are welcome.
Hi,
once we updated the plugin to the latest version, we got a lot of UNKNOWN messages on our servers: [UNKNOWN]: No storage controller and disk drive data found in system [UNKNOWN]: Request error: No array controller data returned for API URL '/redfish/v1/Systems/1//SmartStorage/ArrayControllers?$expand=.' ...
I have checked the verbose output for the same test server with the old versions (1.3.0 and 1.3.1) and the new version, and everything seems to be the same except the final return status and message, which contains this error in the new version and shows OK in the old versions: ... [OK]: Status of HP SmartArray and all components is: OK ...
Were there any changes related to this setting? Should I remove the "--storage" argument manually from each affected check? Or is there an option to skip it (similar to "--ignore_missing_ps")?
Thanks!