Closed (Optimaximal closed 3 years ago)
add storage monitoring to the plugin
Do you mean VMFS datastores? No, this is not possible as this is not a physical element and therefore won't appear in the CIM elements. Physical drives however are usually in the CIM element list, as long as the server hardware supports it. But you see this from your output already.
exclusion of storage monitoring (with something like -nodisk)
Yes, you can use the -i/--ignore parameter together with the -r/--regex parameter to ignore all drives. Something like this should do it:
./check_esxi_hardware.py -H esxiserver -U root -P pass -i "Drive,Disk" -r
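As a rough illustration of what a regex-based ignore list does, here is a standalone sketch. It is not the plugin's actual code: the function name `is_ignored` and the sample element names are made up for this example, and the assumption is that each comma-separated entry is treated as a regular expression searched for within the element name.

```python
import re

def is_ignored(element_name, ignore_patterns):
    """Return True if any ignore pattern matches somewhere in the element name."""
    return any(re.search(p, element_name) for p in ignore_patterns)

# Hypothetical element names; real names come from the CIM server (see verbose mode).
elements = [
    "Disk 1 on controller 0",
    "Power Supply 1",
    "Drive Bay 2",
    "System Board Fan 1",
]

# Same comma-separated form as the -i argument in the command above.
patterns = "Drive,Disk".split(",")

# Only the elements that are not ignored would still be checked.
checked = [e for e in elements if not is_ignored(e, patterns)]
print(checked)
```

With these sample names, only the power supply and fan entries remain in `checked`.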
Sorry, mentioning Datastores was probably a red herring - I was referring to the Physical Disks visible in the iDRAC and the associated Virtual Disk(s) created on the onboard PERC, which are mounted as local datastores in ESXi.
Your plugin does not seem to have the option to query the storage exclusively (by excluding all other options), other than the general alarm when there's a warning. There also doesn't seem to be any way of collecting perf history of storage for the same reason.
I will use your ignore commands to suppress the warnings on the individual cloned elements, but unless I'm missing something obvious (or you're suggesting to use -i and ignore everything except Drive and Disk), there's currently no way I can set up a service clone that is just geared towards monitoring the storage elements.
I was referring to the Physical Disks
Yes, physical disks/drives are monitored, as long as they appear in the elements sent by the CIM server (use verbose mode to see the list of CIM elements). If the physical drives don't show up in the list, then you might need to install additional VIBs from the hardware vendor (Dell OpenManage Offline Bundle and VIB for ESXi).
query the storage exclusively
That's right, the plugin checks all the CIM elements, and with the -i list you can define which elements to exclude from the check.
set up a service clone that is just geared towards monitoring the storage elements
Although you could use the ignore list to achieve this, I don't know what you are trying to achieve with it. Why not simply check all the CIM elements (hardware parts) and get alerted if one element fails? The plugin reports what kind of element/hardware failed. Maybe I haven't seen such a practical use case before...
It could be a force of habit of my wanting to know granular information + learning Icinga2 and pushing it as hard as possible.
I've run the verbose output and, yes, the drive information is clearly there - what was the reasoning behind not producing perf data for the drive that would justify a separate exclusion etc.?
Obviously it's your plugin = your choice. Maybe I need to learn Python and fork it 😄
what was the reasoning behind not producing perf data for the drive
Because the drives don't have any perf data on a CIM level. They only show their current status. You could only get performance data such as I/O from the OS (ESXi).
So if I understand you correctly, your feature request would be an exclusive parameter to only monitor specific elements (the opposite of the ignore parameter)? Is that right? (even though I still don't see what's there to gain defining multiple service checks)
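To make the idea concrete, here is a minimal sketch of such an "only" filter alongside the ignore filter. Everything in it is hypothetical: the thread does not show the plugin offering an `only` option, and the function name `select_elements` and the element names are invented for illustration.

```python
import re

def select_elements(elements, only=None, ignore=None):
    """Keep elements matching any 'only' pattern, then drop those matching any 'ignore' pattern."""
    def matches(name, patterns):
        return any(re.search(p, name) for p in patterns)

    # With no 'only' list, every element passes the first stage.
    selected = [e for e in elements if only is None or matches(e, only)]
    # With no 'ignore' list, nothing is dropped in the second stage.
    return [e for e in selected if ignore is None or not matches(e, ignore)]

# Hypothetical element names; real names come from the CIM server.
elements = [
    "Disk 1 on controller 0",
    "Power Supply 1",
    "Drive Bay 2",
    "System Board Fan 1",
]

# One service checks storage only, another checks everything else.
storage_only = select_elements(elements, only=["Drive", "Disk"])
everything_else = select_elements(elements, ignore=["Drive", "Disk"])
print(storage_only)
print(everything_else)
```

Two service checks built this way would cover disjoint sets of elements, so a failed drive would only alert on the storage service.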
Yes, I've just reviewed the output and see that the only data available is what is displayed, which is annoying.
You'd imagine Dell exposing something like the capacity metrics & more SMART information would be sensible, but c'est la vie... I'm not sure what would be required to install additional VIBs - is that done by installing them onto ESXi or is this something installed on the server executing the script, like SNMP MIBs (in this case, Icinga2 running on Ubuntu 18.04)?
I guess the feature request would be an optional --no-disk parameter that behaves the same way as the other values, effectively ignoring disk-related items. Perf data would be an OK, WARN (for Predicted Failures) or CRITICAL result from each disk.
I'm going to close this issue as I've realised that the iDRAC monitoring plugin can grab all the information from the server via a more direct means.
Thanks for your work anyway 😄
Is it possible to add storage monitoring to the plugin (including both physical & virtual disks and/or data stores) and also allow the exclusion of storage monitoring (with something like -nodisk)?
The plugin currently reports actual/predicted drive failures in the warning message. However, I use the plugin with Icinga2 and have all the commands configured as separate services in a set, and each of those services is now failing because of a drive failure in the RAID.
Edit - for reference, this is on Dell PowerEdge hardware.