Linuxfabrik / monitoring-plugins

220+ check plugins for Icinga and other Nagios-compatible monitoring applications. Each plugin is a standalone command line tool (written in Python) that provides a specific type of check.
https://linuxfabrik.ch
The Unlicense

disk-usage: Add a parameter to select performance data #697

Closed TQQEU closed 12 months ago

TQQEU commented 1 year ago

This issue respects the following points:

Which variant of the Monitoring Plugins do you use?

Bug description

This is about the "disk-usage"-Plugin: https://github.com/Linuxfabrik/monitoring-plugins/tree/main/check-plugins/disk-usage

We use it in a large environment and have already changed our disk-space monitoring to it. At the moment it's rolled out to about 1400 hosts; one of them currently looks like this:

image

Icingaweb also visualizes those performance data using some "circles":

image

Our expectation would be that these circles represent the level using the "-usage" metric.

In fact, however, we only see the "-total" metric here, which only tells us the total size of the volume, and which is only exciting if there have been changes in the size of a volume. Yet, due to its relative size, it pushes its way to the front of the display (the five "fullest" metrics).

Furthermore, there is a performance metric called "-percent" which, from our point of view, offers no added value at all and should only be interesting for a normalized graphical representation.

Here is a screenshot of all the "performance metrics" of that Service: image

The little red arrows point to the performance data that is "actually interesting" for us.

To fix

We would like to be able to exclude the patterns "-total" and "-percent" from being output as performance data when calling the plugin. This would also help us to reduce the growth of our performance value database.

Steps to reproduce - Plugin call

disk-usage --critical '10%FREE' --warning '15%FREE'

Steps to reproduce - Data

Any check using the plugin should show that buggy behavior in an Icingaweb2 environment.

Environment

Icinga 2.13.7, Icingaweb 2.11.4

Plugin Version

disk-usage: v2023051201 by Linuxfabrik GmbH, Zurich/Switzerland

Python version

No response

List of Python modules

No response

Additional Information

No response

markuslf commented 12 months ago

So you could use disk-usage --perfdata-regex='-usage'
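
To illustrate what such a filter does, here is a minimal Python sketch. It is not the plugin's actual implementation: the function name `filter_perfdata`, the sample labels and the sample values are made up for illustration, and it assumes perfdata items are separated by spaces and that labels contain no spaces.

```python
import re


def filter_perfdata(perfdata, pattern):
    """Keep only the perfdata items whose label matches the regex `pattern`.

    Hypothetical helper for illustration; assumes items look like
    'label'=value;warn;crit;min;max and that labels contain no spaces.
    """
    kept = []
    for item in perfdata.split():
        # The label is everything before the first '=', minus optional quotes.
        label = item.split('=', 1)[0].strip("'")
        if re.search(pattern, label):
            kept.append(item)
    return ' '.join(kept)


# Made-up sample data, not real plugin output.
sample = (
    "'/-usage'=5368709120B;;;0;10737418240 "
    "'/-total'=10737418240B;;;0;10737418240 "
    "'/-percent'=50%;85;90;0;100"
)

print(filter_perfdata(sample, r'-usage'))
```

With a pattern such as `-usage`, only the usage item is kept, so the "-total" and "-percent" values are no longer emitted as performance data.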

TQQEU commented 6 months ago

Hi @markuslf,

we rolled this new version out to our systems over the last few weeks and were finally able to test the new option. It works as described when called directly.

But there is also a limitation when using it in a real-world Icinga2 setup:

Icinga2 wants to make a call in the following form: '/my/plugin/directory/disk-usage' '--critical' '10%FREE' '--warning' '15%FREE'

Please note the form '<binary>' '<option>' '<value>' '<option>' '<value>'. The parameter does not work when there is a space between the option or the value and the =. This causes the plugin to print an error as soon as we try to define the value for perfdata-regex using an Icinga variable.

We can make Icinga2 use this new option anyway, but this violates the usual syntax and does not allow us to set it in the usual way - with a variable (containing the filter-regex) in the definition of the Service object.

Could you make the plugin work with something like '/my/plugin/directory/disk-usage' '--critical' '10%FREE' '--warning' '15%FREE' '--perfdata-regex' '-usage' ?

EDIT: Works perfectly fine once you drop the = and escape the - using \: '/my/plugin/directory/disk-usage' '--critical' '10%FREE' '--warning' '15%FREE' '--perfdata-regex' '\-usage'

So no issue, just a misleading example. Big thanks to my colleague who figured this out. :dotted_line_face:

markuslf commented 6 months ago

Glad you figured it out. And yes, this is an Icinga issue and may work as expected in other monitoring systems.

When scripting, using `--parameter=value` can avoid some problems where the value starts with a dash (e.g. negative numbers or other options with a dash prefix) which could be interpreted as another parameter.

It also prevents errors in cases where a parameter is optional and its value could be misinterpreted as another standalone parameter if there is an error in the script syntax or execution order.

This is [described in our main README](https://github.com/Linuxfabrik/monitoring-plugins?tab=readme-ov-file#command-parameters-and-arguments). Finally, this is how argparse works, which is the library we use to parse command line arguments.
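
The argparse behaviour discussed in this thread can be reproduced with a small standalone sketch. The parser below is an assumption made for illustration, not the plugin's real argument definition: it only declares the three options mentioned above, and the helper names `build_parser` and `try_parse` are hypothetical.

```python
import argparse


def build_parser():
    # Hypothetical parser that declares only the options mentioned in this thread.
    parser = argparse.ArgumentParser(prog='disk-usage-sketch')
    parser.add_argument('--critical')
    parser.add_argument('--warning')
    parser.add_argument('--perfdata-regex', dest='perfdata_regex')
    return parser


def try_parse(argv):
    """Return the parsed regex, or None if argparse rejects the command line."""
    try:
        return build_parser().parse_args(argv).perfdata_regex
    except SystemExit:
        # argparse prints its error message to stderr and exits on invalid input.
        return None


base = ['--critical', '10%FREE', '--warning', '15%FREE']

results = {
    # Value passed as a separate word: '-usage' looks like another option.
    'separate': try_parse(base + ['--perfdata-regex', '-usage']),
    # Value attached with '=': taken as the option's value.
    'equals': try_parse(base + ['--perfdata-regex=-usage']),
    # Leading backslash: the value no longer starts with a dash.
    'escaped': try_parse(base + ['--perfdata-regex', r'\-usage']),
}

for name, value in results.items():
    print(name, repr(value))
```

In this sketch the first form is rejected because the value starts with a dash, while the `=` form and the escaped form are both accepted. The escaped value `\-usage` still matches a literal `-usage` when used as a regular expression.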