Icinga / icinga2

The core of our monitoring platform with a powerful configuration language and REST API.
https://icinga.com/docs/icinga2/latest
GNU General Public License v2.0

Incorrect interpretation of units in perfdata normalization #9486

Open gino0631 opened 2 years ago

gino0631 commented 2 years ago

Describe the bug

As documented, Icinga normalizes performance data before passing it to metric backends, so that MB, MiB and similar become bytes. MB is interpreted as base 10, so that if a plugin reports 33762MB, 33762000000 bytes are recorded.

The problem is that many plugins historically calculate MB and similar using base 2 (actually, I'm not aware of a plugin that would use base 10). For instance, running /usr/lib64/nagios/plugins/check_disk --help will output the following explanation:

Note: kB/MB/GB/TB are still calculated as their respective binary units due to backward compatibility issues.

The change of behaviour in Icinga was most probably introduced with the support for new UoMs, as noticed in #9060.

To Reproduce

  1. Configure disk monitoring using the disk command (this is the default for the monitoring host).
  2. Run the relevant plugin /usr/lib64/nagios/plugins/check_disk -c 10% -w 20% -m manually to see the results.
  3. Run /usr/lib64/nagios/plugins/check_disk -c 10% -w 20% -m -u MiB and verify that the numbers are the same, though the units are shown as MiB now.
  4. Check the performance data recorded in InfluxDB. The values will have been normalized by multiplying by 1000*1000.

Expected behavior

Binary MB should be normalized by multiplying by 1024*1024, so that 33762MB should become 35402022912 bytes.
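
To make the size of the discrepancy concrete, here is a small sketch of the arithmetic for the example value above. The constant names are only illustrative and do not come from the Icinga source.

```python
# Multipliers for the two possible readings of "MB".
DECIMAL_MB = 1000 * 1000  # base-10 reading, as described in this report
BINARY_MB = 1024 * 1024   # base-2 reading, as check_disk documents in --help

value = 33762  # the plugin reports "33762MB"

recorded = value * DECIMAL_MB  # what ends up in the metric backend
expected = value * BINARY_MB   # what the plugin actually measured

print(recorded)             # 33762000000
print(expected)             # 35402022912
print(expected - recorded)  # 1640022912 bytes missing
print(f"{(expected - recorded) / expected:.2%}")  # 4.63%
```

So every MB value is under-reported by roughly 4.6%, and the relative error grows for GB and TB because the factor is applied once per prefix step.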

Your Environment

Possible workarounds

By default, check_disk is executed with the -m option, so that the units are MB. It is possible to set vars.disk_units = "MiB" for the disk command, so that the units become MiB and get translated correctly.

However, not all plugins support such options. For example, the Manubulon SNMP plugin check_snmp_storage.pl uses binary MB units (this can be verified by checking the source code), and there are no options to change that, so the only fix is to modify the source code to output 'MiB' instead of 'MB'.

Another case is /usr/lib64/nagios/plugins/check_mem.pl, which outputs values in KB that actually mean KiB.
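
For plugins like these, the relabeling could in principle also be done outside the plugin. The following is only a rough sketch of that idea, not something Icinga or the plugins provide: the function name `relabel_perfdata` and the mapping are hypothetical, and it assumes the usual `label=value[UOM];warn;crit;min;max` perfdata layout, where only the value carries a unit.

```python
import re

# Hypothetical mapping: legacy labels that a plugin uses with base-2 meaning.
LEGACY_TO_BINARY = {"KB": "KiB", "MB": "MiB", "GB": "GiB", "TB": "TiB"}

# Matches "=<number><unit>" where the unit is a legacy label and is followed
# by ";", whitespace, or the end of the string. Units such as "MiB" or "%"
# do not match and are left alone.
_LEGACY_UOM = re.compile(r"(=-?\d+(?:\.\d+)?)([kK]B|MB|GB|TB)(?=;|\s|$)")


def relabel_perfdata(perfdata: str) -> str:
    """Rewrite KB/MB/GB/TB units to KiB/MiB/GiB/TiB without touching values."""
    return _LEGACY_UOM.sub(
        lambda m: m.group(1) + LEGACY_TO_BINARY[m.group(2).upper()],
        perfdata,
    )


print(relabel_perfdata("/=33762MB;40000;45000;0;50000"))
# /=33762MiB;40000;45000;0;50000
```

This would only be safe for plugins that are known to mean base 2, which is exactly the information Icinga does not have on its own.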

I'm not sure if the right way is to change all the plugins to use the new unit system though. Maybe there should be an option in Icinga to choose how to interpret MB and similar units: using base 10 or base 2.
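
As an illustration of what such an option could mean, here is a minimal sketch. It is not how Icinga implements normalization; `to_bytes` and its `legacy_base` parameter are hypothetical names used only to show the idea.

```python
# Number of prefix steps for each supported prefix letter.
PREFIX_EXPONENT = {"K": 1, "M": 2, "G": 3, "T": 4}


def to_bytes(value, unit, legacy_base=1000):
    """Convert a value with a byte unit to bytes.

    Units such as "MiB" always use base 1024. Units such as "MB" use
    legacy_base, the hypothetical option: 1000 or 1024.
    """
    if legacy_base not in (1000, 1024):
        raise ValueError("legacy_base must be 1000 or 1024")
    if unit == "B":
        return value
    prefix, suffix = unit[:1].upper(), unit[1:]
    if prefix not in PREFIX_EXPONENT or suffix not in ("B", "iB"):
        raise ValueError(f"unsupported unit: {unit!r}")
    base = 1024 if suffix == "iB" else legacy_base
    return value * base ** PREFIX_EXPONENT[prefix]


print(to_bytes(33762, "MB"))                    # 33762000000
print(to_bytes(33762, "MB", legacy_base=1024))  # 35402022912
print(to_bytes(33762, "MiB"))                   # 35402022912
```

With the option set to 1024, MB and MiB give the same result, which matches what check_disk actually reports.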

Al2Klimov commented 1 year ago

> It is possible to set vars.disk_units = "MiB" for the disk command, so that the units become MiB and get translated correctly.

Colleagues, shall we do this?

Al2Klimov commented 1 year ago

I'd even pass -u bytes.

gino0631 commented 1 year ago

And what to do with plugins that are not so flexible and use KB, meaning base 2?

RincewindsHat commented 1 year ago

Side note: this is a long-standing problem with the Nagios plugins.

gino0631 commented 1 year ago

In general, when it is not possible to get the output in bytes, maybe a better solution would be to accept reality and interpret the units as they have been interpreted for decades, unless it is known that a plugin really outputs MB and similar units using base 10.

Al2Klimov commented 1 year ago

IMAO all of this is the plugins' job. We, Icinga, can do only #9642/#9609.

gino0631 commented 1 year ago

@Al2Klimov maybe, but 1) incorrect data in Icinga appeared after a change in Icinga, 2) base 2 for MBs and similar was used for decades, so a big-bang migration approach is not realistic - it will take some time to change the plugins accordingly.