centreon / centreon-plugins

Collection of standard plugins to discover and gather cloud-to-edge metrics and status across your whole IT infrastructure.
https://www.centreon.com
Apache License 2.0
311 stars 274 forks

[os::linux::local::plugin] mode "storage" - Wrong percent values in verbose output #3973

Open Aleksey-Maksimov opened 2 years ago

Aleksey-Maksimov commented 2 years ago

Hello.

The verbose output of the "storage" mode seems to display percentages incorrectly. Here is an example:

# ./centreon_plugins.pl --plugin=os::linux::local::plugin --mode=storage \
--change-perfdata='.*,,eval(%(label) =~ s/\//rootfs_/)' --change-perfdata='.*,,eval(%(label) =~ s/\//_/g)' \
--warning-usage='85' --critical-usage='95' --filter-type='^(?!(tmpfs|devtmpfs)\z)' --filter-uom='null' \
--use-new-perfdata --verbose

OK: All storages are ok | 
'rootfs_#used'=6683459584;0:25450039910;0:28444162252;0;29941223424 
'rootfs_boot_efi#used'=7147520;0:212791910;0:237826252;0;250343424
Storage '/' Usage Total: 27.88 GB Used: 6.22 GB (23.54%) Free: 20.22 GB (76.46%)
Storage '/boot/efi' Usage Total: 238.75 MB Used: 6.82 MB (2.86%) Free: 231.93 MB (97.14%)

6683459584 out of 29941223424 is 22.3%, not 23.54%.

garnier-quentin commented 2 years ago

I use the values from the df command as they are (I don't change them).

Aleksey-Maksimov commented 2 years ago

In my opinion, it would be more correct to calculate the percentage from the data that appears in the perfdata. Right now the plugin output shows distorted data that does not match the graph, which is built from the perfdata metrics.

[image]

garnier-quentin commented 2 years ago

Maybe the issue is in the perfdata, because for prct_used I use:

used * 100 / (used + free)

I don't use total because of space reservation. Maybe I should use 'used + free' as the total in the perfdata (to get the same percentage value).
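
To make the two candidate formulas concrete, here is a minimal Python sketch (it is not the plugin's Perl code). The function names are illustrative only, and the inputs are the rounded GB figures from the verbose line for '/' in the first report, so the results only approximate the percentages quoted there.

```python
def pct_used_of_usable(used: float, free: float) -> float:
    """used * 100 / (used + free): reserved space is left out of the denominator."""
    return used * 100 / (used + free)


def pct_used_of_total(used: float, total: float) -> float:
    """used * 100 / total: the denominator is the full filesystem size."""
    return used * 100 / total


# Rounded figures (GB) from: Storage '/' Usage Total: 27.88 GB Used: 6.22 GB Free: 20.22 GB
total, used, free = 27.88, 6.22, 20.22

print(f"used / (used + free): {pct_used_of_usable(used, free):.2f}%")
print(f"used / total:         {pct_used_of_total(used, total):.2f}%")
```

The first value lands close to the 23.54% shown in the verbose output and the second close to the 22.3% derived from the perfdata, which is the gap this issue is about.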

lucie-dubrunfaut commented 5 months ago

Hello :)

Without additional information we cannot provide a solution or fix a possible bug. @Aleksey-Maksimov, if you still need a response or solution to this issue, can you tell us whether you still have the problem despite the explanations Quentin provided above?

Aleksey-Maksimov commented 5 months ago

Hello Lucie,

Quentin did not solve this problem; he only discussed it. The problem has not gone away: the plugin output and the perfdata give different disk usage percentages.

./centreon_plugins.pl '--plugin=os::linux::local::plugin' '--mode=storage' \
'--change-perfdata' '.*,,eval(%(label) =~ s/#storage//g)' '--critical-usage' '95' \
'--filter-type' '^(?!(tmpfs|devtmpfs)\z)' '--filter-uom' 'null' \
'--use-new-perfdata' '--verbose' '--warning-usage' '85'

OK: All storages are ok | 

'/#used'=4340510720;0:6605935616;0:7383104512;0;7771688960 
'/boot/efi#used'=6070272;0:507916697;0:567671603;0;597549056 
'/var/backup#used'=14402281472;0:17415990272;0:19464930304;0;20489400320

Storage '/' Usage Total: 7.24 GB Used: 4.04 GB (59.01%) Free: 2.81 GB (40.99%)
Storage '/boot/efi' Usage Total: 569.87 MB Used: 5.79 MB (1.02%) Free: 564.08 MB (98.98%)
Storage '/var/backup' Usage Total: 19.08 GB Used: 13.41 GB (74.15%) Free: 4.68 GB (25.85%)

Let's take the root partition as an example. From the perfdata we get:

Total = 7771688960 Used = 4340510720

(4340510720 * 100) / 7771688960 = 55.85%

Now let's compare the perfdata with the plugin output.

In the plugin output we see a different percentage: 59.01%

59.01% is not equal to 55.85%

Perhaps there is a problem with how the total size is calculated for the MAX value in the perfdata.

The partition size in the perfdata looks underestimated, while the size in the plugin output looks closer to the truth, but it still does not match what the df utility reports.
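
The same check can be scripted. The sketch below is only an illustration in Python, not part of the plugin. It assumes the Nagios-style perfdata layout 'label'=value;warn;crit;min;max seen in the output above, and the helper name used_pct_from_perfdata is made up for this example.

```python
import re

# Assumed layout: 'label'=value[unit];warning;critical;min;max
PERFDATA_RE = re.compile(
    r"'(?P<label>[^']+)'=(?P<value>\d+)[^;]*;"
    r"(?P<warn>[^;]*);(?P<crit>[^;]*);(?P<min>[^;]*);(?P<max>\d+)"
)


def used_pct_from_perfdata(entry: str) -> float:
    """Return value * 100 / max for a single perfdata entry."""
    match = PERFDATA_RE.fullmatch(entry.strip())
    if match is None:
        raise ValueError(f"unrecognised perfdata entry: {entry!r}")
    return int(match["value"]) * 100 / int(match["max"])


root = "'/#used'=4340510720;0:6605935616;0:7383104512;0;7771688960"
print(f"{used_pct_from_perfdata(root):.2f}%")
```

A graphing tool that derives a percentage from value and MAX works the same way, so it cannot reproduce the 59.01% printed in the verbose line.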

The actual disk capacity according to df is as follows:

# df -H -t ext4

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2       7.8G  4.4G  3.1G  60% /
/dev/sda3        21G   15G  5.1G  75% /var/backup
# df -h -t ext4
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2       7.3G  4.1G  2.9G  60% /
/dev/sda3        20G   14G  4.7G  75% /var/backup

lucie-dubrunfaut commented 5 months ago

Hello :)

First of all, thank you for this very clear and precise answer (and for your patience 🙏). I have taken note of all this and opened an internal process to look into the issue. We will come back to you once we have analyzed it and have more answers. Thank you for your understanding.

omercier commented 3 months ago

Hi @Aleksey-Maksimov. As @garnier-quentin said, using the used + free sum to calculate prct_used excludes the reserved space from the total. I think it is good to display this value, since reaching 100% on it blocks processes not owned by root from writing to the disk. To meet your needs, I propose adding an option --ignore-reserved-space that calculates the percentage as used * 100 / total. What do you think?

omercier commented 3 months ago

Can you also send the output of df -P -k -T, please?

Aleksey-Maksimov commented 3 months ago

"I propose to add an option --ignore-reserved-space to calculate this percentage on this basis: used * 100 / total. What do you think?"

It will probably be better than what we have now.

Aleksey-Maksimov commented 3 months ago

# df -H -t ext4
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2       7.8G  4.4G  3.1G  60% /
/dev/sda3        21G   15G  4.9G  75% /var/backup
# df -h -t ext4
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2       7.3G  4.1G  2.9G  60% /
/dev/sda3        20G   14G  4.6G  75% /var/backup
# df -P -k -T
Filesystem     Type     1024-blocks     Used Available Capacity Mounted on
udev           devtmpfs      957992        0    957992       0% /dev
tmpfs          tmpfs         195160      412    194748       1% /run
/dev/sda2      ext4         7589540  4246440   2936112      60% /
tmpfs          tmpfs         975788        0    975788       0% /dev/shm
tmpfs          tmpfs           5120        0      5120       0% /run/lock
/dev/sda1      vfat          583544     5928    577616       2% /boot/efi
/dev/sda3      ext4        20009180 14211328   4756088      75% /var/backup
tmpfs          tmpfs         195156        0    195156       0% /run/user/446479
# ./centreon_plugins.pl '--plugin=os::linux::local::plugin' '--mode=storage' \
'--change-perfdata' '.*,,eval(%(label) =~ s/#storage//g)' '--critical-usage' '95' \
'--filter-type' '^(?!(tmpfs|devtmpfs)\z)' '--filter-uom' 'null' '--use-new-perfdata' \
'--verbose' '--warning-usage' '85'
OK: All storages are ok | 
'/#used'=4348379136;0:6605935616;0:7383104512;0;7771688960 
'/boot/efi#used'=6070272;0:507916697;0:567671603;0;597549056 
'/var/backup#used'=14552399872;0:17415990272;0:19464930304;0;20489400320
Storage '/' Usage Total: 7.24 GB Used: 4.05 GB (59.12%) Free: 2.80 GB (40.88%)
Storage '/boot/efi' Usage Total: 569.87 MB Used: 5.79 MB (1.02%) Free: 564.08 MB (98.98%)
Storage '/var/backup' Usage Total: 19.08 GB Used: 13.55 GB (74.92%) Free: 4.54 GB (25.08%)

If I understand correctly, for the root partition df shows:

Max = 7589540 blocks = 7771688960 bytes (100%)
Used = 4246440 blocks = 4348354560 bytes (55.95% of 7771688960)
Available = 2936112 blocks = 3006578688 bytes (38.69% of 7771688960)

Is the remaining 5.36% the reserved space?
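
The arithmetic above can be reproduced with a short Python sketch. The function name df_shares is illustrative, the block counts are the ones df -P -k -T reported for '/', and treating the unaccounted remainder as reserved space is an assumption (on ext4 it typically corresponds to the blocks reserved for root).

```python
BLOCK_SIZE = 1024  # df -P -k reports sizes in 1024-byte blocks


def df_shares(total_blocks: int, used_blocks: int, avail_blocks: int) -> dict:
    """Convert df block counts to bytes and express each part as a share of the total."""
    total = total_blocks * BLOCK_SIZE
    used = used_blocks * BLOCK_SIZE
    avail = avail_blocks * BLOCK_SIZE
    # Assumption: whatever is neither used nor available is reserved space.
    reserved = total - used - avail
    return {
        "total_bytes": total,
        "used_bytes": used,
        "avail_bytes": avail,
        "reserved_bytes": reserved,
        "used_pct": used * 100 / total,
        "avail_pct": avail * 100 / total,
        "reserved_pct": reserved * 100 / total,
    }


# /dev/sda2  ext4  7589540  4246440  2936112  60%  /
for name, value in df_shares(7589540, 4246440, 2936112).items():
    print(f"{name}: {value:.2f}" if name.endswith("_pct") else f"{name}: {value}")
```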

Quentin says that prct_used is now calculated using the formula: used * 100 / (used + free). That is:

prct_used = used * 100 / (used + free) = 4348354560 * 100 / (4348354560 + 3006578688) = 59.12%

We see the same 59.12% in the main output of the plugin.

But the problem is that if we use the data from the perfdata "4348379136;0:6605935616;0:7383104512;0;7771688960" for the calculation, we do not get that percentage:

4348379136 * 100 / 7771688960 = 55.95%

That is, if the plugin uses the formula "used * 100 / (used + free)", then the sum (used + free) must be written to the MAX value of the perfdata.
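
A minimal Python sketch of that suggestion follows. It is not the plugin's implementation: format_used_perfdata is a hypothetical helper, and deriving the thresholds as a percentage of the same MAX is an assumption based on the current output, where 0:6605935616 is exactly 85% of 7771688960.

```python
def format_used_perfdata(label: str, used: int, free: int,
                         warning_pct: int = 85, critical_pct: int = 95) -> str:
    """Build a 'used' perfdata entry whose MAX is used + free (hypothetical helper)."""
    max_value = used + free
    # Assumption: thresholds are a percentage of the same MAX, truncated to whole bytes.
    warning = max_value * warning_pct // 100
    critical = max_value * critical_pct // 100
    return f"'{label}#used'={used};0:{warning};0:{critical};0;{max_value}"


used, free = 4348354560, 3006578688  # bytes, from the df figures for '/'
entry = format_used_perfdata("/", used, free)
max_value = int(entry.rsplit(";", 1)[1])

print(entry)
print(f"{used * 100 / max_value:.2f}%")  # percentage a graph would derive from value and MAX
```

With MAX set to used + free, the percentage derived from the perfdata is the same one the verbose line prints.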