monitoring-plugins / monitoring-plugins

Monitoring Plugins
https://www.monitoring-plugins.org
GNU General Public License v3.0
487 stars 284 forks source link

check_disk -- exclude_device not working properly #1536

Open Rocco83 opened 6 years ago

Rocco83 commented 6 years ago

Hi,

Tested on Debian Stretch.

# /usr/lib/nagios/plugins/check_disk -V
check_disk v2.1.1 (monitoring-plugins 2.1.1)

I have some issue with "-x" flag of check_disk. From the help

 -x, --exclude_device=PATH <STRING>
    Ignore device (only works if -p unspecified)

with -x, is possible to specify a path, but this accept also wildcards. My goal is to check all of the disk, but not what is mounted on /mnt (and recursive) and /media (and recursive). As -x is not recursive, i have added "/*" after the path, which means that i will have to manage only one level of depth for mountpoints.

The problem is that if there are >1 directory on the mountpoint, the output is truncated, and only root filesystem is checked. Side note: if i have two directory and one of it is mounted, the output is even worst.

Short reproducer:

Check working

# ls -l /media/
total 4
drwxr-x--- 2 root root 4096 May 29 12:01 cdrom
# /usr/lib/nagios/plugins/check_disk -w 20% -c 10% -W 20% -K 10% -e -X tracefs -X tmpfs -X udev -X usbfs -X fuse.sshfs -X fuse.gvfs-fuse-daemon -X fuse.gvfsd-fuse -X devtmpfs -X none 
-X nfs -X nfs4 -x /mnt/* -x /media/* -v
DISK OK / 1623 MB (34% inode=76%); /var 1747 MB (37% inode=50%); /boot 100 MB (56% inode=99%);| /=3016MB;3928;4419;0;4911 /var=2892MB;3928;4419;0;4911 /boot=77MB;150;169;0;188

See that all of the disks are reported as by "-v"

Check broken with 2 directories

# mkdir /media/cdrom1
# ls -l /media/
total 8
drwxr-x--- 2 root root 4096 May 29 12:01 cdrom
drwxr-x--- 2 root root 4096 May 29 12:34 cdrom1
#                                                                                                                                                                  
# /usr/lib/nagios/plugins/check_disk -w 20% -c 10% -W 20% -K 10% -e -X tracefs -X tmpfs -X udev -X usbfs -X fuse.sshfs -X fuse.gvfs-fuse-daemon -X fuse.gvfsd-fuse -X devtmpfs -X none 
-X nfs -X nfs4 -x /mnt/* -x /media/* -v
DISK OK| /=3016MB;3928;4419;0;4911

See that only "/" is reported.

Check broken with 2 directories and one mountpoint in one of the two directory

# mount /media/cdrom
# /usr/lib/nagios/plugins/check_disk -w 20% -c 10% -W 20% -K 10% -e -X tracefs -X tmpfs -X udev -X usbfs -X fuse.sshfs -X fuse.gvfs-fuse-daemon -X fuse.gvfsd-fuse -X devtmpfs -X none -X nfs -X nfs4 -x /mnt/* -x /media/* -v
DISK UNKNOWN - free space:|

See that "UNKNOWN" is reported.

This last use case is likely linked to the primary issue.

Rocco83 commented 6 years ago

I had some time to check this issue. I have installed the latest git version of monitoring-plugins on a debian jessie. The problem seems related to the parsing of the options.

According to the manual, https://www.monitoring-plugins.org/doc/man/check_disk.html, options without a option are not possible.

Usage:
 check_disk -w limit -c limit [-W limit] [-K limit] {-p path | -x device}
[-C] [-E] [-e] [-f] [-g group ] [-k] [-l] [-M] [-m] [-R path ] [-r path ]
[-t timeout] [-u unit] [-v] [-X type] [-N type]

But, according to the below test, the argument are considered.

I have created this short reproducer . Create a file and pass this as argument, no matter the order

# touch test
# ./check_disk -w 20% -c 10% test
DISK WARNING - free space: / 4004 MB (20% inode=50%);| /=15129MB;16024;18027;0;20030
# ./check_disk test -w 20% -c 10%
DISK OK - free space: / 4004 MB (20% inode=50%);| /=15129MB;;;0;20030

Remove now the file and perform any of the above test

# rm test
# ./check_disk test -w 20% -c 10%
DISK CRITICAL - test is not accessible: No such file or directory
# ./check_disk /boot/config-3.16.0-6-amd64 -w 20% -c 10% test
DISK OK - free space: /boot 99 MB (56% inode=99%);| /boot=77MB;;;0;188
# 

Seems that check_disk is looking for the files and directories that are added in the command line without an argument, and return the space free for the item requested.

What happens on "-x /media/" is that is expanded (default using bash) and 2 different directories are then added into the cmd, leading to an additional hidden argument.

Running with gdb (spaces are truncated, but you can see two directories, the second one without an argument),

# cat /proc/32092/cmdline 
[...]/check_disk-w20%-c10%-x/media/test1/media/test2

Of course a short workaround is possible, blocking bash expansion

# ./check_disk -w 20% -c 10% -x '/media/*'
DISK WARNING - free space: / 4004 MB (20% inode=50%); /dev 10 MB (100% inode=99%); /run 342 MB (89% inode=99%); /dev/shm 952 MB (100% inode=99%); /run/lock 5 MB (100% inode=99%); /sys/fs/cgroup 952 MB (100% inode=99%); /boot 99 MB (56% inode=99%);| /=15129MB;16024;18027;0;20030 /dev=0MB;8;9;0;10 /run=38MB;304;342;0;381 /dev/shm=0MB;761;856;0;952 /run/lock=0MB;4;4;0;5 /sys/fs/cgroup=0MB;761;856;0;952 /boot=77MB;150;169;0;188
#

Running a bit deeper, seems that the impacted point is https://github.com/monitoring-plugins/monitoring-plugins/blob/master/plugins/check_disk.c#L206

Running with an argument (eg ./check_disk test -w 20% -c 10% **test**)

206       if (path_selected == FALSE) {
(gdb) n
216       np_set_best_match(path_select_list, mount_list, exact_match);

Running without an argument (eg ./check_disk test -w 20% -c 10%)

206       if (path_selected == FALSE) {
(gdb) n
207         for (me = mount_list; me; me = me->me_next) {

So path_selected is also populated by "-p" option documented in check_disk. -p seems to be taken as "default option" when no options are specified.

 -p, --path=PATH, --partition=PARTITION
    Mount point or block device as emitted by the mount(8) command (may be repeated)

And of course, given that a path is specified, -x is then silently discarded

 -x, --exclude_device=PATH <STRING>
    Ignore device (only works if -p unspecified)

Finally please see the combination of some tests

no -p, only the first entry is taken

# ./check_disk -w 20% -c 10% tests/ /boot/grub/  
DISK WARNING - free space: / 4004 MB (20% inode=50%);| /=15129MB;16024;18027;0;20030

two -p, both options are evaluated

# ./check_disk -w 20% -c 10% -p tests/ -p /boot/grub/
DISK WARNING - free space: / 4004 MB (20% inode=50%); /boot 99 MB (56% inode=99%);| /=15129MB;16024;18027;0;20030 /boot=77MB;150;169;0;188

one -p on the second argument, only the entry not specified with -p is taken

# ./check_disk -w 20% -c 10% tests/ -p /boot/grub/
DISK WARNING - free space: / 4004 MB (20% inode=50%);| /=15129MB;16024;18027;0;20030

one -p on the first argument, both options are evaluated

# ./check_disk -w 20% -c 10% -p tests/ /boot/grub/
DISK WARNING - free space: / 4004 MB (20% inode=50%); /boot 99 MB (56% inode=99%);| /=15129MB;16024;18027;0;20030 /boot=77MB;150;169;0;188

What's your view of this issue? Which should be the correct behavior of check_disk? Should be something without an option parsed, or an error should be thrown?

Thank you very much, Daniele

RincewindsHat commented 8 months ago

Hi @Rocco83, Sorry for not answering and thank you for the deep investigation. I honestly didn't really think about whether check_disk does support path arguments without an option and I think it should not do that at all. If you still have thoughts on this issue, we can investigate the topic.