lausser / check_nwc_health

nwc = network component. This plugin checks lots of aspects of routers, switches, WLAN controllers, firewalls, ...
GNU General Public License v2.0

usage of mode disk-usage #293

stefangweichinger closed this issue 2 years ago

stefangweichinger commented 2 years ago

I am new to this check plugin, so forgive me if this is an FAQ (I couldn't find an answer by searching):

I am trying to write checks for a device that returns this output:

# /usr/lib/nagios/plugins/check_nwc_health  --hostname --community secret42 --mode hardware-health -vv
info: checking storages
hrStorageAllocationUnits: 4096
hrStorageDescr: /dev/shm
hrStorageIndex: 35
hrStorageSize: 259032
hrStorageType: hrStorageFixedDisk
hrStorageUsed: 0
name: /dev/shm
special: 0
info: storage 35 (/dev/shm) has 100.00% free space left

hrStorageAllocationUnits: 4096
hrStorageDescr: /tmp
hrStorageIndex: 36
hrStorageSize: 259032
hrStorageType: hrStorageFixedDisk
hrStorageUsed: 5
name: /tmp
special: 0
info: storage 36 (/tmp) has 100.00% free space left

hrStorageAllocationUnits: 4096
hrStorageDescr: /run
hrStorageIndex: 37
hrStorageSize: 259032
hrStorageType: hrStorageFixedDisk
hrStorageUsed: 1
name: /run
special: 0
info: storage 37 (/run) has 100.00% free space left

hrStorageAllocationUnits: 1024
hrStorageDescr: /mnt/persistent
hrStorageIndex: 39
hrStorageSize: 27786243
hrStorageType: hrStorageFixedDisk
hrStorageUsed: 159716
name: /mnt/persistent
special: 0
info: storage 39 (/mnt/persistent) has 99.43% free space left

info: checking devices
hrDeviceID: 0.0
hrDeviceIndex: 196608
hrDeviceStatus: running
hrDeviceType: hrDeviceProcessor
info: hrDeviceProcessor () has status running

hrDeviceID: 0.0
hrDeviceIndex: 196609
hrDeviceStatus: running
hrDeviceType: hrDeviceProcessor
info: hrDeviceProcessor () has status running

hrDeviceID: 0.0
hrDeviceIndex: 196610
hrDeviceStatus: running
hrDeviceType: hrDeviceProcessor
info: hrDeviceProcessor () has status running

hrDeviceID: 0.0
hrDeviceIndex: 196611
hrDeviceStatus: running
hrDeviceType: hrDeviceProcessor
info: hrDeviceProcessor () has status running

hrDeviceDescr: network interface lo
hrDeviceErrors: 0
hrDeviceID: 0.0
hrDeviceIndex: 262145
hrDeviceStatus: down
hrDeviceType: hrDeviceNetwork
info: hrDeviceNetwork (network interface lo) has status down

hrDeviceDescr: network interface eth0
hrDeviceErrors: 0
hrDeviceID: 0.0
hrDeviceIndex: 262146
hrDeviceStatus: running
hrDeviceType: hrDeviceNetwork
info: hrDeviceNetwork (network interface eth0) has status running

hrDeviceDescr: network interface eth1
hrDeviceErrors: 0
hrDeviceID: 0.0
hrDeviceIndex: 262147
hrDeviceStatus: running
hrDeviceType: hrDeviceNetwork
info: hrDeviceNetwork (network interface eth1) has status running

hrDeviceDescr: network interface eth2
hrDeviceErrors: 0
hrDeviceID: 0.0
hrDeviceIndex: 262148
hrDeviceStatus: running
hrDeviceType: hrDeviceNetwork
info: hrDeviceNetwork (network interface eth2) has status running

hrDeviceDescr: network interface eth3
hrDeviceErrors: 0
hrDeviceID: 0.0
hrDeviceIndex: 262149
hrDeviceStatus: running
hrDeviceType: hrDeviceNetwork
info: hrDeviceNetwork (network interface eth3) has status running

hrDeviceDescr: network interface sit0
hrDeviceErrors: 0
hrDeviceID: 0.0
hrDeviceIndex: 262150
hrDeviceStatus: down
hrDeviceType: hrDeviceNetwork
info: hrDeviceNetwork (network interface sit0) has status down

hrDeviceDescr: SCSI disk (/dev/sda)
hrDeviceID: 0.0
hrDeviceIndex: 393232
hrDeviceStatus: running
hrDeviceType: hrDeviceDiskStorage
info: hrDeviceDiskStorage (SCSI disk (/dev/sda)) has status running

hrDeviceDescr: Guessing that there's a floating point co-processor
hrDeviceID: 0.0
hrDeviceIndex: 786432
hrDeviceStatus: running
hrDeviceType: hrDeviceCoprocessor
info: hrDeviceCoprocessor (Guessing that there's a floating point co-processor) has status running

CRITICAL - hrDeviceNetwork (network interface lo) has status down, hrDeviceNetwork (network interface sit0) has status down, storage 35 (/dev/shm) has 100.00% free space left, storage 36 (/tmp) has 100.00% free space left, storage 37 (/run) has 100.00% free space left, storage 39 (/mnt/persistent) has 99.43% free space left, hrDeviceProcessor () has status running, hrDeviceProcessor () has status running, hrDeviceProcessor () has status running, hrDeviceProcessor () has status running, hrDeviceNetwork (network interface eth0) has status running, hrDeviceNetwork (network interface eth1) has status running, hrDeviceNetwork (network interface eth2) has status running, hrDeviceNetwork (network interface eth3) has status running, hrDeviceDiskStorage (SCSI disk (/dev/sda)) has status running, hrDeviceCoprocessor (Guessing that there's a floating point co-processor) has status running
checking storages
storage 35 (/dev/shm) has 100.00% free space left
storage 36 (/tmp) has 100.00% free space left
storage 37 (/run) has 100.00% free space left
storage 39 (/mnt/persistent) has 99.43% free space left
checking devices
hrDeviceProcessor () has status running
hrDeviceProcessor () has status running
hrDeviceProcessor () has status running
hrDeviceProcessor () has status running
hrDeviceNetwork (network interface lo) has status down
hrDeviceNetwork (network interface eth0) has status running
hrDeviceNetwork (network interface eth1) has status running
hrDeviceNetwork (network interface eth2) has status running
hrDeviceNetwork (network interface eth3) has status running
hrDeviceNetwork (network interface sit0) has status down
hrDeviceDiskStorage (SCSI disk (/dev/sda)) has status running
hrDeviceCoprocessor (Guessing that there's a floating point co-processor) has status running | '/dev/shm_free_pct'=100%;10:;5:;0;100 '/tmp_free_pct'=100.00%;10:;5:;0;100 '/run_free_pct'=100.00%;10:;5:;0;100 '/mnt/persistent_free_pct'=99.43%;10:;5:;0;100

I am now trying to define separate checks for CPUs, interfaces, etc., and I also want to check disk usage.

How do I specify individual disks?

These attempts fail:

# /usr/lib/nagios/plugins/check_nwc_health  --hostname --community secret42 --mode disk-usage  STORAGE_37  -vvvvv
info: checking disks

UNKNOWN - no disks
checking disks

# /usr/lib/nagios/plugins/check_nwc_health  --hostname --community secret42 --mode disk-usage   -vvvvv
info: checking disks

UNKNOWN - no disks
checking disks

# /usr/lib/nagios/plugins/check_nwc_health  --hostname --community secret42 --mode disk-usage "/dev/shm"  -vvvvv
info: checking disks

UNKNOWN - no disks
checking disks

Please advise, thanks.

lausser commented 2 years ago

I fail with: /usr/lib/nagios/plugins/check_nwc_health --hostname --community secret42 --mode disk-usage STORAGE_37 -vvv

That is to be expected, because that syntax is not documented anywhere. Somewhere in there, --blacklist is mentioned, and it can be done with that, but it is extremely cumbersome because you have to blacklist everything you do not want to see. Apart from that, monitoring individual components separately is neither possible nor intended.

stefangweichinger commented 2 years ago

Thanks. I will write targeted, individual SNMP checks.
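
For the disk part of such individual SNMP checks, the free-space figures in the plugin output above can be reproduced from the hrStorageSize and hrStorageUsed values alone. The following is a minimal Python sketch, not the plugin's actual implementation. It assumes the values have already been fetched over SNMP (the fetch itself is omitted), that the perfdata thresholds `10:` and `5:` mean "warn below 10% free" and "critical below 5% free" in the usual Nagios range syntax, and that `Storage`, `free_pct`, and `check_storage` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Storage:
    """One hrStorageTable row, using the field names from the output above."""
    index: int             # hrStorageIndex
    descr: str             # hrStorageDescr
    allocation_units: int  # hrStorageAllocationUnits (bytes per unit)
    size: int              # hrStorageSize (in allocation units)
    used: int              # hrStorageUsed (in allocation units)


def free_pct(s: Storage) -> float:
    """Percentage of free space; the allocation unit cancels out of the ratio."""
    if s.size <= 0:
        raise ValueError(f"storage {s.index} reports a non-positive size")
    return 100.0 * (s.size - s.used) / s.size


def check_storage(s: Storage, warn_below: float = 10.0, crit_below: float = 5.0):
    """Return (exit_code, message) in Nagios plugin style for one storage.

    Assumption: thresholds mirror the '10:;5:' ranges seen in the perfdata.
    """
    pct = free_pct(s)
    if pct < crit_below:
        code, label = 2, "CRITICAL"
    elif pct < warn_below:
        code, label = 1, "WARNING"
    else:
        code, label = 0, "OK"
    perfdata = f"'{s.descr}_free_pct'={pct:.2f}%;{warn_below:g}:;{crit_below:g}:;0;100"
    message = (
        f"{label} - storage {s.index} ({s.descr}) "
        f"has {pct:.2f}% free space left | {perfdata}"
    )
    return code, message


if __name__ == "__main__":
    # Values copied from the hardware-health output for storage 39.
    persistent = Storage(39, "/mnt/persistent", 1024, 27786243, 159716)
    code, message = check_storage(persistent)
    print(message)
    raise SystemExit(code)
```

With the values for storage 39 this yields 99.43% free, which matches the figure the plugin reported. Checking one storage per service this way avoids having to blacklist every other component.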