centreon / centreon-plugins

Collection of standard plugins to discover and gather cloud-to-edge metrics and status across your whole IT infrastructure.
https://www.centreon.com
Apache License 2.0

[network::dell::os10::snmp::plugin] bad fan status #5273

Open jeremyit opened 2 weeks ago

jeremyit commented 2 weeks ago

Hi, I use the OS10 Centreon plugin to check Dell S4112 hardware status. It works fine on switches with standard airflow. On an S4112 with reverse-airflow fans, the plugin returns a bad status in the check result. I think this is because the hardware fan count is wrong.

Reverse-airflow switch, plugin output:

UNKNOWN: fan '1' status is 'unknown' - fan '2' status is 'unknown'
'TH04Y1xxxxHDN45O123#hardware.temperature.celsius'=55C;;;; 'hardware.card.count'=1;;;; 'hardware.fan.count'=5;;;; 'hardware.fantray.count'=1;;;; 'hardware.psu.count'=2;;;; 'hardware.temperature.count'=1;;;;

Output of the show system command on the switch:
OS10#Show system
-- Fan Status --
FanTray  Status      AirFlow   Fan  Speed(rpm)  Status
----------------------------------------------------------------
1        up          REVERSE   1    11531       up
                               2    11464       up
                               3    11430       up

Standard-airflow switch, plugin output:

OK: All 8 components are ok [1/1 cards, 3/3 fans, 1/1 fantray, 2/2 psus, 1/1 temperatures].
'TWxxxxx46123#hardware.temperature.celsius'=51C;;;; 'hardware.card.count'=1;;;; 'hardware.fan.count'=3;;;; 'hardware.fantray.count'=1;;;; 'hardware.psu.count'=2;;;; 'hardware.temperature.count'=1;;;;

Output of the show system command on the switch:
OS10#Show system
-- Fan Status --
FanTray  Status      AirFlow   Fan  Speed(rpm)  Status
----------------------------------------------------------------
1        up          NORMAL    1    11565       up
                               2    11565       up
                               3    11364       up
jeremyit commented 2 weeks ago

As a quick workaround to get monitoring back to OK, I had to add --filter=fan,1 --filter=fan,2, but with these filters I lost the fantray check:

OK: All 7 components are ok [1/1 cards, 3/3 fans, 2/2 psus, 1/1 temperatures]. | 'TH0xxxxEHDNxxxx#hardware.temperature.celsius'=55C;;;; 'hardware.card.count'=1;;;; 'hardware.fan.count'=3;;;; 'hardware.psu.count'=2;;;; 'hardware.temperature.count'=1;;;;

lucie-dubrunfaut commented 2 weeks ago

Hello :)

Can you provide us with the output of the check run with the --debug option? And ideally an SNMP walk of your equipment (anonymized), which would allow us to reproduce the situation internally and work on it?

jeremyit commented 2 weeks ago

Hello Lucie :) Thanks for the reply.

Here is the --debug output first; the snmpwalk output is attached below.

bash-4.2$ ./centreon_plugins.pl --plugin=network::dell::os10::snmp::plugin --mode=hardware --hostname=x.X.X.X --snmp-community=frr --debug
UNKNOWN: fan '1' status is 'unknown' - fan '2' status is 'unknown' | 'THxxxD0DEHDN45xxx123#hardware.temperature.celsius'=55C;;;; 'hardware.card.count'=1;;;; 'hardware.fan.count'=5;;;; 'hardware.fantray.count'=1;;;; 'hardware.psu.count'=2;;;; 'hardware.temperature.count'=1;;;;
.1.3.6.1.2.1.1.1.0 = Dell SmartFabric OS10 Enterprise. Copyright (c) 1999-2024 by Dell Inc. All Rights Reserved. System Description: OS10 Enterprise. OS Version: 10.5.6.5. System Type: S4112F-ON
os version: 10.5.6.5
.1.3.6.1.4.1.674.11000.5000.100.4.1.1.3.1.5.1 = THxxxD0DEHDN45xxx123
.1.3.6.1.4.1.674.11000.5000.100.4.1.1.3.1.11.1 = 55
.1.3.6.1.4.1.674.11000.5000.100.4.1.1.4.1.3.1.1 = S4112F-ON 12x10GbE, 3x100GbE Interface Module
.1.3.6.1.4.1.674.11000.5000.100.4.1.1.4.1.4.1.1 = 1
.1.3.6.1.4.1.674.11000.5000.100.4.1.2.1.1.4.1 = 1
.1.3.6.1.4.1.674.11000.5000.100.4.1.2.1.1.4.2 = 1
.1.3.6.1.4.1.674.11000.5000.100.4.1.2.2.1.4.1 = 1
.1.3.6.1.4.1.674.11000.5000.100.4.1.2.3.1.7.1 = 4
.1.3.6.1.4.1.674.11000.5000.100.4.1.2.3.1.7.2 = 4
.1.3.6.1.4.1.674.11000.5000.100.4.1.2.3.1.7.3 = 1
.1.3.6.1.4.1.674.11000.5000.100.4.1.2.3.1.7.4 = 1
.1.3.6.1.4.1.674.11000.5000.100.4.1.2.3.1.7.5 = 1
checking cards
card 'THxxxD0DEHDN45xxx123:S4112F-ON 12x10GbE, 3x100GbE Interface Module' status is 'ready' [instance: 1.1].
checking temperatures
chassis temperature 'THxxxD0DEHDN45xxx123' is 55 degree centigrade [instance = 1]
checking fans
fan '1' status is 'unknown' [instance: 1].
fan '2' status is 'unknown' [instance: 2].
fan '3' status is 'up' [instance: 3].
fan '4' status is 'up' [instance: 4].
fan '5' status is 'up' [instance: 5].
checking fantray
fantray '1' status is 'up' [instance: 1].
checking power supplies
power supply '1' status is 'up' [instance: 1].
power supply '2' status is 'up' [instance: 2].

snmpwalkOS10.txt
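
For readers following along, the --debug output above can be cross-checked with a short script. The sketch below is not part of the plugin; it is a minimal illustration in Python that reads the raw SNMP lines from the debug output and lists which fan instances report something other than 'up'. It assumes that the OID prefix ending in .2.3.1.7 is the per-fan status column and that the values 1 and 4 mean 'up' and 'unknown'. Both assumptions are inferred only from how the plugin labels instances 1 to 5 in this debug output, not from the MIB. The names FAN_STATUS_PREFIX, STATUS_LABELS and fan_statuses are hypothetical.

```python
# Assumption: this prefix is the per-fan status column. It is inferred from the
# --debug output above, where instances 1..5 under it line up with fans 1..5.
FAN_STATUS_PREFIX = ".1.3.6.1.4.1.674.11000.5000.100.4.1.2.3.1.7."

# Assumption: only the two values seen in the debug output are mapped.
# Any other value is kept as its raw number.
STATUS_LABELS = {1: "up", 4: "unknown"}

# Raw SNMP lines copied from the --debug output of the reverse-airflow switch.
DEBUG_LINES = """\
.1.3.6.1.4.1.674.11000.5000.100.4.1.2.2.1.4.1 = 1
.1.3.6.1.4.1.674.11000.5000.100.4.1.2.3.1.7.1 = 4
.1.3.6.1.4.1.674.11000.5000.100.4.1.2.3.1.7.2 = 4
.1.3.6.1.4.1.674.11000.5000.100.4.1.2.3.1.7.3 = 1
.1.3.6.1.4.1.674.11000.5000.100.4.1.2.3.1.7.4 = 1
.1.3.6.1.4.1.674.11000.5000.100.4.1.2.3.1.7.5 = 1
"""


def fan_statuses(debug_text):
    """Return {instance: label} for every fan status row found in debug_text."""
    result = {}
    for line in debug_text.splitlines():
        oid, sep, value = line.partition(" = ")
        # Skip lines that are not "OID = value" or belong to another table.
        if not sep or not oid.startswith(FAN_STATUS_PREFIX):
            continue
        instance = oid[len(FAN_STATUS_PREFIX):]
        code = int(value.strip())
        result[instance] = STATUS_LABELS.get(code, str(code))
    return result


statuses = fan_statuses(DEBUG_LINES)
not_up = sorted(i for i, label in statuses.items() if label != "up")

print(statuses)
print("instances not reporting 'up':", not_up)
```

On this data the script reports five fan rows, with instances 1 and 2 not 'up'. That matches hardware.fan.count=5 in the plugin output, while show system on the same switch lists only three fans.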