centreon / centreon-plugins

Collection of standard plugins to discover and gather cloud-to-edge metrics and status across your whole IT infrastructure.
https://www.centreon.com
Apache License 2.0
312 stars 275 forks

[hardware::server::hp::ilo::xmlapi::plugin] multiple calls to the "--warning" option in "hardware" mode #5035

Open Aleksey-Maksimov opened 6 months ago

Aleksey-Maksimov commented 6 months ago

Hello

We pass the "--warning" option several times in "hardware" mode of "hardware::server::hp::ilo::xmlapi::plugin". The perfdata is built correctly: the generic threshold ('temperature,.*,5:55') is applied to all sensors, while the sensors that have their own thresholds (01-Inlet Ambient, 27-HD Controller, 28-LOM Card) show those specific thresholds instead.

However, the problem is that the main output of the plugin is not generated correctly. In this example, the plugin output should be "OK" and not "WARNING".
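
To make the expectation concrete: under the usual Nagios-style range semantics, a threshold written as 'low:high' raises an alert only when the value falls outside that interval. Below is a minimal Python sketch of that rule, applied to values taken from this report. The helper names `parse_range` and `alerts` are made up for illustration and are not part of centreon-plugins.

```python
import math


def parse_range(spec):
    """Parse a Nagios-style range such as '5:70', '10', '10:', '~:10' or '@5:70'.

    Returns (low, high, inside), where inside=True means "alert when the
    value is inside the interval" (the '@' prefix).
    """
    inside = spec.startswith("@")
    if inside:
        spec = spec[1:]
    if ":" in spec:
        low_s, high_s = spec.split(":", 1)
    else:
        # A bare number N means the interval 0..N
        low_s, high_s = "0", spec
    low = -math.inf if low_s == "~" else float(low_s or 0)
    high = math.inf if high_s == "" else float(high_s)
    return low, high, inside


def alerts(value, spec):
    """Return True when the value should raise an alert for this range."""
    low, high, inside = parse_range(spec)
    within = low <= value <= high
    return within if inside else not within


# 56 C is outside the generic range but inside the sensor-specific one.
print(alerts(56, "5:55"))  # True
print(alerts(56, "5:70"))  # False
```

With the specific threshold 5:70 in effect for '27-HD Controller' and '28-LOM Card', a reading of 56 should therefore not raise a WARNING.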

# ./centreon_plugins.pl --plugin=hardware::server::hp::ilo::xmlapi::plugin \
--mode=hardware \
--change-perfdata '.*,,eval(%(label) =~ s/#hardware//g)' \
--filter-perfdata-adv 'not (%(label) =~ /\.(count)/)' \
--filter-uom 'null' \
--hostname 'ilo125.holding.com' --password 'passw0rd' --username 'monitor' \
--ssl-opt 'SSL_verify_mode => SSL_VERIFY_NONE' --timeout '60' \
--use-new-perfdata \
--warning 'temperature,.*,5:55' \
--warning 'fan,.*,5:95' \
--warning 'temperature,01-Inlet Ambient,5:37' \
--warning 'temperature,27-HD Controller,5:70' \
--warning 'temperature,28-LOM Card,5:70'

WARNING: Temperature '27-HD Controller' is 56 C - Temperature '28-LOM Card' is 56 C | 
'Fan 1.fan.speed.percentage'=45;5:95;;0; 
'Fan 2.fan.speed.percentage'=45;5:95;;0; 
'Fan 3.fan.speed.percentage'=45;5:95;;0; 
'Fan 4.fan.speed.percentage'=45;5:95;;0; 
'Fan 5.fan.speed.percentage'=45;5:95;;0; 
'Fan 6.fan.speed.percentage'=45;5:95;;0; 
'01-Inlet Ambient.temperature.celsius'=19;5:37;;; 
'02-CPU 1.temperature.celsius'=40;5:55;;; 
'03-CPU 2.temperature.celsius'=40;5:55;;; 
'04-P1 DIMM 1-6.temperature.celsius'=32;5:55;;; 
'05-P1 DIMM 7-12.temperature.celsius'=34;5:55;;; 
'06-P2 DIMM 1-6.temperature.celsius'=29;5:55;;; 
'07-P2 DIMM 7-12.temperature.celsius'=32;5:55;;; 
'08-HD Max.temperature.celsius'=35;5:55;;; 
'10-Chipset.temperature.celsius'=40;5:55;;; 
'11-PS 1 Inlet.temperature.celsius'=23;5:55;;; 
'12-PS 2 Inlet.temperature.celsius'=31;5:55;;; 
'13-VR P1.temperature.celsius'=41;5:55;;; 
'14-VR P2.temperature.celsius'=39;5:55;;; 
'15-VR P1 Mem.temperature.celsius'=30;5:55;;; 
'16-VR P1 Mem.temperature.celsius'=33;5:55;;; 
'17-VR P2 Mem.temperature.celsius'=28;5:55;;; 
'18-VR P2 Mem.temperature.celsius'=31;5:55;;; 
'19-PS 1 Internal.temperature.celsius'=40;5:55;;; 
'20-PS 2 Internal.temperature.celsius'=40;5:55;;; 
'27-HD Controller.temperature.celsius'=56;5:70;;; 
'28-LOM Card.temperature.celsius'=56;5:70;;; 
'29-LOM.temperature.celsius'=47;5:55;;; 
'30-Front Ambient.temperature.celsius'=28;5:55;;; 
'31-PCI 1 Zone..temperature.celsius'=32;5:55;;; 
'32-PCI 2 Zone..temperature.celsius'=32;5:55;;; 
'33-PCI 3 Zone..temperature.celsius'=32;5:55;;; 
'37-HD Cntlr Zone.temperature.celsius'=37;5:55;;; 
'38-I/O Zone.temperature.celsius'=33;5:55;;; 
'39-P/S 2 Zone.temperature.celsius'=33;5:55;;; 
'40-Battery Zone.temperature.celsius'=32;5:55;;; 
'41-iLO Zone.temperature.celsius'=36;5:55;;; 
'43-Storage Batt.temperature.celsius'=24;5:55;;; 
'44-Fuse.temperature.celsius'=31;5:55;;;

lucie-dubrunfaut commented 4 months ago

Hello :)

I think there is a conflict because you are defining both a generic threshold and specific thresholds. It seems the plugin does not exclude sensors that have a specific threshold from the generic one when both are used in parallel. You can either use a generic threshold for all your temperatures (--warning 'temperature,.*,5:55') or use specific thresholds (using regex), but not both at the same time. In any case, combining them does not seem to fit the philosophy with which the plugin is currently built.
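
The suspected conflict can be illustrated with a small Python sketch. It is purely hypothetical and is not taken from the plugin's source: it only contrasts two possible ways of resolving overlapping rules, using the '--warning' values from the command in this issue. The function names and the "last matching rule wins" policy are assumptions made for illustration.

```python
import re

# (section, instance regex, range) in the order given on the command line
RULES = [
    ("temperature", ".*", "5:55"),
    ("fan", ".*", "5:95"),
    ("temperature", "01-Inlet Ambient", "5:37"),
    ("temperature", "27-HD Controller", "5:70"),
    ("temperature", "28-LOM Card", "5:70"),
]


def outside(value, spec):
    """True when value is outside a plain 'low:high' range."""
    low, high = (float(x) for x in spec.split(":"))
    return not (low <= value <= high)


def matching(section, instance):
    """All ranges whose section and instance regex match this sensor."""
    return [spec for sec, pattern, spec in RULES
            if sec == section and re.search(pattern, instance)]


def status_any_rule(section, instance, value):
    """Hypothetical policy A: every matching rule is checked."""
    specs = matching(section, instance)
    return "WARNING" if any(outside(value, s) for s in specs) else "OK"


def status_last_rule(section, instance, value):
    """Hypothetical policy B: only the last matching rule is checked."""
    specs = matching(section, instance)
    return "WARNING" if specs and outside(value, specs[-1]) else "OK"


print(status_any_rule("temperature", "27-HD Controller", 56))   # WARNING
print(status_last_rule("temperature", "27-HD Controller", 56))  # OK
```

Under policy A the generic '.*' rule still applies to '27-HD Controller', which would produce the WARNING reported here. Under policy B the result is OK and agrees with the 5:70 threshold shown in the perfdata. Which policy the plugin actually follows for its status is not established in this thread.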

Aleksey-Maksimov commented 4 months ago

Hello.

Talking about philosophy here is nothing more than words. Please read my original message carefully. Note that the thresholds in perfdata are generated correctly when both general and specific thresholds are used, so the plugin understands this combination and parses the thresholds correctly. But there is a logical error in how the plugin's main return code is computed (WARNING instead of OK).

Please look at this again carefully and understand that the plugin's main return code contradicts the perfdata.

WARNING: Temperature '27-HD Controller' is 56 C - Temperature '28-LOM Card' is 56 C |
...
'27-HD Controller.temperature.celsius'=56;5:70;;;
'28-LOM Card.temperature.celsius'=56;5:70;;;
...
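
The contradiction can also be checked mechanically. The following Python sketch parses the two perfdata entries quoted above and lists every metric whose value violates its own warning range. It assumes plain 'low:high' warning ranges and the standard `'label'=value[UOM];warn;crit;min;max` perfdata layout; the name `violations` is hypothetical.

```python
import re

# Status line and the two perfdata entries quoted above, joined on one line
OUTPUT = (
    "WARNING: Temperature '27-HD Controller' is 56 C - "
    "Temperature '28-LOM Card' is 56 C | "
    "'27-HD Controller.temperature.celsius'=56;5:70;;; "
    "'28-LOM Card.temperature.celsius'=56;5:70;;;"
)

# 'label'=value[UOM];warn  (crit/min/max are ignored in this sketch)
PERF = re.compile(r"'([^']+)'=([-\d.]+)[^;\s]*;([^;\s]*)")


def violations(output):
    """Return (reported status, labels whose value is outside their warn range)."""
    status_part, _, perfdata = output.partition("|")
    found = []
    for label, value, warn in PERF.findall(perfdata):
        if not warn:
            continue  # no warning threshold attached to this metric
        low, high = (float(x) for x in warn.split(":"))
        if not low <= float(value) <= high:
            found.append(label)
    return status_part.split(":")[0].strip(), found


print(violations(OUTPUT))  # ('WARNING', [])
```

The reported status is WARNING, yet no metric is outside the warning range printed next to it.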

lucie-dubrunfaut commented 4 months ago

I used the term philosophy to describe how the plugin's logic is built. The plugin does not define thresholds in a standard way, which is why I cannot determine at first glance why there is a mismatch between the perfdata and the short output. I'm afraid we need some data to be able to troubleshoot this issue. Can you provide the curl command and the associated JSON output to help us troubleshoot it?

Aleksey-Maksimov commented 4 months ago

Yes, sure. Tell me exactly what is required of me.

lucie-dubrunfaut commented 4 months ago

Hello :)

Ideally, to work on this we need the curl commands and the JSON responses from the API calls involved in your issue. This would allow us to reproduce your environment and make it easier to understand and resolve the issue.

Aleksey-Maksimov commented 4 months ago

I ran the command in debug mode and saved the command output to a file. You can see requests and responses there.

./centreon_plugins.pl --plugin=hardware::server::hp::ilo::xmlapi::plugin --mode=hardware --change-perfdata '.*,,eval(%(label) =~ s/#hardware//g)' --filter-perfdata-adv 'not (%(label) =~ /\.(count)/)' --filter-uom 'null' --hostname 'in-ilo002.holding.com' --password 'pwd' --username 'monitor' --ssl-opt 'SSL_verify_mode => SSL_VERIFY_NONE' --timeout '60' --use-new-perfdata --warning 'temperature,.*,5:55' --warning 'fan,.*,5:95' --warning 'temperature,01-Inlet Ambient,5:37' --warning 'temperature,27-HD Controller,5:70' --warning 'temperature,28-LOM Card,5:70' --verbose --debug >> /tmp/HPE-ProLiant-DL380-Gen9-In-VM02.txt

You can download the file from the link: https://disk.yandex.ru/d/5zIZyIfbJHsHYw

Please let me know after you download the file. I'll delete it.

lucie-dubrunfaut commented 4 months ago

Good (and thank you), you can delete the link; the data are now linked to the internal dev ticket in our backlog.