slauger / check_netscaler

A Nagios Plugin written in Perl for the Citrix ADC (formerly Citrix NetScaler). It uses the NetScaler NITRO API.

Wrong CPUusagepcnt in 13.0 #87

Closed GalipoliX closed 2 years ago

GalipoliX commented 3 years ago

After upgrading to version 13.0 Build 82.45, the value of cpuusagepcnt seems to be wrong.

Output of Plugin: NetScaler CRITICAL - above: system.rescpuusagepcnt is above threshold (current: 4294967295, critical: 90); system.mgmtcpuusagepcnt: 0.2 | 'system.rescpuusagepcnt'=4294967295;80;90 'system.mgmtcpuusagepcnt'=0.2;80;90

Any ideas?

slauger commented 3 years ago

Hi @GalipoliX ,

First of all, are you sure that you want to monitor rescpuusagepcnt? It represents the average CPU usage.

[Screenshot, 2021-07-29 21:59:09]

I have never used rescpuusagepcnt, but I see the same behaviour on my VPX, so this may be a bug in the current firmware.

./check_netscaler.pl -H 10.0.0.240 -p supersecurepassword -C above -o system -n cpuusagepcnt,mgmtcpuusagepcnt -w 75 -c 80 -vvv | grep cpu
[extra-opts] check_netscaler -H 10.0.0.240 -p supersecurepassword -C above -o system -n cpuusagepcnt,mgmtcpuusagepcnt -w 75 -c 80 -vvv
$VAR1 = '{ "errorcode": 0, "message": "Done", "severity": "NONE", "system": { "voltagev12n": 0.000000, "voltagev5n": 0.000000, "cpuusage": "1", "rescpuusage": "4294967295", "slavecpuusage": "4294967295", "mastercpuusage": "4294967295", "auxvolt7": 0.000000, "auxvolt6": 0.000000, "auxvolt5": 0.000000, "auxvolt4": 0.000000, "auxvolt3": 0.000000, "auxvolt2": 0.000000, "auxvolt1": 0.000000, "auxvolt0": 0.000000, "voltagevsen2": 0.000000, "voltagev5sb": 0.000000, "voltagevtt": 0.000000, "voltagevbat": 0.000000, "voltagev12p": 0.000000, "voltagev5p": 0.000000, "voltagev33stby": 0.000000, "voltagev33main": 0.000000, "voltagevcc1": 0.000000, "voltagevcc0": 0.000000, "numcpus": "1", "memusagepcnt": 31.591588, "memuseinmb": "342", "addimgmtcpuusagepcnt": 0.000000, "mgmtcpu0usagepcnt": 0.700000, "mgmtcpuusagepcnt": 0.700000, "pktcpuusagepcnt": 1.200000, "cpuusagepcnt": 1.200000, "rescpuusagepcnt": 4294967295.000000, "starttimelocal": "Thu Jul 29 21:46:31 2021", "starttime": "Thu Jul 29 19:46:31 2021", "disk0perusage": 50, "disk1perusage": 43, "cpufan0speed": 0, "cpufan1speed": 0, "systemfanspeed": 0, "fan0speed": 0, "fanspeed": 0, "cpu0temp": 0, "cpu1temp": 0, "internaltemp": 0, "powersupply1status": "NOT SUPPORTED", "powersupply2status": "NOT SUPPORTED", "powersupply3status": "NOT SUPPORTED", "powersupply4status": "NOT SUPPORTED", "disk0size": 1585, "disk0used": 743, "disk0avail": 715, "disk1size": 14179, "disk1used": 5696, "disk1avail": 7348, "fan2speed": 0, "fan3speed": 0, "fan4speed": 0, "fan5speed": 0, "auxtemp0": 0, "auxtemp1": 0, "auxtemp2": 0, "auxtemp3": 0, "timesincestart": "00:00:00", "memsizemb": "0" } }';
                        'rescpuusage' => '4294967295',
                        'cpufan1speed' => 0,
                        'numcpus' => '1',
                        'slavecpuusage' => '4294967295',
                        'addimgmtcpuusagepcnt' => '0',
                        'mastercpuusage' => '4294967295',
                        'cpu0temp' => 0,
                        'pktcpuusagepcnt' => '1.2',
                        'cpuusagepcnt' => '1.2',
                        'cpufan0speed' => 0,
                        'cpu1temp' => 0,
                        'mgmtcpu0usagepcnt' => '0.7',
                        'cpuusage' => '1',
                        'rescpuusagepcnt' => '4294967295',
                        'mgmtcpuusagepcnt' => '0.7'
NetScaler OK - above: system.cpuusagepcnt: 1.2; system.mgmtcpuusagepcnt: 0.7 | 'system.cpuusagepcnt'=1.2;75;80 'system.mgmtcpuusagepcnt'=0.7;75;80
> sh version
    NetScaler NS13.0: Build 82.45.nc, Date: Jul 16 2021, 09:58:49   (64-bit)
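
The value 4294967295 in the output above equals 2^32 - 1, the largest unsigned 32-bit integer, which is also what -1 looks like when read as an unsigned 32-bit number. That suggests (but does not prove) that the firmware is returning an "unset" placeholder rather than a measurement. Below is a minimal Python sketch of how a monitoring script could filter such values before alerting. It is not part of check_netscaler; the function name `classify` and the 0 to 100 range check are assumptions made for illustration.

```python
import json

# 2**32 - 1: the bit pattern of -1 read as an unsigned 32-bit integer.
UINT32_MAX = 2**32 - 1


def classify(value):
    """Classify a raw NITRO stat value (hypothetical helper, not plugin code)."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return "not numeric"
    if number == UINT32_MAX:
        return "placeholder"
    if 0.0 <= number <= 100.0:
        return "ok"
    return "out of range"


# Trimmed excerpt of the NITRO response shown above.
response = json.loads(
    '{"system": {"cpuusagepcnt": 1.200000, "mgmtcpuusagepcnt": 0.700000, '
    '"rescpuusagepcnt": 4294967295.000000, "rescpuusage": "4294967295"}}'
)

for name, value in sorted(response["system"].items()):
    print(f"{name}: {value} -> {classify(value)}")
```

With a check like this, rescpuusagepcnt would be reported as a placeholder instead of tripping the critical threshold, while cpuusagepcnt and mgmtcpuusagepcnt pass through unchanged.
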
GalipoliX commented 3 years ago

I opened a support ticket with Citrix. This is the answer: "This is actually a known issue that is still under review. Unfortunately, I cannot share the bug ID as this is for internal use only. Also, I don't have information on a workaround or a date for the next firmware release."

slauger commented 3 years ago

Thanks for sharing!

The last time I filed a bug for the Citrix ADC (a critical bug in the NTLM implementation that caused a crash/service restart!), it took more than 12 months until the fix was available in a new firmware release. Don't expect a quick solution here.

If you want more options for alerting on and analyzing the metrics, I would suggest having a look at the Citrix ADC Exporter for Prometheus. It also uses the NITRO API, but you could then calculate average metrics on your own.

https://github.com/citrix/citrix-adc-metrics-exporter
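
Calculating the average yourself could look roughly like the following. This is a minimal Python sketch that keeps a sliding window of cpuusagepcnt samples and averages them. The class name `RollingAverage`, the window size, and the sample values are illustrative assumptions, and fetching the samples (from NITRO or from Prometheus) is deliberately left out.

```python
from collections import deque


class RollingAverage:
    """Average of the most recent `window` samples (hypothetical helper)."""

    def __init__(self, window):
        if window < 1:
            raise ValueError("window must be at least 1")
        # deque with maxlen drops the oldest sample automatically.
        self.samples = deque(maxlen=window)

    def add(self, value):
        """Record one sample and return the current windowed average."""
        self.samples.append(float(value))
        return sum(self.samples) / len(self.samples)


# Made-up cpuusagepcnt readings, one per polling interval.
average = RollingAverage(window=3)
for reading in (1.2, 1.8, 3.0, 6.0):
    current = average.add(reading)
    print(f"sample={reading} rolling_avg={current:.2f}")
```

When the metrics are already in Prometheus, the same idea is normally expressed in a query with the PromQL function `avg_over_time` rather than in client code.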