riskersen / Monitoring

Monitoring plugins which are Nagios/Icinga compatible

Wrong memory usage on Fortigate 900D, 60F, 60E #59

Closed dkolasinski closed 2 years ago

dkolasinski commented 3 years ago

The check reports wrong memory usage on Fortigate 900D, 60F, and 60E (6.4.5). It does not match the memory usage shown by the dashboard widget.

I've checked the SNMP OIDs, and it looks like .1.3.6.1.4.1.12356.101.13.2.1.1.4 is being used. This OID is described as "Memory usage of the specified cluster member (percentage)" and it returns multiple values. But none of them match the "usage" shown by the dashboard widget.

So I've started to dig for details:

root@x:/etc/icinga2/bin# snmpwalk [censored] .1.3.6.1.4.1.12356.101.13.2.1.1.4
iso.3.6.1.4.1.12356.101.13.2.1.1.4.1 = Gauge32: 56 /<- master/
iso.3.6.1.4.1.12356.101.13.2.1.1.4.2 = Gauge32: 28

System output: Memory: 16449968k total, 5812716k used (35.3%), 7223252k free (43.9%), 3414000k freeable (20.8%)

So it looks like .1.3.6.1.4.1.12356.101.13.2.1.1.4 returns the total used memory including freeable memory (system buffers/cache, I guess): 35 + 21 = 56.
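The arithmetic can be checked directly against the raw counters. This is a minimal Python sketch, assuming the `Memory:` line format quoted above; the regular expression and the function name `memory_percentages` are illustrative and not part of the plugin.

```python
import re

# The "Memory:" line quoted above from the FortiGate system output.
LINE = ("Memory: 16449968k total, 5812716k used (35.3%), "
        "7223252k free (43.9%), 3414000k freeable (20.8%)")

def memory_percentages(line):
    """Return (used_pct, freeable_pct) computed from the raw kB counters."""
    fields = {
        name: int(value)
        for value, name in re.findall(r"(\d+)k (total|used|free|freeable)\b", line)
    }
    total = fields["total"]
    return 100 * fields["used"] / total, 100 * fields["freeable"] / total

used_pct, freeable_pct = memory_percentages(LINE)
print(f"used={used_pct:.1f}% freeable={freeable_pct:.1f}% "
      f"used+freeable={used_pct + freeable_pct:.1f}%")
```

With these counters, used plus freeable rounds to 56, the value the HA OID reports for the master, while used alone rounds to 35, the value of the system OID.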

Further, I found OID 12356.101.4.1.4.0, which returns the correct value. But it does not work per cluster member.

root@x:/etc/icinga2/bin# snmpwalk [censored] 10.10.0.1 1.3.6.1.4.1.12356.101.4.1.4.0
iso.3.6.1.4.1.12356.101.4.1.4.0 = Gauge32: 35

Maybe we should add a second check for a single unit? I'm using this check to avoid hitting "conserve mode", so I'm only interested in non-freeable memory usage.

What do you think? Please verify.

riskersen commented 3 years ago

I see, so the oid returns used memory, but includes freeable memory?

I would not break existing checks and thresholds, so your idea of adding a new check option is pretty good.

Would you mind adding something like "mem-non-freeable"? I currently don't have access to Fortigates.

VincentROLLAND commented 3 years ago

Hi, first of all, great job on this check. I noticed the same mismatch between the GUI memory usage and the SNMP value on a Fortigate 301E (A/P cluster, 6.0.6, upgrade is planned). The GUI shows 52% memory usage and the check returns 64%.

snmpwalk [options] .1.3.6.1.4.1.12356.101.13.2.1.1.4
SNMPv2-SMI::enterprises.12356.101.13.2.1.1.4.1 = Gauge32: 64
SNMPv2-SMI::enterprises.12356.101.13.2.1.1.4.2 = Gauge32: 45

According to this KB article, the value returned by the following OID is the same as the one in the GUI:

https://kb.fortinet.com/kb/viewContent.do?externalId=FD30084

snmpwalk [options] .1.3.6.1.4.1.12356.101.4.1.4.0
SNMPv2-SMI::enterprises.12356.101.4.1.4.0 = Gauge32: 54

ReneRiech commented 2 years ago

Hi,

I have the same problem: our dashboard shows a lower memory value than the check gives us. My Perl knowledge is pretty bad, so my first try was just to copy line 180:

my $oid_mem = ".1.3.6.1.4.1.12356.101.13.2.1.1.4"; # Location of cluster member Mem (%)

as

my $oid_mem = ".1.3.6.1.4.1.12356.101.4.1.4.0"; # Location of cluster member Mem (%)

and comment out the original line 180. But when I try my check:

-bash-4.2$ ./check_fortigate_new.pl '-C' 'secret' '-H' 'secret' '-T' 'mem'
UNKNOWN: OID .1.3.6.1.4.1.12356.101.4.1.4.0.1 does not exist

I don't understand where the .1 comes from. Is it added somewhere later in the script?

edit: Found where the .1 is added, later in the "sub get_health_value" section.

A snmpwalk is okay, though, with .1.3.6.1.4.1.12356.101.4.1.4.0:

-bash-4.2$ snmpwalk -v2c -c 'secret' 'secret' 1.3.6.1.4.1.12356.101.4.1.4.0
SNMPv2-SMI::enterprises.12356.101.4.1.4.0 = Gauge32: 71

so... it should work in the check_fortigate.pl script too, right?

Any ideas?

Greetings,

René

yaiqsa commented 2 years ago

Hi, I came across the same issue and wanted to fix it for myself as well. I, too, have no Perl experience whatsoever, but managed to get it to work in an 'it works for me' kind of fashion. So please don't be mad at me if this breaks all kinds of other stuff, and feel free to correct me 😅

What I did:

For me this works as intended, but I realize that this is pretty ugly and unmaintainable, and might break the script for other fortigate versions.

ReneRiech commented 2 years ago

Okay, I've created a little patch for this and made a pull request :) But I think it needs a thorough review.

VincentROLLAND commented 2 years ago

Hi, thanks for your work. Personally, for my Fortigate (301E) the serial starts with FG3H1E, so the OID fork doesn't work in my environment. I would suggest making this an option switch.

sgruber94 commented 2 years ago

What would you like the option switch to look like? Which versions are affected by this OID?

Maybe we have several firewall types and can try to handle them in one function.

ReneRiech commented 2 years ago

Hi,

either an option switch, or another when-block, starting after line #318. Something like:

} when ( /^FG3H1E/ ) { # FG3H1E
    given ( lc($type) ) {
        when ("mem") { ($return_state, $return_string) = get_healthvalue($oid???_mem, "Memory", "%"); }
        when ("cpu") { ($return_state, $return_string) = get_healthvalue($oid???_cpu, "CPU", "%"); }
        when ("ses") { ($return_state, $return_string) = get_healthvalue($oid???_ses, "Session", ""); }

and more lines with the right OID numbers. How would your option switch work?

Greetings,

René

VincentROLLAND commented 2 years ago

I suggest an option switch in the script, like -mem-legacy or -mem-ha or -mem-new..., if you don't want to break backward compatibility. With the serial number it will be difficult to be exhaustive, and I don't think the behavior is linked to the serial number. The Fortinet KB gives the OID .1.3.6.1.4.1.12356.101.4.1.4.0 for memory, which returns the same value as the GUI on my installation (Fortigate 301E (A/P cluster, 6.0.6)).
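To make the idea concrete, here is a rough sketch of such a switch, written in Python rather than the plugin's Perl. Only the two OIDs come from this thread; the `CHECKS` table, the `select_check` function, and the type name `mem-new` (borrowed from the suggestion above) are hypothetical and do not describe the plugin's actual options.

```python
# Hypothetical mapping from check type to OID. Only the two OIDs are taken
# from this thread; all names here are illustrative.
CHECKS = {
    # Existing behaviour stays the default so current thresholds keep working.
    "mem": {"oid": ".1.3.6.1.4.1.12356.101.13.2.1.1.4", "source": "HA info"},
    # Opt-in type using the OID given in the Fortinet KB.
    "mem-new": {"oid": ".1.3.6.1.4.1.12356.101.4.1.4.0", "source": "system info"},
}

def select_check(check_type):
    """Look up a check type case-insensitively, failing loudly on unknown types."""
    try:
        return CHECKS[check_type.lower()]
    except KeyError:
        raise ValueError(f"unknown check type: {check_type}") from None

print(select_check("mem")["oid"])
print(select_check("mem-new")["oid"])
```

Keying on an explicit type instead of the serial number means nobody has to maintain a list of model prefixes, and existing service definitions that use `mem` are unaffected.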

sgruber94 commented 2 years ago

Hi all, I checked the OIDs again and noticed a difference between the two: one comes from the HA info and the other from the system info. As a result, I added the following checks: mem-sys / cpu-sys / ses-ipv4 / ses-ipv6, and commented out @ReneRiech's code (thanks for

@VincentROLLAND @ReneRiech I added the changes in PR #72

Have a nice Weekend,

sgruber94 commented 2 years ago

@VincentROLLAND @ReneRiech
Have you already checked and tested the script? I cleaned up the "draft" parts, and it's ready for review / merging.

VincentROLLAND commented 2 years ago

Hi, sorry for the delay. I tried the new script, but I get an error:

./check_fortigate.pl -H A.B.C.D -C [COMUNITY] -p [FORTISERIAL_PATH] -T mem-sys
UNKNOWN: OID .1.3.6.1.4.1.12356.101.4.1.4.0.1 does not exist

The script appends .1 for the master and .2 for the slave, but this OID does not exist with those suffixes.
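The difference between the two OIDs can be illustrated with a small Python sketch (the plugin itself is Perl, and `instance_oid` is a hypothetical helper, not a function from the script). The HA OID is a table column that needs a row index per cluster member, while the system OID already ends in its instance `.0` and should be queried as-is.

```python
HA_MEM = ".1.3.6.1.4.1.12356.101.13.2.1.1.4"  # table column: one row per cluster member
SYS_MEM = ".1.3.6.1.4.1.12356.101.4.1.4.0"    # scalar: already ends in instance .0

def instance_oid(base_oid, member_index=None):
    """Append a cluster member index only when the caller asks for one."""
    if member_index is None:
        return base_oid
    return f"{base_oid}.{member_index}"

print(instance_oid(HA_MEM, 1))   # per-member value for the master
print(instance_oid(SYS_MEM))     # scalar, queried without a suffix
print(instance_oid(SYS_MEM, 1))  # reproduces the OID from the error message
```

Under this reading, a fix would skip the member suffix for the system-info checks, though that would need to be confirmed against the actual script.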