bluecmd / fortigate_exporter

Prometheus exporter for Fortigate firewalls
GNU General Public License v3.0

httpsd memory leak in FortiOS 6.2.7 #62

Closed bluecmd closed 3 years ago

bluecmd commented 3 years ago

My home Fortigate, a -61F running v6.2.7 build1190 (GA), enters conserve mode after ~12 hours of polling at 15s intervals. I am not observing this on v6.4.4 build1803 (GA) on a Fortigate-VM server.

fortigate (global) # get sys perf stat
[ ... ]
Memory: 1964908k total, 1140648k used (58.1%), 616532k free (31.4%), 207728k freeable (10.5%)
fortigate (global) # diagnose sys top
Run Time:  0 days, 6 hours and 40 minutes
0U, 0N, 0S, 100I, 0WA, 0HI, 0SI, 0ST; 1918T, 602F
          httpsd      250      S       0.9     6.0
          httpsd      259      S       0.9     5.9
          httpsd      262      S       0.0     6.3
          httpsd      246      S       0.0     6.3
         cmdbsvr      150      S       0.0     2.1
         reportd      196      S       0.0     2.0
       ipshelper      213      S <     0.0     1.8
          cw_acd      218      S       0.0     1.7
         miglogd      177      S       0.0     1.5
         miglogd      242      S       0.0     1.4
         miglogd      240      S       0.0     1.4
         miglogd      239      S       0.0     1.4
[ ... ]
fortigate (global) # diag hard sysinfo memory
MemTotal:        1964908 kB
MemFree:          615768 kB
Buffers:           32632 kB
Cached:           276476 kB
SwapCached:            0 kB
Active:           723672 kB
Inactive:         133808 kB
Active(anon):     665168 kB
Inactive(anon):    22048 kB
Active(file):      58504 kB
Inactive(file):   111760 kB
Unevictable:       89536 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:               380 kB
Writeback:             0 kB
AnonPages:        637980 kB
Mapped:            72888 kB
Shmem:             49292 kB
Slab:             195440 kB
SReclaimable:      17536 kB
SUnreclaim:       177904 kB
KernelStack:        2464 kB
PageTables:        16228 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:      982452 kB
Committed_AS:   24775892 kB
VmallocTotal:   260046784 kB
VmallocUsed:       84992 kB
VmallocChunk:   259953168 kB

I then restart httpsd like this: fnsysctl killall httpsd. The result is:

fortigate (global) # get sys perf stat
[ ... ]
Memory: 1964908k total, 776172k used (39.5%), 979792k free (49.9%), 208944k freeable (10.6%)
fortigate (global) # diagnose sys top
Run Time:  0 days, 6 hours and 49 minutes
0U, 0N, 0S, 100I, 0WA, 0HI, 0SI, 0ST; 1918T, 955F
         miglogd      240      S       0.4     1.4
        dnsproxy      217      S       0.4     0.9
         cmdbsvr      150      S       0.0     2.1
         reportd      196      S       0.0     1.9
       ipshelper      213      S <     0.0     1.8
          cw_acd      218      S       0.0     1.7
          httpsd     2235      S       0.0     1.5
         miglogd      177      S       0.0     1.5
         miglogd      242      S       0.0     1.4
         miglogd      243      S       0.0     1.4
         miglogd      239      S       0.0     1.4
          httpsd     2237      S       0.0     1.4
          httpsd     2236      S       0.0     1.3
          httpsd     2234      S       0.0     1.2
[ ... ]
fortigate (global) # diag hard sysinfo memory
MemTotal:        1964908 kB
MemFree:          980336 kB
Buffers:           32972 kB
Cached:           277820 kB
SwapCached:            0 kB
Active:           358048 kB
Inactive:         134784 kB
Active(anon):     299280 kB
Inactive(anon):    21972 kB
Active(file):      58768 kB
Inactive(file):   112812 kB
Unevictable:       89536 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:               488 kB
Writeback:             0 kB
AnonPages:        271596 kB
Mapped:            73024 kB
Shmem:             49660 kB
Slab:             196140 kB
SReclaimable:      17604 kB
SUnreclaim:       178536 kB
KernelStack:        2496 kB
PageTables:        15772 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:      982452 kB
Committed_AS:   25006324 kB
VmallocTotal:   260046784 kB
VmallocUsed:       84992 kB
VmallocChunk:   259949872 kB
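
To make the before/after comparison easier to read, here is a minimal Python sketch that diffs a few fields from the two `diag hard sysinfo memory` dumps above. The values are copied from the dumps; the function name `parse_meminfo` is made up for illustration. The per-poll figure at the end is only a rough estimate and assumes the exporter had been polling every 15s for the full 6h40m of uptime shown by `diagnose sys top`, which the thread does not confirm.

```python
import re

def parse_meminfo(text):
    """Parse 'Name: value kB' lines into a dict of integers (kB)."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"\s*([\w()]+):\s+(\d+)\s*kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields

# Values copied from the dump taken before restarting httpsd.
before = parse_meminfo("""
MemFree:          615768 kB
Active(anon):     665168 kB
AnonPages:        637980 kB
Slab:             195440 kB
""")

# Values copied from the dump taken after 'fnsysctl killall httpsd'.
after = parse_meminfo("""
MemFree:          980336 kB
Active(anon):     299280 kB
AnonPages:        271596 kB
Slab:             196140 kB
""")

delta = {name: after[name] - before[name] for name in before}
for name, change in delta.items():
    print(f"{name:14s} {change:+d} kB")

# ASSUMPTION: polling ran every 15 s for the whole 6 h 40 min of uptime.
uptime_seconds = 6 * 3600 + 40 * 60
polls = uptime_seconds // 15
kb_per_poll = -delta["AnonPages"] / polls
print(f"~{kb_per_poll:.0f} kB of anonymous memory retained per poll")
```

Anonymous memory drops by roughly the same amount that MemFree rises, while Slab barely moves, which is consistent with the growth living in the httpsd processes rather than in the kernel.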

The httpsd log at debug level -1 is attached, but I cannot spot anything odd in it. httpsd.log

Given that I am not observing this leak on my Fortigate-VM instance, I am thinking this memory leak is probably one of the following:

I should be able to upgrade to 6.4.4 soon enough and give that a shot on my 61F. I might also be able to spin up a Fortigate-VM 6.2.7 and see if that has the same behavior.

I have not reported this issue to Fortinet as of yet.

Possibly interesting: the memory usage is reported as belonging to the main VDOM, which is my traffic-forwarding VDOM, not the management one (root).

image

bluecmd commented 3 years ago

For now I have configured an hourly restart of httpsd to contain the issue on this particular Fortigate:

config system automation-action
  edit "restart httpsd for mem leaks"
    set action-type cli-script
    set required enable
    set script "fnsysctl killall httpsd"
  next
end

config system automation-stitch
  edit "restart httpsd for mem leaks"
    set trigger "restart httpsd for mem leaks"
    set action "restart httpsd for mem leaks"
  next
end

config system automation-trigger
  edit "restart httpsd for mem leaks"
    set trigger-type scheduled
    set trigger-frequency hourly
    set trigger-minute 13
  next
end

secustor commented 3 years ago

From my point of view this can only be a Fortigate issue.
So I'm not sure about the bug label. Maybe a device or environment label would be a better fit here. 🤔

bluecmd commented 3 years ago

I created a Fortigate-issue label for these things.

bluecmd commented 3 years ago

I have verified this behavior on Fortigate-VM 6.2.7 as well.

I ran: while true; do curl 'localhost:9710/probe?target=https://a-fortivm' > /dev/null; done
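
For anyone who wants to reproduce this with a bounded number of requests instead of an endless shell loop, here is a rough Python equivalent. It is only a sketch: `probe_url`, `http_fetch`, and `hammer` are names invented for this example, and the exporter address and target are the ones from the curl command above.

```python
import urllib.parse
import urllib.request

def probe_url(exporter, target):
    """Build the exporter's /probe URL for a given Fortigate target."""
    query = urllib.parse.urlencode({"target": target})
    return f"http://{exporter}/probe?{query}"

def http_fetch(url, timeout=30):
    """Fetch one probe response and return (status, body length)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status, len(resp.read())

def hammer(url, rounds, fetch=http_fetch):
    """Request the URL back to back `rounds` times, recording failures."""
    results = []
    for _ in range(rounds):
        try:
            results.append(fetch(url))
        except OSError as exc:  # URLError and socket timeouts are OSErrors
            results.append((None, str(exc)))
    return results
```

Calling `hammer(probe_url("localhost:9710", "https://a-fortivm"), rounds=1000)` sends the same back-to-back probes as the loop above, but stops after 1000 requests and keeps a record of which ones failed.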

image

It seems FortiOS was able to reclaim some memory at times but in the end it stopped responding to HTTPS requests. On the console I see this: image

On the events view I see this: image

I am not able to log in on the local console.

I will repeat the experiment using 6.4.5 on Fortigate-VM.

bluecmd commented 3 years ago

Using 6.4.5 this issue seems to have disappeared:

image

bluecmd commented 3 years ago

Upgraded my -61F from 6.2.7 to 6.4.5 and disabled the workaround. Memory usage has been stable for the entire day.

I suggest we add a Known Issues section to the README that recommends against running the exporter on versions below 6.4.x unless the workaround is applied.

WDYT @secustor ?

secustor commented 3 years ago

Yeah, that is for sure a good idea.

secustor commented 3 years ago

Closing this, as we have documented this now.

amitkatti commented 3 years ago

Noticed it in Fortigate 100F also. Exclusively using SNMP for now.

image

bluecmd commented 3 years ago

@amitkatti If you have the time, submitting a case to Fortinet would be appreciated. Otherwise you should be fine on 6.4.x or with the workaround posted above.

amitkatti commented 3 years ago

I will open a ticket if we decide not to go with the upgrade.

LucaSchildi commented 1 year ago

Just FYI: we still have the same issue with 7.0.12. It seems to be a problem with API access in general. At least, that's what Fortinet told us in our tickets.