sni / mod_gearman

Distribute Naemon Host/Service Checks & Eventhandler with Gearman Queues. Host/Servicegroups affinity included.
http://www.mod-gearman.org
GNU General Public License v3.0

Memory leaks in 3.0.5 and 3.1.0 (Nagios 4) #147

gzalo closed this issue 2 years ago

gzalo commented 5 years ago

Hi. I found a couple of bugs that seem to leak memory on every host/service check. When many checks run per second, this can leak a couple of GB per day and trigger the OOM killer once the server runs out of memory.

In 3.0.5, in mod_gearman.c at lines 1291, 1308, 1075 and 1092:

#if defined(USENAEMON)
clear_volatile_macros_r(&mac);  
#endif

This line should also be executed on Nagios 4. The module returns NEBERROR_CALLBACKOVERRIDE, so Nagios itself does not deallocate the memory used for the macros.
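To illustrate the ownership problem described above, here is a minimal, self-contained sketch. It is not mod_gearman or Nagios code: every `demo_*` name is hypothetical, `DEMO_CALLBACK_OVERRIDE` only stands in for NEBERROR_CALLBACKOVERRIDE, and the runtime flag `clear_in_module` stands in for the `#if defined(USENAEMON)` guard. The assumption is that once the handler overrides the core, nobody else will release the expanded macros, so the handler has to do it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; none of these names exist in Nagios or mod_gearman. */
#define DEMO_CALLBACK_OVERRIDE 1 /* plays the role of NEBERROR_CALLBACKOVERRIDE */

typedef struct {
    char *host_name;    /* expanded macro values, heap-allocated */
    char *command_line;
} demo_macros;

/* Number of macro strings currently allocated and not yet released. */
static int demo_live_allocations = 0;

static char *demo_dup(const char *s) {
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    assert(copy != NULL);
    memcpy(copy, s, n);
    demo_live_allocations++;
    return copy;
}

static void demo_release(char **p) {
    if (*p != NULL) {
        free(*p);
        *p = NULL;
        demo_live_allocations--;
    }
}

/* Plays the role of clear_volatile_macros_r(&mac). */
static void demo_clear_macros(demo_macros *mac) {
    demo_release(&mac->host_name);
    demo_release(&mac->command_line);
}

/* Module-side handler: expands macros, hands the job off, overrides the core.
 * Because it returns the override code, the core does no cleanup afterwards. */
static int demo_handle_check(demo_macros *mac, int clear_in_module) {
    mac->host_name = demo_dup("web01");
    mac->command_line = demo_dup("check_ping -H web01");
    /* ... the job would be handed to a Gearman queue here ... */
    if (clear_in_module) {
        demo_clear_macros(mac);
    }
    return DEMO_CALLBACK_OVERRIDE;
}
```

With `clear_in_module` set to 0, each call leaves two strings allocated, which matches the per-check growth reported here. With it set to 1, the counter returns to zero after every check.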

In 3.1.0, after the change to use the "INITIATE" events, memory seems to leak in another location. Apparently something allocated by run_async_service_check is not being freed.

[Image: 3.1.0 leak]

I think it must be the check_result *cr variable. I'm not sure who is responsible for freeing it; it might be a bug in Nagios itself.

Many Thanks, Gonzalo

sni commented 5 years ago

Right, we fixed that last year in Naemon: https://github.com/naemon/naemon-core/commit/27e64c9e5704571312005421a8f466851b3b7ffe It probably applies to Nagios 4 as well. For the other leak, I would happily accept pull requests fixing it.

gzalo commented 5 years ago

Indeed, it's a Nagios bug quite similar to the one fixed in Naemon (a missing my_free(cr) in a couple of places). Unfortunately, I don't think there is a way to patch mod_gearman to free that memory itself, since Nagios crashes if you try (there is no way to avoid the pointer dereference). I reported the bug in nagioscore here: https://github.com/NagiosEnterprises/nagioscore/issues/664 You can close this issue.
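For readers following along, here is a rough sketch of why the fix has to live in the core. It is a simulation only: the `demo_*` names and `DEMO_OVERRIDE` are hypothetical, and the control flow is an assumption based on the description above, not a copy of the nagioscore source. The core allocates `cr`, calls the module, still reads `cr` after the callback returns, and only then decides whether to free it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the real Nagios types or return codes. */
#define DEMO_OVERRIDE 1 /* plays the role of NEBERROR_CALLBACKOVERRIDE */

typedef struct {
    int return_code;
} demo_check_result;

typedef int (*demo_callback)(demo_check_result *cr);

static int demo_live_results = 0;
/* Kept reachable only so this demo can inspect and release the unfreed block. */
static demo_check_result *demo_unfreed = NULL;

/* Module callback: takes over the check and tells the core to stop. */
static int demo_override_callback(demo_check_result *cr) {
    cr->return_code = 0;
    return DEMO_OVERRIDE;
}

/* Core-side function: it owns cr from allocation to release. */
static int demo_run_async_check(demo_callback cb, int free_on_override) {
    demo_check_result *cr = calloc(1, sizeof *cr);
    assert(cr != NULL);
    demo_live_results++;

    int rc = cb(cr);

    /* The core reads cr after the callback, so a module that freed it
     * inside the callback would cause a use-after-free right here. */
    int last_code = cr->return_code;
    (void)last_code;

    if (rc == DEMO_OVERRIDE && !free_on_override) {
        demo_unfreed = cr; /* early return without free: the leak */
        return rc;
    }

    free(cr); /* the role played by my_free(cr) */
    demo_live_results--;
    return rc;
}
```

Calling `demo_run_async_check(demo_override_callback, 0)` leaves one result allocated per check, while passing 1 releases it on the override path. The module has no safe point at which to do this itself, because the read of `cr->return_code` happens after its callback has already returned.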