topinet closed this issue 6 years ago
So which NEB modules and versions are you using?
Debian 8 (amd64) naemon-livestatus 1.0.7 mod-gearman-module 3.0.5
I see. I just noticed that the mentioned fix for mod-gearman hasn't made it into a release yet. Could you try the daily mod-gearman package by any chance?
Is mod-gearman-module_3.0.6.20180505_debian8_amd64.deb from the testing repository OK?
Do I also need to update the mod-gearman workers, or is it compatible with workers on 3.0.5?
Yes, and it should be sufficient to replace only the NEB module.
Done, I'll give you feedback about ram usage in a week.
After a week, RAM usage is far better than before, but RAM continues increasing slowly:
Could there still be some memory leaks?
Thanks for coming back to this. Could you by any chance have a look at valgrind's massif tool to see where most of the memory is allocated from?
At the moment, memory usage seems to be stable, using 1.2GB with 700 hosts and 4300+ services
Hmm, same trend here with naemon and mod_gearman from the master branch.
A slow but endless memory growth (about 4 KB every 10 seconds).
Using valgrind's massif tool, I get a weird result. The culprit seems to be around this part of the code:
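To put that rate in perspective, here is a rough extrapolation. It is only a sketch: it assumes the growth stays linear and treats the reported unit as KiB, and the helper name `extrapolate_growth` is made up for this illustration.

```python
def extrapolate_growth(kib_per_interval: float, interval_s: float) -> dict:
    """Scale an observed growth (KiB per interval) to larger time spans.

    Assumes the growth is linear, which a real leak may not be.
    """
    per_second = kib_per_interval / interval_s
    seconds_per_day = 86400
    return {
        "KiB/s": per_second,
        "MiB/day": per_second * seconds_per_day / 1024,
        "GiB/30 days": per_second * seconds_per_day * 30 / 1024**2,
    }


# Observation from this thread: about 4 KiB every 10 seconds.
rates = extrapolate_growth(4, 10)
for unit, value in rates.items():
    print(f"{value:.2f} {unit}")
```

Under these assumptions the growth is in the range of tens of MiB per day, i.e. slow enough to go unnoticed for weeks.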
->15.01% (477,080B) 0x5D6E808: strdup (in /usr/lib64/libc-2.17.so)
| ->14.67% (466,345B) 0x4E61A67: nm_strdup (nm_alloc.c:42)
| | ->07.41% (235,475B) 0x4E8F08A: xrddefault_read_state_information (xrddefault.c:1288)
| | | ->07.41% (235,475B) 0x4E72112: read_initial_state_information (sretention.c:106)
| | |   ->07.41% (235,475B) 0x403240: main (naemon.c:635)
| | ->02.85% (90,660B) 0x4E8F034: xrddefault_read_state_information (xrddefault.c:1285)
| | | ->02.85% (90,660B) 0x4E72112: read_initial_state_information (sretention.c:106)
| | |   ->02.85% (90,660B) 0x403240: main (naemon.c:635)
| | ->01.26% (40,192B) 0x4E8E6FC: xrddefault_read_state_information (xrddefault.c:1282)
| | | ->01.26% (40,192B) 0x4E72112: read_initial_state_information (sretention.c:106)
| | |   ->01.26% (40,192B) 0x403240: main (naemon.c:635)
| | ->03.15% (100,018B) in 9+ places, all below ms_print's threshold (01.00%)
| ->00.34% (10,735B) in 3+ places, all below ms_print's threshold (01.00%)
Any idea ?
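For what it's worth, the byte counts in that snapshot can be cross-checked with a few lines of Python. This is only a sketch: the regular expression and the helper `entry_bytes` are assumptions made for this illustration, not part of valgrind or ms_print, and the lines are copied from the output above.

```python
import re

# Matches the "->NN.NN% (N,NNNB)" prefix of an ms_print tree entry.
ENTRY = re.compile(r"->(\d+\.\d+)% \(([\d,]+)B\)")


def entry_bytes(line: str) -> int:
    """Return the byte count of one ms_print entry line."""
    match = ENTRY.search(line)
    if match is None:
        raise ValueError(f"not an ms_print entry: {line!r}")
    return int(match.group(2).replace(",", ""))


strdup_total = entry_bytes("->15.01% (477,080B) 0x5D6E808: strdup")
nm_strdup_total = entry_bytes("->14.67% (466,345B) 0x4E61A67: nm_strdup (nm_alloc.c:42)")

# The three call sites inside xrddefault_read_state_information.
retention_sites = [
    "->07.41% (235,475B) 0x4E8F08A: xrddefault_read_state_information (xrddefault.c:1288)",
    "->02.85% (90,660B) 0x4E8F034: xrddefault_read_state_information (xrddefault.c:1285)",
    "->01.26% (40,192B) 0x4E8E6FC: xrddefault_read_state_information (xrddefault.c:1282)",
]
retention_bytes = sum(entry_bytes(line) for line in retention_sites)
below_threshold = entry_bytes("->03.15% (100,018B) in 9+ places")

# Children of nm_strdup should add up to its own total.
children_match_parent = retention_bytes + below_threshold == nm_strdup_total
retention_share = retention_bytes / strdup_total

print(f"children sum to parent: {children_match_parent}")
print(f"retention reader share of strdup memory: {retention_share:.1%}")
```

Keep in mind that a single snapshot only shows where live memory was allocated at that moment. Comparing several snapshots over time is what tells whether a given call site actually keeps growing.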
Definitely, memory consumption is stable with gearman-module 3.0.6.
Great news, thanks for the heads-up.
Perfect, I wanted to open a long-overdue issue on my remaining memory consumption problem, and someone else has done exactly that already :-)
Running about 7000 hosts/94000 services on debian 9 stretch.
# dpkg -l | grep -e gearm -e naemon
ii  gearman-job-server  0.33-6  amd64  Job server for the Gearman distributed job queue
ii  gearman-tools       0.33-6  amd64  Tools for the Gearman distributed job queue
ii  libgearman7         0.33-6  amd64  Library providing Gearman client and worker functions
ii  libnaemon           1.0.7   amd64  Library for Naemon - common data files
ii  mod-gearman-module  3.0.5   amd64  Event broker module to distribute service checks.
ii  mod-gearman-tools   3.0.5   amd64  Tools for mod-gearman
ii  naemon-core         1.0.7   amd64  host/service/network monitoring and management system
ii  naemon-livestatus   1.0.7   amd64  contains the Naemon livestatus eventbroker module
Been restarting naemon every day since June. Looks fine now after deploying the 3.0.6 mod_gearman_naemon.o from https://labs.consol.de/repo/testing/debian/dists/stretch/main/binary-amd64/mod-gearman-module_3.0.6.20180505_debian9_amd64.deb as suggested above.
Following up on https://github.com/naemon/naemon-core/issues/200: after upgrading to Naemon 1.0.7, there are still memory leaks.
In the graph, you can see RAM usage before and after the upgrade. About 3 GB of RAM is still being lost every 2 weeks, which is at least better than before.
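For anyone who wants numbers rather than a graph, the resident memory of the naemon process can be sampled from `/proc` on Linux. The following is a minimal sketch under that assumption; the function names are made up for this illustration, and the PID has to be looked up separately.

```python
import re
from pathlib import Path


def parse_vmrss_kib(status_text: str) -> int:
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no VmRSS line found")
    return int(match.group(1))


def read_rss_kib(pid: int) -> int:
    """Read the current resident set size of a process (Linux only)."""
    return parse_vmrss_kib(Path(f"/proc/{pid}/status").read_text())


def growth_kib_per_hour(first_kib: int, second_kib: int, seconds_between: float) -> float:
    """Average growth between two samples, scaled to one hour."""
    return (second_kib - first_kib) * 3600 / seconds_between
```

Calling `read_rss_kib(pid)` twice, some hours apart, and passing both values to `growth_kib_per_hour` gives a rough growth rate that can be compared before and after an upgrade.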