r-lindner closed this issue 1 year ago
I'll have a look.
I'm seeing the same thing. After upgrading to mod_gearman 4.0.4, RAM usage grows until it is exhausted. A restart clears it up, but the growth begins again; rinse and repeat.
RHEL7 server.
it's probably this one: https://github.com/naemon/naemon-core/pull/404
Could you try the latest nightly naemon build to see if that helps? I don't see any leaks anymore, regardless of whether mod-gearman is used or not.
Sorry, I was out of office for some time... the nightly naemon from 2022-12-17 had no problems in the last 2 hours, looks good.
I was wrong, it went OOM again. I had looked at the memory consumption of the wrong server. :-( I am rolling back the module to adc47f0c again.
We have also been affected by this after upgrading mod-gearman to 4.0.3. We experienced some oom kills in our instance and upon checking, the gearman-job-server was the culprit.
I think the issue in https://github.com/naemon/naemon-core/pull/404 is unrelated to this, as the leak is not in Naemon itself, but in the gearman-job-server process.
The current mod-gearman and Naemon still have the memory leak :-(
Tested and not working:
- mod-gearman 5.0.1 + Naemon 1.4.0
- mod-gearman 5.0.1 + Naemon 1.4.0-1 (nightly naemon 2022-12-17)
- mod-gearman 5.0.1 + Naemon 1.4.1
My last working version is still mod-gearman 4.0.2 adc47f0c, no matter which Naemon version I use.
Cannot reproduce this so far. Is it the naemon process which is growing?
I have gearman-job-server (and nothing else) on a separate server that the other servers (mod-gearman-worker, pnp-gearman, mod-gearman-module) connect to. As soon as I install mod-gearman-module > 4.0.2 adc47f0 and restart the naemon process, the RAM+swap usage on the gearman-job-server host keeps going up and up.
and which gearmand version is that?
I tried 1.1.19.1+ds-2+b2 (Debian 11) and 0.33-8 (Consol)
I ran into the same issue with 1.1.18+ds-3+b3 on Debian 10.
I've seen machines with high gearmand memory usage, but if I restart the service, memory usage is stable again (at least as long as I watched), and I still couldn't reproduce this behaviour in a lab. Does the memory usage increase linearly and directly after restarting gearmand?
Last week:
omd reload runs from crontab each day at 1 pm. Without the reload, memory usage rises by up to 2 GB per week.
Zoomed in just after 1 pm:
So around 1 pm and for the following 2 hours, memory usage is more or less flat.
gearmand 1.1.20, omd 5.10.
jfr
Same here.
Gearmand: v1.1.19.1
Package: 5.00-labs-edition
OS: Debian 11
For me, it is 10 GB in two days, and the gearmand service starts to become unresponsive until a restart (in most cases the restart does not happen nicely, so I have to kill it forcefully).
I see, that's a good point. So I was too impatient... I ran valgrind massif here and got similar results now:
Indeed, it seems like commit 1868be43e61fd12ade8221ea8ad19a8df83df742 introduced this issue.
I guess it's the call to

```c
gearman_client_add_options(client, GEARMAN_CLIENT_NON_BLOCKING|GEARMAN_CLIENT_FREE_TASKS|GEARMAN_CLIENT_UNBUFFERED_RESULT);
```

which results in the gearmand misbehaviour.
Let's see how this can be solved...
I switched back to blocking I/O. This seems to fix the memory issue in gearmand. I'll run some tests to see if this has any performance impact. So far it looks promising.
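For anyone following along, here is a minimal, self-contained sketch of what a blocking libgearman submission looks like, i.e. a client that does not set GEARMAN_CLIENT_NON_BLOCKING. This is not the actual mod-gearman code: the server address, queue name and payload are placeholders, and whether GEARMAN_CLIENT_FREE_TASKS stays set in the real fix is an assumption.

```c
/* Hedged sketch, not the mod-gearman source: submit a background job with a
 * blocking libgearman client. Build with: gcc sketch.c -lgearman */
#include <stdio.h>
#include <string.h>
#include <libgearman/gearman.h>

int main(void) {
    gearman_client_st client;
    if (gearman_client_create(&client) == NULL) {
        fprintf(stderr, "gearman_client_create failed\n");
        return 1;
    }

    /* 4.0.3+ reportedly set
     *   GEARMAN_CLIENT_NON_BLOCKING|GEARMAN_CLIENT_FREE_TASKS|GEARMAN_CLIENT_UNBUFFERED_RESULT;
     * this sketch keeps only GEARMAN_CLIENT_FREE_TASKS so the client stays in
     * blocking mode (which flags the real fix keeps is an assumption here). */
    gearman_client_add_options(&client, GEARMAN_CLIENT_FREE_TASKS);

    /* placeholder gearmand host, default port 4730 */
    if (gearman_client_add_server(&client, "gearman-job-server.example", 4730) != GEARMAN_SUCCESS) {
        fprintf(stderr, "add_server: %s\n", gearman_client_error(&client));
        gearman_client_free(&client);
        return 1;
    }

    /* placeholder queue name and payload */
    const char *payload = "host_name=web01;output=OK";
    gearman_job_handle_t handle;
    gearman_return_t rc = gearman_client_do_background(&client, "check_results", NULL,
                                                       payload, strlen(payload), handle);
    if (rc != GEARMAN_SUCCESS)
        fprintf(stderr, "do_background: %s\n", gearman_client_error(&client));
    else
        printf("submitted job %s\n", handle);

    gearman_client_free(&client);
    return 0;
}
```

In blocking mode the gearman_client_do_background() call only returns once gearmand has acknowledged the job and handed back a job handle, which matches the "blocking I/O" description above.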
Hello, I upgraded mod-gearman-module from 4.0.2 to 4.0.4 some days ago and noticed the swap usage on my gearman-job-server node kept growing until the node went OOM. It is around 120 MB per hour in my case. I tested the commits between 4.0.2 and 4.0.3:
I also tried disabling swap (visible in the first 2 cm of the image), but then RAM usage goes up instead. The RAM/swap usage on the naemon node where mod-gearman-module is installed does not change.