Closed MohanGan closed 2 years ago
No idea, to be honest, and I don't think it makes sense to debug gearmand 0.33, which is over 8 years old. We updated gearmand to the latest 1.1.x in OMD last year and have had good results so far, but I never had the chance to update the standalone packages. Ideally that would happen while migrating them to the openSUSE Build Service, to make the builds more public and open source. If you have the chance, using the latest gearmand and compiling mod-gearman against it would be my best guess for now.
Thanks very much for your reply, @sni. Could you also provide the RPM download link for the latest stable gearmand 1.1.x? I am not able to find a downloadable RPM package on labsconsole. Could you also let us know the exact version that is working well now? Please also point us to any documentation on how to compile mod-gearman against the gearmand package, and on what modifications are needed to build the new gearmand package.
The tarball for gearmand 1.1.19.1 seems to be broken, and I am not able to create an RPM from the source code.
I have tried building an RPM using the source code from 1.1.19.1 and the tar.gz file from 1.1.18, but neither works for me. Can you let me know which recent version of gearmand is working for you?
Also, FYI, we are setting up this new environment in the Azure cloud.
Thanks in advance, @sni
We are mostly using 1.1.19.1 now. No modifications were made except changing some SSL path settings, which should be unrelated. See https://github.com/ConSol/omd/tree/labs/packages/gearmand
Hi @sni
We have deployed Naemon Core, gearmand and mod_gearman with the versions below:
Server version: Red Hat Enterprise Linux Server release 7.6 (Maipo)
Naemon Core: 1.2
Gearmand: 0.33
mod_gearman: 3.3.0
-bash-4.2$ naemon
Naemon Core 1.2.0
Copyright (c) 2013-present Naemon Core Development Team and Community Contributors
Copyright (c) 2009-2013 Nagios Core Development Team
Copyright (c) 1999-2009 Ethan Galstad
-bash-4.2$ gearmand -V
gearmand 0.33 - https://bugs.launchpad.net/gearmand
-bash-4.2# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.6 (Maipo)
-bash-4.2# yum list installed | grep mod_gearman
Failed to set locale, defaulting to C
mod_gearman.x86_64 3.3.0-1.el7 @nagios-server
We regularly see very strange issues with gearmand, the NEB module and even the gearman workers.
The major issue we are seeing is that all checks show up as orphans.
Below is the output from gearman_top, which shows that the NEB module has died: the worker count for check_results is 0 and the waiting jobs in the check_results queue keep piling up. The gearman workers also disconnect and reconnect very often, and the number of available workers in the screenshot below sometimes shows 0 and sometimes 1.
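The condition described above (jobs waiting in check_results while no worker is registered for it) can also be checked without a screenshot. The following is a minimal sketch, assuming gearmand answers the plain-text `status` admin command on port 4730 with tab-separated lines of function name, total jobs, running jobs and available workers, terminated by a line containing a single dot. The names `parse_status`, `starved_queues` and `fetch_status` are hypothetical helpers, and the numbers in `SAMPLE` are made up for illustration, not taken from this environment.

```python
import socket
from typing import List, NamedTuple


class QueueStatus(NamedTuple):
    name: str
    total: int      # jobs in the queue (waiting + running)
    running: int    # jobs currently being processed
    workers: int    # workers registered for this function


def parse_status(text: str) -> List[QueueStatus]:
    """Parse the text returned by the gearmand 'status' admin command."""
    queues = []
    for line in text.splitlines():
        line = line.strip()
        if line == ".":          # a lone dot ends the listing
            break
        parts = line.split("\t")
        if len(parts) != 4:      # skip blank or unexpected lines
            continue
        name, total, running, workers = parts
        queues.append(QueueStatus(name, int(total), int(running), int(workers)))
    return queues


def starved_queues(queues: List[QueueStatus]) -> List[QueueStatus]:
    """Queues with waiting jobs but no worker to pick them up."""
    return [q for q in queues if q.workers == 0 and q.total - q.running > 0]


def fetch_status(host: str = "localhost", port: int = 4730, timeout: float = 5.0) -> str:
    """Send 'status' to gearmand and read until the terminating dot line."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"status\n")
        data = b""
        while not (data == b".\n" or data.endswith(b"\n.\n")):
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.decode("utf-8", errors="replace")


# Made-up sample in the assumed wire format (not real data from this setup).
SAMPLE = "check_results\t1500\t0\t0\nhost\t3\t3\t25\nservice\t10\t8\t25\n.\n"

if __name__ == "__main__":
    for q in starved_queues(parse_status(SAMPLE)):
        print(f"{q.name}: {q.total - q.running} waiting, {q.workers} workers")
```

Run periodically against the live server via `fetch_status()`, a check like this would show how often check_results loses its consumer, which is the moment the NEB module stops reading results.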
We are seeing the error below in the gearmand log:
tail -f /var/log/gearmand/gearmand.log
ERROR 2020-04-26 04:41:38.000000 [ proc ] Job handle plus handle count beyond GEARMAND_JOB_HANDLE_SIZE: H:hostname:11888764 -> libgearman-server/job.c
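The message suggests that the job handle, which in the log has the form `H:<hostname>:<counter>`, no longer fits into a fixed-size buffer. The sketch below only illustrates that arithmetic. The buffer size of 64 bytes is an assumption that should be verified against the gearmand source for the installed version, and the helper names are hypothetical. It does not establish that hostname length is the cause here, since the real hostname is redacted in the log.

```python
# ASSUMPTION: size of the handle buffer in bytes, including the trailing NUL.
# Verify this value against the gearmand source before relying on it.
JOB_HANDLE_SIZE = 64


def job_handle(hostname: str, counter: int) -> str:
    """Build a handle in the format seen in the log: H:<hostname>:<counter>."""
    return f"H:{hostname}:{counter}"


def fits(hostname: str, counter: int, size: int = JOB_HANDLE_SIZE) -> bool:
    """True if the handle plus its terminating NUL fits into `size` bytes."""
    return len(job_handle(hostname, counter)) < size


def max_hostname_length(counter_digits: int, size: int = JOB_HANDLE_SIZE) -> int:
    """Longest hostname that still fits for a counter with that many digits."""
    # "H:" is 2 characters, the second ":" is 1, and 1 byte is kept for the NUL.
    return size - 1 - 3 - counter_digits


if __name__ == "__main__":
    # The counter from the log line above has 8 digits.
    print(max_hostname_length(len(str(11888764))))
    # Hypothetical long cloud FQDN, for illustration only.
    print(fits("monitoring-node-01.internal.cloudapp.example.net", 11888764))
```

Comparing the output of `hostname` on the gearmand server with `max_hostname_length` gives a quick indication of whether the limit is within reach as the job counter grows.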
One more odd thing we have observed on the gearman worker side: the worker threads do not release memory, so all available memory ends up in buffers, and at the same time the gearman worker process never comes back up after the worker log rotation happens.
[2020-05-26 04:39:05][37018][DEBUG] got host job: hostname
[2020-05-26 04:39:05][37018][ERROR] worker error: gearman_worker_grab_job(GEARMAN_UNEXPECTED_PACKET) unexpected packet:ERROR -> libgearman/worker.cc:781
[2020-05-26 04:39:06][36755][ERROR] worker error: gearman_worker_grab_job(GEARMAN_UNEXPECTED_PACKET) unexpected packet:ERROR -> libgearman/worker.cc:781
[2020-05-26 04:39:06][36962][ERROR] worker error: gearman_worker_grab_job(GEARMAN_UNEXPECTED_PACKET) unexpected packet:ERROR -> libgearman/worker.cc:781
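To see whether these errors hit every worker process or only a few, the worker log can be summarized per PID. This is a minimal sketch that assumes every line follows the `[timestamp][pid][LEVEL] message` layout visible in the excerpt above. `parse_line` and `errors_per_pid` are hypothetical helper names, and the sample lines are copied from that excerpt.

```python
import re
from collections import Counter

# Layout taken from the log excerpt: [timestamp][pid][LEVEL] message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[(?P<pid>\d+)\]\[(?P<level>[A-Z]+)\]\s*(?P<msg>.*)$"
)


def parse_line(line):
    """Return a dict with ts, pid, level and msg, or None if the line differs."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def errors_per_pid(lines):
    """Count ERROR lines per worker PID."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record and record["level"] == "ERROR":
            counts[record["pid"]] += 1
    return counts


ERR = ("worker error: gearman_worker_grab_job(GEARMAN_UNEXPECTED_PACKET) "
       "unexpected packet:ERROR -> libgearman/worker.cc:781")
SAMPLE = [
    "[2020-05-26 04:39:05][37018][DEBUG] got host job: hostname",
    "[2020-05-26 04:39:05][37018][ERROR] " + ERR,
    "[2020-05-26 04:39:06][36755][ERROR] " + ERR,
    "[2020-05-26 04:39:06][36962][ERROR] " + ERR,
]

if __name__ == "__main__":
    for pid, count in sorted(errors_per_pid(SAMPLE).items()):
        print(pid, count)
```

If the same second shows errors from many different PIDs, as in the excerpt, that points at the server side rejecting requests rather than at a single misbehaving worker.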
====================
Below is the NEB config:
###############################################################################
#
# Mod-Gearman - distribute checks with gearman
#
# Copyright (c) 2010 Sven Nierlein
#
# Mod-Gearman NEB Module Config
#
###############################################################################
debug=1
logfile=/var/log/mod_gearman/mod_gearman_neb.log
server=localhost:4730
eventhandler=no
notifications=no
services=yes
hosts=yes
do_hostchecks=yes
route_eventhandler_like_checks=no
encryption=yes
keyfile=/etc/mod_gearman/gmSecret.key
use_uniq_jobs=on
###############################################################################
#
# NEB Module Config
#
# the following settings are for the neb module only and
# will be ignored by the worker.
#
###############################################################################
localservicegroups=gearman_bypass
result_workers=1
perfdata=no
perfdata_send_all=no
perfdata_mode=1
orphan_host_checks=yes
orphan_service_checks=yes
orphan_return=2
accept_clear_results=no
Below is the worker config:
# Mod-Gearman - distribute checks with gearman
# Copyright (c) 2010 Sven Nierlein
# Worker Module Config
identifier=hostname.dev
debug=1
debug-result=yes
logfile=/var/log/mod_gearman/mod_gearman_worker_hostname_4730.log
server=hostname:4730
eventhandler=no
notifications=no
services=yes
hosts=yes
encryption=yes
keyfile=/etc/mod_gearman/gmSecret.key
job_timeout=60
min-worker=5
max-worker=400
idle-timeout=30
max-jobs=1000
max-age=0
spawn-rate=1
fork_on_exec=no
load_limit1=30
load_limit5=0
load_limit15=0
show_error_output=yes
timeout_return=2
enable_embedded_perl=on
use_embedded_perl_implicitly=off
use_perl_cache=on
p1_file=/usr/share/mod_gearman/mod_gearman_p1.pl
# Workarounds
workaround_rc_25=3
orphan_return=3
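Since the two files share several keys, it can help to compare them mechanically. The sketch below assumes both files use plain `key=value` lines with `#` comments, as in the listings above. `parse_config` and `differing_settings` are hypothetical helpers, and the two sample strings contain only a few of the settings posted here.

```python
def parse_config(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()  # the last occurrence wins
    return settings


def differing_settings(first, second):
    """Keys present in both configs whose values differ."""
    shared = sorted(first.keys() & second.keys())
    return {k: (first[k], second[k]) for k in shared if first[k] != second[k]}


# Excerpts of the two configs posted above.
NEB = """
# Mod-Gearman NEB Module Config
server=localhost:4730
encryption=yes
keyfile=/etc/mod_gearman/gmSecret.key
orphan_return=2
"""

WORKER = """
# Worker Module Config
server=hostname:4730
encryption=yes
keyfile=/etc/mod_gearman/gmSecret.key
min-worker=5
orphan_return=3
"""

if __name__ == "__main__":
    diff = differing_settings(parse_config(NEB), parse_config(WORKER))
    for key, (neb_value, worker_value) in diff.items():
        print(f"{key}: neb={neb_value} worker={worker_value}")
```

On these excerpts the comparison reports `server` and `orphan_return`. A differing `server` value is expected when the worker runs on another host. Whether any reported difference matters depends on the setting, so the output is a list of things to review rather than a list of errors.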
In our environment we are using Naemon Core, gearmand and Livestatus as well.
Can you help us fix this issue? We appreciate your help on this.