ConSol-Monitoring / omd

OMD - Open Monitoring Distribution Labs Edition.
http://omd.consol.de
GNU General Public License v2.0

Upgrading to OMD 5.10 fails to start naemon #166

Closed dvnscr closed 1 year ago

dvnscr commented 1 year ago

After upgrading from OMD 5.00 to OMD 5.10, naemon failed to start with this error:

/omd/sites/test/etc/init.d/naemon: line 38: /omd/sites/test/tmp/naemon/naemon.cfg: No such file or directory
naemon configuration file /omd/sites/test/tmp/naemon/naemon.cfg not found. Terminating...

I had to create the /omd/sites/test/tmp/naemon and /omd/sites/test/tmp/naemon/{tmp,checkresults} directories by hand for it to start properly.

sni commented 1 year ago

tmp/ is a tmpfs and is empty when the site starts. If the mountpoint is empty, omd mounts the tmpfs and creates those folders. Sometimes, though, the mountpoint is not empty and already contains some files or folders. In that case omd won't mount the tmpfs over it and won't create the folders. So stop the site with omd stop, run omd umount, and check whether tmp/ is empty.

It should look like this:

OMD[test@centos7-64]:~$ omd umount
Unmounting temporary filesystem...OK
OMD[test@centos7-64]:~$ la tmp/
total 0
drwxr-xr-x. 2 test test   6 Jul 21 08:26 ./
drwxr-xr-x. 6 test test 227 Jul 21 08:26 ../
OMD[test@centos7-64]:~$ omd start naemon
Creating temporary filesystem /omd/sites/test/tmp...OK
Starting naemon...OK
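
The emptiness check described above can also be scripted. This is only a sketch: tmp_is_empty is a hypothetical helper name, not part of omd, and it assumes it is run as the site user after omd stop and omd umount.

```shell
#!/bin/sh
# Hypothetical helper (not part of omd): succeeds only if the given
# directory exists and contains no entries, hidden files included.
tmp_is_empty() {
    dir="$1"
    # ls -A lists everything except "." and "..", so leftover dotfiles count too
    [ -d "$dir" ] && [ -z "$(ls -A "$dir")" ]
}

# Example use as site user, after "omd stop" and "omd umount":
#   if tmp_is_empty "$HOME/tmp"; then
#       omd start
#   else
#       ls -la "$HOME/tmp"   # inspect the leftovers before removing them
#   fi
```

If the check fails, the listing shows what is blocking the tmpfs mount.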

dvnscr commented 1 year ago

That worked. The sequence I used:

omd stop
omd umount
omd update site
omd start

Something else is failing: nsca didn't start. I had to edit etc/nsca.cfg and uncomment the server_address line, setting it to the IP address. From the log:

Server address <any> port 5667: Name or service not known
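
The nsca.cfg change could be scripted roughly as follows. This is a sketch under assumptions: enable_server_address is a hypothetical helper, the commented line is assumed to have the form "#server_address=...", and 192.0.2.10 is a placeholder address, not the one used here.

```shell
#!/bin/sh
# Hypothetical helper: uncomment the server_address line in an nsca.cfg
# and set it to the given address. Assumes the line looks like
# "#server_address=..." (an assumption about the shipped config).
enable_server_address() {
    cfg="$1"
    addr="$2"
    # Write to a temporary file first so a failed sed leaves the config intact
    sed "s/^#[[:space:]]*server_address=.*/server_address=$addr/" "$cfg" > "$cfg.new" \
        && mv "$cfg.new" "$cfg"
}

# Example use as site user (placeholder address):
#   enable_server_address "$HOME/etc/nsca.cfg" 192.0.2.10
#   omd restart nsca
```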

lib/monitoring-plugins/check_http now fails with a "Segmentation fault" error, while the distribution package's check_http works:

lib/monitoring-plugins/check_http --version
check_http v2.3.3 (monitoring-plugins 2.3.3)

lib/monitoring-plugins/check_http -I myip -H mysite -u /status -S --sni -f follow
Segmentation fault

apt-get install monitoring-plugins-basic
/usr/lib/nagios/plugins/check_http --version
check_http v2.3.1 (monitoring-plugins 2.3.1)

/usr/lib/nagios/plugins/check_http -I myip -H mysite -u /status -S --sni -f follow
HTTP OK: HTTP/1.1 200 OK - 157 bytes in 1.008 second response time |time=1.007594s;;;0.000000;10.000000 size=157B;;;0

sni commented 1 year ago

Could you try the latest nightly omd build to see if this still fails? Or is there a publicly available server that shows the same issue?

dvnscr commented 1 year ago

No public server. The problem occurs when the destination server returns an empty body. I added -j HEAD to the check command so that it does not use the GET method; with that, it no longer exits with a segmentation fault.
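
For reference, the workaround amounts to the earlier command with -j HEAD appended. In this sketch, check_http_head and the CHECK_HTTP variable are hypothetical names used only for illustration; the address and host name are passed in as arguments.

```shell
#!/bin/sh
# Path to the plugin, relative to the site directory; override if needed.
CHECK_HTTP="${CHECK_HTTP:-lib/monitoring-plugins/check_http}"

# Hypothetical wrapper: same options as the failing call, plus -j HEAD
# so the plugin sends a HEAD request instead of GET.
check_http_head() {
    ip="$1"
    host="$2"
    "$CHECK_HTTP" -I "$ip" -H "$host" -u /status -S --sni -f follow -j HEAD
}

# Example use:
#   check_http_head myip mysite
```

This only sidesteps the crash; it does not fix the underlying problem in the plugin.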

sni commented 1 year ago

Sounds like this was already solved: https://github.com/monitoring-plugins/monitoring-plugins/pull/1840 I'll add the patch file here...