tarohida / mk_livestatus

Livestatus is a tool to access the host and service status of your Nagios server.

Nagios 4.5.1 (as systemd service) with livestatus crashes with exit code 254 #1

Open tarohida opened 5 months ago

tarohida commented 5 months ago

This is the same problem as reported here: https://support.nagios.com/forum/viewtopic.php?p=356262

I don't know what causes it or how to fix it.

This query works:

# printf "GET hosts\nColumns: name\n" | /usr/local/bin/unixcat /usr/local/nagios/var/rw/live 
localhost

But this query returns nothing, and nagios.service fails with exit code 254:

[root@mysv ~]# printf "GET services\n" | /usr/local/bin/unixcat /usr/local/nagios/var/rw/live
[root@mysv ~]# echo $?
0
[root@mysv ~]# systemctl status nagios
× nagios.service - Nagios Core 4.5.1
     Loaded: loaded (/usr/lib/systemd/system/nagios.service; enabled; preset: disabled)
     Active: failed (Result: exit-code) since Tue 2024-04-23 16:29:49 UTC; 5s ago
   Duration: 13min 51.805s
       Docs: https://www.nagios.org/documentation
    Process: 53714 ExecStartPre=/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg (code=exited, status=0/SUCCESS)
    Process: 53715 ExecStart=/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg (code=exited, status=0/SUCCESS)
    Process: 54232 ExecStopPost=/bin/rm -f /usr/local/nagios/var/rw/nagios.cmd (code=exited, status=0/SUCCESS)
   Main PID: 53716 (code=exited, status=254)
        CPU: 356ms

Apr 23 16:24:27 mysv nagios[53716]: livestatus: Timeperiod cache not updated, there are no timeperiods (yet)
Apr 23 16:25:27 mysv nagios[53716]: livestatus: Timeperiod cache not updated, there are no timeperiods (yet)
Apr 23 16:26:27 mysv nagios[53716]: livestatus: Timeperiod cache not updated, there are no timeperiods (yet)
Apr 23 16:27:37 mysv nagios[53716]: livestatus: Timeperiod cache not updated, there are no timeperiods (yet)
Apr 23 16:28:37 mysv nagios[53716]: livestatus: Timeperiod cache not updated, there are no timeperiods (yet)
Apr 23 16:29:37 mysv nagios[53716]: livestatus: Timeperiod cache not updated, there are no timeperiods (yet)
Apr 23 16:29:49 mysv nagios[53716]: Caught SIGSEGV, shutting down...
Apr 23 16:29:49 mysv systemd[1]: nagios.service: Main process exited, code=exited, status=254/n/a
Apr 23 16:29:49 mysv nagios[53721]: Caught SIGTERM, shutting down...
Apr 23 16:29:49 mysv systemd[1]: nagios.service: Failed with result 'exit-code'.
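
The journal excerpt above shows the daemon logging `Caught SIGSEGV` just before systemd reports status 254. A minimal sketch for spotting that pattern in saved journal text follows. The function name `count_segv` is hypothetical, and collecting the log with `journalctl -u nagios` is an assumption about the setup, not something shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper: count journal lines where a nagios process
# reports that it caught SIGSEGV. Reads log text on stdin and prints
# the number of matching lines (0 when there are none).
count_segv() {
    # grep -c exits non-zero when nothing matches; "|| true" keeps the
    # function's exit status at 0 so callers under "set -e" don't abort.
    grep -c 'nagios\[[0-9]*]: Caught SIGSEGV' || true
}

# Assumed usage on a systemd host:
#   journalctl -u nagios --no-pager | count_segv
```

A non-zero count after running a Livestatus query would suggest the same crash as in this report.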

tarohida commented 5 months ago

With Nagios Core 4.4.14, the same query works.

# printf "GET services\n" | /usr/local/bin/unixcat /usr/local/nagios/var/rw/live
accept_passive_checks;acknowledged;acknowledgement_type;action_url;action_url_expanded;active_checks_enabled;cache_interval; ...

# systemctl status nagios
● nagios.service - Nagios Core 4.4.14
     Loaded: loaded (/usr/lib/systemd/system/nagios.service; enabled; preset: d>
     Active: active (running) since Fri 2024-04-26 15:38:11 UTC; 2h 24min ago
...

Yu6936 commented 4 months ago

What happens if you recompile the livestatus binaries with Nagios 4.5.1?

tarohida commented 4 months ago

Thank you for the comment. I recompiled the livestatus binaries with Nagios 4.5.2, and it appears to work: the Nagios service no longer crashes. I think the problem is solved. I will fix it.

# printf "GET services\nColumns: description\n" | /usr/local/bin/unixcat /usr/local/nagios/var/rw/live
Current Load
Current Users
HTTP
PING
Root Partition
SSH
Swap Usage
Total Processes

# printf "GET services\n" | /usr/local/bin/unixcat /usr/local/nagios/var/rw/live
accept_passive_checks;acknowledged;acknowledgement_type;action_url;action_url_expanded;active_checks_enabled; ...(trimmed)
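
The queries in this thread differ only in the table name and the optional `Columns:` header. As a minimal sketch, a small shell function can generate both forms so the same test is easy to repeat after a rebuild. The name `build_query` is hypothetical and not part of livestatus.

```shell
#!/bin/sh
# Hypothetical helper: print a Livestatus query for one table.
# Usage: build_query TABLE [COLUMN...]
build_query() {
    table=$1
    shift
    printf 'GET %s\n' "$table"
    # Emit a Columns header only when at least one column was given;
    # "$*" joins the remaining arguments with single spaces.
    if [ "$#" -gt 0 ]; then
        printf 'Columns: %s\n' "$*"
    fi
}

# Assumed usage, with the paths taken from the commands in this thread:
#   build_query services description | /usr/local/bin/unixcat /usr/local/nagios/var/rw/live
#   build_query services | /usr/local/bin/unixcat /usr/local/nagios/var/rw/live
```

The second form, without columns, is the one that triggered the crash on 4.5.1 before the recompile.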