sni / Thruk

Thruk is a multibackend monitoring webinterface for Naemon, Nagios, Icinga and Shinken using the Livestatus API.
http://www.thruk.org

Recurring downtime not preventing alerts #427

Closed Hyllegaard closed 9 years ago

Hyllegaard commented 9 years ago

Hi.

I am using the latest Consol testing repository release of Naemon, and I am experiencing problems with recurring downtimes not preventing alerts. The downtimes are shown in the GUI (screenshot: downtime).

But, as can be seen in the event log, it does not seem to work:
[2014-12-07 02:33:46] HOST ALERT: dksgs400;DOWN;SOFT;1;CRITICAL - Host Unreachable (192.x.x.x)

Is there supposed to be an event logged when an item enters scheduled downtime?

Unfortunately I am not able to debug this myself, but if there is anything I can do to help, please let me know.

_UPDATE_ I just tried creating a normal scheduled downtime, and an entry was logged in the event log. These do not appear for the recurring downtimes.

Regards

Jens Hyllegaard

sni commented 9 years ago

There will still be host alerts in the log files; only notifications will be suppressed. However, you should see the active downtime during the backup time.

Hyllegaard commented 9 years ago

Hi.

Sorry if I failed to express myself clearly enough. What I meant was that notifications are still being sent. I understand that the events are logged. Here is a screenshot of the notification log (image: notifications).

Regards

Jens

sni commented 9 years ago

OK, do you see any cron jobs for the thruk user? They could belong to the webserver user too.

Hyllegaard commented 9 years ago

I was just looking into that. The command: crontab -u www-data -l shows this:
THIS PART IS WRITTEN BY THRUK, CHANGES WILL BE OVERWRITTEN
##############################################################
downtimes
0 2 * * 0 cd /usr/share/naemon && /bin/bash -l -c '/usr/bin/thruk -a downtimetask="1"' >/dev/null 2>>/var/lib/naemon/thruk/cron.log
0 2 * * 0 cd /usr/share/naemon && /bin/bash -l -c '/usr/bin/thruk -a downtimetask="2"' >/dev/null 2>>/var/lib/naemon/thruk/cron.log
##############################################################
END OF THRUK

I just checked the contents of /var/lib/naemon/thruk/cron.log and it contained: failed failed failed failed
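
For readers decoding the crontab entries above: the five leading fields are minute, hour, day of month, month, and day of week, so `0 2 * * 0` means 02:00 every Sunday (day of week 0). That lines up with the alert on 2014-12-07, a Sunday, about half an hour after the task would have run. A minimal sketch that splits such a line into its fields follows; `describe_cron_entry` is a hypothetical helper written for illustration and is not part of Thruk.

```shell
#!/bin/sh
# Hypothetical helper (not part of Thruk): print the five schedule
# fields of a single crontab line.
describe_cron_entry() {
    printf '%s\n' "$1" | {
        # read splits on whitespace and does not glob-expand the '*' fields;
        # everything after the fifth field lands in cmd.
        read -r minute hour dom month dow cmd
        printf 'minute=%s hour=%s day_of_month=%s month=%s day_of_week=%s\n' \
            "$minute" "$hour" "$dom" "$month" "$dow"
    }
}

describe_cron_entry "0 2 * * 0 cd /usr/share/naemon && /bin/bash -l -c '/usr/bin/thruk -a downtimetask=\"1\"'"
```

Because both entries redirect stdout to /dev/null and append stderr to cron.log, the only trace of a failing run is the "failed" line in that log.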

sni commented 9 years ago

you could run those commands with -v and --local to get some more details.

Hyllegaard commented 9 years ago

If I run: /bin/bash -l -c '/usr/bin/thruk -v -a downtimetask="1"' >/dev/null 2>>/var/lib/naemon/thruk/cron.log I get this:

[Wed Dec 10 23:28:11 2014][DEBUG] reading secret file: /var/lib/naemon/thruk/secret.key
[Wed Dec 10 23:28:11 2014][DEBUG] _run(): $VAR1 = {
[Wed Dec 10 23:28:11 2014][DEBUG] 'url' => [],
[Wed Dec 10 23:28:11 2014][DEBUG] 'listbackends' => undef,
[Wed Dec 10 23:28:11 2014][DEBUG] 'start' => undef,
[Wed Dec 10 23:28:11 2014][DEBUG] 'verbose' => 2,
[Wed Dec 10 23:28:11 2014][DEBUG] 'yes' => undef,
[Wed Dec 10 23:28:11 2014][DEBUG] 'help' => undef,
[Wed Dec 10 23:28:11 2014][DEBUG] 'remoteurl_specified' => 0,
[Wed Dec 10 23:28:11 2014][DEBUG] 'auth' => undef,
[Wed Dec 10 23:28:11 2014][DEBUG] 'backends' => [],
[Wed Dec 10 23:28:11 2014][DEBUG] 'credential' => '5bef873c61440c0be86ba537920f100e',
[Wed Dec 10 23:28:11 2014][DEBUG] 'quiet' => undef,
[Wed Dec 10 23:28:11 2014][DEBUG] 'local' => undef,
[Wed Dec 10 23:28:11 2014][DEBUG] 'force' => undef,
[Wed Dec 10 23:28:11 2014][DEBUG] 'action' => 'downtimetask=1',
[Wed Dec 10 23:28:11 2014][DEBUG] 'all_inclusive' => undef,
[Wed Dec 10 23:28:11 2014][DEBUG] 'remoteurl' => 'http://localhost/naemon/cgi-bin/remote.cgi',
[Wed Dec 10 23:28:11 2014][DEBUG] 'version' => undef
[Wed Dec 10 23:28:11 2014][DEBUG] };
[Wed Dec 10 23:28:11 2014][DEBUG] _request(http://localhost/naemon/cgi-bin/remote.cgi)
[Wed Dec 10 23:28:11 2014][DEBUG] -> success
[Wed Dec 10 23:28:11 2014][DEBUG] -> $VAR1 = bless( {
[Wed Dec 10 23:28:11 2014][DEBUG] '_content' => '{"output":"failed\n","rc":1}',
[Wed Dec 10 23:28:11 2014][DEBUG] '_headers' => bless( {
[Wed Dec 10 23:28:11 2014][DEBUG] 'date' => 'Wed, 10 Dec 2014 22:28:11 GMT',
[Wed Dec 10 23:28:11 2014][DEBUG] 'client-peer' => '127.0.0.1:80',
[Wed Dec 10 23:28:11 2014][DEBUG] 'connection' => 'close',
[Wed Dec 10 23:28:11 2014][DEBUG] 'content-type' => 'text/html; charset=utf-8',
[Wed Dec 10 23:28:11 2014][DEBUG] 'client-response-num' => 1,
[Wed Dec 10 23:28:11 2014][DEBUG] '::std_case' => {
[Wed Dec 10 23:28:11 2014][DEBUG] 'client-date' => 'Client-Date',
[Wed Dec 10 23:28:11 2014][DEBUG] 'client-peer' => 'Client-Peer',
[Wed Dec 10 23:28:11 2014][DEBUG] 'client-response-num' => 'Client-Response-Num'
[Wed Dec 10 23:28:11 2014][DEBUG] },
[Wed Dec 10 23:28:11 2014][DEBUG] 'client-date' => 'Wed, 10 Dec 2014 22:28:11 GMT',
[Wed Dec 10 23:28:11 2014][DEBUG] 'vary' => 'Accept-Encoding',
[Wed Dec 10 23:28:11 2014][DEBUG] 'content-length' => '28',
[Wed Dec 10 23:28:11 2014][DEBUG] 'server' => 'Apache/2.4.7 (Ubuntu)'
[Wed Dec 10 23:28:11 2014][DEBUG] }, 'HTTP::Headers' ),
[Wed Dec 10 23:28:11 2014][DEBUG] '_protocol' => 'HTTP/1.1',
[Wed Dec 10 23:28:11 2014][DEBUG] '_msg' => 'OK',
[Wed Dec 10 23:28:11 2014][DEBUG] '_request' => bless( {
[Wed Dec 10 23:28:11 2014][DEBUG] '_headers' => bless( {
[Wed Dec 10 23:28:11 2014][DEBUG] 'content-length' => 593,
[Wed Dec 10 23:28:11 2014][DEBUG] 'user-agent' => 'thruk_cli',
[Wed Dec 10 23:28:11 2014][DEBUG] 'content-type' => 'application/x-www-form-urlencoded'
[Wed Dec 10 23:28:11 2014][DEBUG] }, 'HTTP::Headers' ),
[Wed Dec 10 23:28:11 2014][DEBUG] '_content' => 'data=%7B%22options%22%3A%7B%22url%22%3A%5B%5D%2C%22listbackends%22%3Anull%2C%22start%22%3Anull%2C%22verbose%22%3A2%2C%22yes%22%3Anull%2C%22help%22%3Anull%2C%22remoteurl_specified%22%3A0%2C%22auth%22%3Anull%2C%22backends%22%3A%5B%5D%2C%22credential%22%3A%225bef873c61440c0be86ba537920f100e%22%2C%22quiet%22%3Anull%2C%22local%22%3Anull%2C%22force%22%3Anull%2C%22action%22%3A%22downtimetask%3D1%22%2C%22all_inclusive%22%3Anull%2C%22remoteurl%22%3A%22http%3A%2F%2Flocalhost%2Fnaemon%2Fcgi-bin%2Fremote.cgi%22%2C%22version%22%3Anull%7D%2C%22credential%22%3A%225bef873c61440c0be86ba537920f100e%22%7D',
[Wed Dec 10 23:28:11 2014][DEBUG] '_method' => 'POST',
[Wed Dec 10 23:28:11 2014][DEBUG] '_uri_canonical' => bless( do{(my $o = 'http://localhost/naemon/cgi-bin/remote.cgi')}, 'URI::http' ),
[Wed Dec 10 23:28:11 2014][DEBUG] '_uri' => $VAR1->{'_request'}{'_uri_canonical'}
[Wed Dec 10 23:28:11 2014][DEBUG] }, 'HTTP::Request' ),
[Wed Dec 10 23:28:11 2014][DEBUG] '_rc' => 200
[Wed Dec 10 23:28:11 2014][DEBUG] }, 'HTTP::Response' );
[Wed Dec 10 23:28:11 2014][DEBUG] -> $VAR1 = {
[Wed Dec 10 23:28:11 2014][DEBUG] 'rc' => 1,
[Wed Dec 10 23:28:11 2014][DEBUG] 'output' => 'failed
[Wed Dec 10 23:28:11 2014][DEBUG] '
[Wed Dec 10 23:28:11 2014][DEBUG] };
failed

If I run this: /bin/bash -l -c '/usr/bin/thruk --local -v -a downtimetask="1"' >/dev/null 2>>/var/lib/naemon/thruk/cron.log

[Wed Dec 10 23:29:36 2014][DEBUG] reading secret file: /var/lib/naemon/thruk/secret.key
[Wed Dec 10 23:29:36 2014][DEBUG] _run(): $VAR1 = {
[Wed Dec 10 23:29:36 2014][DEBUG] 'url' => [],
[Wed Dec 10 23:29:36 2014][DEBUG] 'version' => undef,
[Wed Dec 10 23:29:36 2014][DEBUG] 'auth' => undef,
[Wed Dec 10 23:29:36 2014][DEBUG] 'verbose' => 2,
[Wed Dec 10 23:29:36 2014][DEBUG] 'start' => undef,
[Wed Dec 10 23:29:36 2014][DEBUG] 'credential' => '5bef873c61440c0be86ba537920f100e',
[Wed Dec 10 23:29:36 2014][DEBUG] 'action' => 'downtimetask=1',
[Wed Dec 10 23:29:36 2014][DEBUG] 'yes' => undef,
[Wed Dec 10 23:29:36 2014][DEBUG] 'local' => 1,
[Wed Dec 10 23:29:36 2014][DEBUG] 'help' => undef,
[Wed Dec 10 23:29:36 2014][DEBUG] 'remoteurl' => 'http://localhost/naemon/cgi-bin/remote.cgi',
[Wed Dec 10 23:29:36 2014][DEBUG] 'quiet' => undef,
[Wed Dec 10 23:29:36 2014][DEBUG] 'all_inclusive' => undef,
[Wed Dec 10 23:29:36 2014][DEBUG] 'listbackends' => undef,
[Wed Dec 10 23:29:36 2014][DEBUG] 'force' => undef,
[Wed Dec 10 23:29:36 2014][DEBUG] 'backends' => [],
[Wed Dec 10 23:29:36 2014][DEBUG] 'remoteurl_specified' => 0
[Wed Dec 10 23:29:36 2014][DEBUG] };
[Wed Dec 10 23:29:36 2014][DEBUG] _dummy_c()
[Wed Dec 10 23:29:39 2014][DEBUG] _dummy_c() done
[Wed Dec 10 23:29:39 2014][DEBUG] _from_local()
[Wed Dec 10 23:29:39 2014][DEBUG] $VAR1 = {
[Wed Dec 10 23:29:39 2014][DEBUG] 'code' => 404,
[Wed Dec 10 23:29:39 2014][DEBUG] 'headers' => {
[Wed Dec 10 23:29:39 2014][DEBUG] 'Content-Length' => 14,
[Wed Dec 10 23:29:39 2014][DEBUG] 'Content-Type' => 'text/html; charset=utf-8'
[Wed Dec 10 23:29:39 2014][DEBUG] },
[Wed Dec 10 23:29:39 2014][DEBUG] 'result' => 'Page not found'
[Wed Dec 10 23:29:39 2014][DEBUG] };
failed
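
A note on reading the first run (without --local): the HTTP request itself succeeds ('_rc' => 200, '_msg' => 'OK'), but the JSON body returned by remote.cgi is {"output":"failed\n","rc":1}, so the failure is reported inside the payload rather than by the HTTP status. The sketch below pulls the rc value out of such a body with sed; `extract_rc` is a hypothetical helper for illustration only, and the pattern is a quick regex match under the assumption that rc is a plain integer, not a full JSON parser.

```shell
#!/bin/sh
# Hypothetical helper: print the integer value of "rc" from a JSON body
# shaped like the one in the debug output. Regex-based sketch, not a parser.
extract_rc() {
    printf '%s\n' "$1" | sed -n 's/.*"rc":\([0-9][0-9]*\).*/\1/p'
}

# Body copied from the '_content' field of the response in the log.
response='{"output":"failed\n","rc":1}'
rc=$(extract_rc "$response")

if [ "$rc" -ne 0 ]; then
    echo "task failed (rc=$rc) even though the HTTP status was 200"
fi
```

This is why checking only the HTTP status of remote.cgi is not enough to tell whether the downtime task actually ran.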

Hyllegaard commented 9 years ago

Just to be sure, I ran curl http://localhost/naemon/cgi-bin/remote.cgi and got OK as a response.

sni commented 9 years ago

Could you try the referenced patch? That's the only idea I have so far.

Hyllegaard commented 9 years ago

That did it :)

Here is the result from running with both --local and -v:
[Thu Dec 11 00:03:18 2014][DEBUG] reading secret file: /var/lib/naemon/thruk/secret.key
[Thu Dec 11 00:03:18 2014][DEBUG] _run(): $VAR1 = {
[Thu Dec 11 00:03:18 2014][DEBUG] 'action' => 'downtimetask=1',
[Thu Dec 11 00:03:18 2014][DEBUG] 'credential' => '5bef873c61440c0be86ba537920f100e',
[Thu Dec 11 00:03:18 2014][DEBUG] 'backends' => [],
[Thu Dec 11 00:03:18 2014][DEBUG] 'auth' => undef,
[Thu Dec 11 00:03:18 2014][DEBUG] 'remoteurl_specified' => 0,
[Thu Dec 11 00:03:18 2014][DEBUG] 'start' => undef,
[Thu Dec 11 00:03:18 2014][DEBUG] 'version' => undef,
[Thu Dec 11 00:03:18 2014][DEBUG] 'all_inclusive' => undef,
[Thu Dec 11 00:03:18 2014][DEBUG] 'yes' => undef,
[Thu Dec 11 00:03:18 2014][DEBUG] 'help' => undef,
[Thu Dec 11 00:03:18 2014][DEBUG] 'listbackends' => undef,
[Thu Dec 11 00:03:18 2014][DEBUG] 'quiet' => undef,
[Thu Dec 11 00:03:18 2014][DEBUG] 'force' => undef,
[Thu Dec 11 00:03:18 2014][DEBUG] 'verbose' => 2,
[Thu Dec 11 00:03:18 2014][DEBUG] 'url' => [],
[Thu Dec 11 00:03:18 2014][DEBUG] 'remoteurl' => 'http://localhost/naemon/cgi-bin/remote.cgi',
[Thu Dec 11 00:03:18 2014][DEBUG] 'local' => 1
[Thu Dec 11 00:03:18 2014][DEBUG] };
[Thu Dec 11 00:03:18 2014][DEBUG] _dummy_c()
[Thu Dec 11 00:03:20 2014][DEBUG] _dummy_c() done
[Thu Dec 11 00:03:20 2014][DEBUG] _from_local()
[info] [(cron)][] cmd: COMMAND [1418252600] SCHEDULE_HOST_DOWNTIME;dksgs400;1418252600;1418259800;1;0;0;(cron);Down for backup

I can also see it in the interface, and there was an event logged.

Thank you so much for your help.

Regards

Jens
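
For anyone reading the final log line: the SCHEDULE_HOST_DOWNTIME external command carries its arguments separated by semicolons, in the order documented for Nagios-style external commands: host name, start time, end time, fixed flag, trigger id, duration, author, comment. The start and end values are Unix timestamps, so the command above schedules a fixed two-hour window (1418259800 - 1418252600 = 7200 seconds). A rough sketch that splits the command and computes the window length follows; `downtime_window` is a hypothetical helper, not a Thruk or Naemon tool.

```shell
#!/bin/sh
# Hypothetical helper: split a SCHEDULE_HOST_DOWNTIME command into its
# fields and print the length of the downtime window in seconds.
downtime_window() {
    printf '%s\n' "$1" | {
        # Field order assumed from the Nagios external command format:
        # name;host;start;end;fixed;trigger_id;duration;author;comment
        IFS=';' read -r name host start end fixed trigger duration author comment
        printf 'host=%s seconds=%s fixed=%s author=%s comment=%s\n' \
            "$host" "$((end - start))" "$fixed" "$author" "$comment"
    }
}

downtime_window 'SCHEDULE_HOST_DOWNTIME;dksgs400;1418252600;1418259800;1;0;0;(cron);Down for backup'
```

With fixed set to 1 the downtime runs exactly from start to end, which is why the duration field can be 0 here.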

sni commented 9 years ago

you're welcome