Icinga / icinga-core

Icinga 1.x, the old core (EOL 31.12.2018)
GNU General Public License v2.0
44 stars 28 forks source link

[dev.icinga.com #778] Unable to stop executing checks // some host checks get skipped if service checks are disabled #371

Closed icinga-migration closed 14 years ago

icinga-migration commented 14 years ago

This issue has been migrated from Redmine: https://dev.icinga.com/issues/778

Created by mfriedrich on 2010-09-10 18:14:55 +00:00

Assignee: mfriedrich Status: Resolved (closed on 2010-09-21 13:05:21 +00:00) Target Version: 1.2 (Stable) Last Update: 2010-09-21 13:05:21 +00:00 (in Redmine)


http://sourceforge.net/mailarchive/message.php?msg\_name=4C8A14D8.3030903%40consol.de

-------- Original Message --------
Subject:    [Nagios-devel] Unable to stop executing checks with 3.2.2
Date:   Fri, 10 Sep 2010 13:22:00 +0200
From:   Sven Nierlein 
Reply-To:   Nagios Developers List 
To:     nagios-devel@lists.sourceforge.net

 Hi,

It seems like with 3.2.2 the
execute_service_checks=0
and
execute_host_checks=0
setting has no effect anymore.

Its easy to reproduce. Install a clean 3.2.2 tarball with the default config and set both values to 0.
Then start nagios and after a few moments it starts some checks. The webinterface still displays:
Service Checks Being Executed?  NO
Host Checks Being Executed?  NO 

I had a quick look at the patches since 3.2.1 but could not find the reason.

This is the interesting part from the nagios.debug logfile:

[1284124694.798033] [008.1] [pid=9046] ** Event Check Loop
[1284124694.798203] [008.1] [pid=9046] Next High Priority Event Time: Fri Sep 10 15:18:18 2010
[1284124694.798261] [008.1] [pid=9046] Next Low Priority Event Time:  Fri Sep 10 15:18:15 2010
[1284124694.798307] [008.1] [pid=9046] Current/Max Service Checks: 0/0
[1284124694.798349] [008.2] [pid=9046] No events to execute at the moment.  Idling for a bit...
[1284124694.798393] [001.0] [pid=9046] check_for_external_commands()
[1284124694.798442] [064.1] [pid=9046] Making callbacks (type 8)...
[1284124695.048890] [008.1] [pid=9046] ** Event Check Loop
[1284124695.048988] [008.1] [pid=9046] Next High Priority Event Time: Fri Sep 10 15:18:18 2010
[1284124695.049016] [008.1] [pid=9046] Next Low Priority Event Time:  Fri Sep 10 15:18:15 2010
[1284124695.049038] [008.1] [pid=9046] Current/Max Service Checks: 0/0
[1284124695.049060] [024.1] [pid=9046] We're not executing service checks right now, so we'll skip this event.
[1284124695.049081] [001.0] [pid=9046] remove_event()
[1284124695.049104] [064.1] [pid=9046] Making callbacks (type 8)...
[1284124695.049125] [001.0] [pid=9046] reschedule_event()
[1284124695.049178] [001.0] [pid=9046] add_event()
[1284124695.049201] [064.1] [pid=9046] Making callbacks (type 8)...
[1284124695.049224] [064.1] [pid=9046] Making callbacks (type 20)...
[1284124695.049245] [008.1] [pid=9046] Running event...
[1284124695.049266] [001.0] [pid=9046] handle_timed_event() start
[1284124695.049288] [064.1] [pid=9046] Making callbacks (type 8)...
[1284124695.049317] [008.0] [pid=9046] ** Timed Event ** Type: 0, Run Time: Fri Sep 10 15:18:53 2010
[1284124695.049342] [008.0] [pid=9046] ** Service Check Event ==> Host: 'localhost', Service: 'Current Users', Options: 0, Latency: -37.951000 sec
[1284124695.049371] [001.0] [pid=9046] run_scheduled_service_check() start
[1284124695.049392] [016.0] [pid=9046] Attempting to run scheduled check of service 'Current Users' on host 'localhost': check options=0, latency=-37.951000
[1284124695.049442] [001.0] [pid=9046] run_async_service_check()
[1284124695.049476] [001.0] [pid=9046] check_service_check_viability()
[1284124695.049509] [001.0] [pid=9046] check_time_against_period()
[1284124695.049538] [001.0] [pid=9046] check_service_dependencies()
[1284124695.049584] [064.1] [pid=9046] Making callbacks (type 13)...
[1284124695.049607] [016.0] [pid=9046] Checking service 'Current Users' on host 'localhost'...
[1284124695.049656] [001.0] [pid=9046] get_raw_command_line()

Regards,
 Sven

-------- Original Message --------
Subject:    Re: [Nagios-devel] Unable to stop executing checks with 3.2.2
Date:   Fri, 10 Sep 2010 14:51:02 +0200
From:   Sven Nierlein 
Reply-To:   Nagios Developers List 
To:     Nagios Developers List 

 On 09/10/10 14:42, Ton Voon wrote:
> See my email re: "test failure with latest code". My guess is that  
> line referenced in that email is the one causing the problem.
>
Ah yes, i remember now. I didn't thought its related to this problem.

> I changed the max_parallel_service_checks to 2 and svc2 is still not  
> getting rescheduled.
>
> So can anyone confirm if either the current code:
>   * causes breakage or
>   * means that the test is now invalid and svc2->next_check should ==  
> now

Ok, then i can confirm that current code breaks disabling active host/service checks and the
test is still valid.

 Sven

-------- Original Message --------
Subject:    Re: [Nagios-devel] Unable to stop executing checks with 3.2.2
Date:   Fri, 10 Sep 2010 14:21:50 +0100
From:   Ton Voon 
Reply-To:   Nagios Developers List 
To:     Nagios Developers List 

On 10 Sep 2010, at 13:51, Sven Nierlein wrote:

>> I changed the max_parallel_service_checks to 2 and svc2 is still not
>> getting rescheduled.
>>
>> So can anyone confirm if either the current code:
>>  * causes breakage or
>>  * means that the test is now invalid and svc2->next_check should ==
>> now
>
> Ok, then i can confirm that current code breaks disabling active  
> host/service checks and the
> test is still valid.

I think this is sufficient to remove the change. Done in CVS now. The  
test passes on my system now, so will see what the other tinerbox  
builds say tomorrow.

I'm interested to know what Ethan thinks about the change.

Ton

From: Ton Voon 
Date: Fri, 10 Sep 2010 13:19:23 +0000 (+0000)
Subject: Fixes problem where disabling all active hosts/services was not taking effect
X-Git-Url: http://git.nagiosprojects.org/?p=nagios.git;a=commitdiff_plain;h=565616e80e4b0449df6c91d28ef6ec964aa6ca2e;hp=c668bf1a7ea5387375c50123e4f4d26de3c92bdc

Fixes problem where disabling all active hosts/services was not taking effect
---

diff --git a/Changelog b/Changelog
index 349b007..4df6153 100644
--- a/Changelog
+++ b/Changelog
@@ -3,6 +3,16 @@ Nagios 3.x Change Log
 #####################

+3.2.3 - ??????????
+------------------
+ENHANCEMENTS
+
+FIXES
+* Fixes problem where disabling all active hosts/services was not taking effect
+
+WARNINGS
+
+
 3.2.2 - 09/01/2010
 ------------------
 ENHANCEMENTS
@@ -43,8 +53,6 @@ FIXES
 * Fix to sample notify-service-by-email command (bug #62)
 * Fix for retaining host display name and alias, as well as service display name (Folkert van Heusden)

-WARNINGS
-

 3.2.1 - 03/09/2010
diff --git a/base/events.c b/base/events.c
index e037fa4..0c9a91b 100644
--- a/base/events.c
+++ b/base/events.c
@@ -1084,8 +1084,11 @@ int event_execution_loop(void){

                    }

+           /* Removed. Reports that if you disable all service checks, service checks continue to be run. Also
+              t-tap/test_events will fail if this flag is set - TV
+           */
            /* reset flag */
-           run_event=TRUE;
+           /* run_event=TRUE; */

            /* run a few checks before executing a host check... */
            if(event_list_low->event_type==EVENT_HOST_CHECK){

Attachments

Changesets

2010-09-21 11:39:21 +00:00 by mfriedrich 7bfd6520a55a7e7125dd642268ad9cd8b372429a

don't skip hostchecks when servicechecks are disabled (Stephen Gran) #778

please check the issues for detailed information.
code comment has been added too explaining old and new logic.

refs #581
fixes #778

Relations:

icinga-migration commented 14 years ago

Updated by mfriedrich on 2010-09-14 07:59:44 +00:00

removed #581 - needs a more sophisticated fix/patch with proper testing.

icinga-migration commented 14 years ago

Updated by mfriedrich on 2010-09-15 09:25:24 +00:00

[10:25:57]  i was thinking about http://tracker.nagios.org/view.php?id=152 a bit

[10:27:03]  setting run_event to true again when a servicecheck gets re-scheduled should be better
[10:27:15]  so that a hostcheck afterwards will get executed
[10:27:50]  state machine fidling is always funny
[10:29:58]  the patch was the reason why you cannot disable active checks at the moment?
[10:30:20]  yep
[10:30:28]  i think so
[10:30:47]  its reverted in cvs already, but i think the release was a bit too fast
[10:30:48]  because run_event is always true in the middle of that

[10:31:54]  the states are servicecheck => hostcheck
[10:32:09]  if servicechecks are disabled, they get rescheduled, and run_event remains false
[10:32:40]  it stays like that for upcoming hostcheck, which will then also being rescheduled isntead of being checked in case

[10:36:46]  if servicecheck was set, and even if it was set to run_event=false, this causes a reschedule of the check and removing from event queue
[10:36:55]  afterwards, run_event is set to true,
[10:37:04]  the hostcheck is being skipped
[10:37:11]  as it's a servicecheck
[10:37:21]  run_event==TRUE matches
[10:37:27]  so it gets executed
[10:37:43]  base/events.c:1133
[10:38:12]  so no matter what you do in the servicecheck condition it will always get executed
[10:38:19]  hilarous fuckup
[10:38:27]  i guess, noone will reject a patch ;)
[10:38:43]  probably not
[10:39:15]  need to check on that though
[10:41:58]  setting run_event to true when rescheduling is also bad because it then hits directly the run_event condition

i would consider a logical rewrite of the state machine in order to run events straight from host or servicechecks, removing if(run_event==TRUE){

icinga-migration commented 14 years ago

Updated by mfriedrich on 2010-09-18 11:38:18 +00:00

icinga-migration commented 14 years ago

Updated by mfriedrich on 2010-09-20 08:22:49 +00:00

icinga-migration commented 14 years ago

Updated by mfriedrich on 2010-09-20 08:24:57 +00:00

-------- Original Message --------
Subject:    Re: [Nagios-devel] Unable to stop executing checks with 3.2.2
Date:   Sun, 19 Sep 2010 11:53:08 +0200
From:   Michael Friedrich 
Reply-To:   Nagios Developers List 
To:     nagios-devel@lists.sourceforge.net

Hi,

On 2010-09-18 19:57, Ton Voon wrote:
>
> The test_events.c passes with this change. However, it would be best
> if a testcase could be written for the problem this is trying to
> solve, which fails without the patch and passes with the patch. This
> will ensure the problem will continue to get visibility in future.
>
> If you create that testcase, I'd be more than happy to apply.
>    

The patch addresses one of the proposed ways to fix the problem like 
mentioned in the issue - http://tracker.nagios.org/view.php?id=152

"Possible solutions would be to reset run_events to TRUE after removing an
event from the queue, reset run_events to TRUE upon entering each "if"
block, or make the "if" into an "if else"."

The firstly committed patch had nothing to do with setting run_event the 
proper way and resolving the issue. It was a bit screwed and misses some 
logical explanation why it was done this way.

Either way, Stephen's solution will work as it breaks up the state 
machine and makes sure changing the queue also affects the execution 
during the while(1) loop.

Take this event queue

hc1 hc2 sc1 hc3 sc2 hc4 sc3 sc4 sc5

servicechecks for sc2 are disabled.

Looping the events happens like this, following the algorithm of the 
original source:

1/ hc1 triggered, no servicecheck condition matched. run_event==TRUE, 
hc1 gets executed in case not being disabled
2/ hc2 same as hc1, run_event is reset to TRUE at the beginning of the 
loop with low prio event
3/ sc1 will be found during checking for servicecheck event. sc1 
run_event will be TRUE as the checks are not disabled. also it won't get 
removed from the queue. so

event_list_low->event_type==EVENT_HOST_CHECK

will not match afterwards.

4/ hc4 will be executed as it was like the rest above.
5/ sc2 is our special service check, which is being disabled. run_event starts with TRUE, then detection of service check event is true, but servicechecks are disabled.
So run_event changes to FALSE, and the event gets re-eschuled, removed from current event queue. Next is the check for hostcheck event, because it's a same priority "IF" and not next-time "ELSE IF".
In this case, hc4 is already visible to the queue checker, and the condition for a hostcheck event MATCHES, but having run_event on FALSE from before.

So in this case we won't have 6/ with hc4, but this happens all in 5/

hc4 is being checked for run_event==FALSE (which originates from sc2), and heureka, this is FALSE and hc4 as a host check event does not get executed too - but re-scheduled!

===

By taking Stephen's Patch, the mentioned ELSE IF will match whether service OR host check event. Currently it's an AND for matching.
Having one looping for those check events will make sure, they both won't disturb each other in detecting if the are

- disabled
- should be reschuled
- should be run

 From my point of view, the logic can be rewritten too by removing that run_event flag and just putting the execution calls to where run_event was TRUE previously. But I haven't tried this time as there is more testing needed.
I would consider Stephen's patch as the patch for the buggy patch.

Kind regards,
Michael

-------- Original Message --------
Subject:    Re: [Nagios-devel] Unable to stop executing checks with 3.2.2
Date:   Sun, 19 Sep 2010 11:51:24 +0100
From:   Stephen Gran 
Reply-To:   Nagios Developers List 
To:     nagios-devel@lists.sourceforge.net

On Sat, Sep 18, 2010 at 06:57:54PM +0100, Ton Voon said:
> The test_events.c passes with this change. However, it would be best  
> if a testcase could be written for the problem this is trying to  
> solve, which fails without the patch and passes with the patch. This  
> will ensure the problem will continue to get visibility in future.
> 
> If you create that testcase, I'd be more than happy to apply.

So I've been looking at the various test cases you already have, and I'm
not sure if it's going to be simpler to try to plug it into test_events.c
or create a new one, so I'm going to outline my thoughts and ask for a
recommendation.

The things that this patch addresses and should be tested are (in pseudo
code):

if execute_service_checks == 1 and execute_host_checks == 1:
    failed service_check schedules a host_check and executes it
    service_check is rescheduled

else if execute_service_checks == 0 and execute_host_checks == 1:
    service_check is rescheduled instead of executed
    host_check is scheduled at it's normal interval and executed

else if execute_service_checks == 1 and execute_host_checks == 0:
    failed service_check schedules a host_check
    host_check is rescheduled instead of run
    service_check is rescheduled

else if execute_service_checks == 0 and execute_host_checks == 0:
    a service_check is rescheduled
    a host_check is rescheduled

That covers my understanding of how the scheduling should work.

Any advice on where to put that?  I admit at first glance I am slightly
daunted by the prospect of learning your test harness for the logic
above, but I'm willing to try.

Cheers,

-------- Original Message --------
Subject:    Re: [Nagios-devel] Unable to stop executing checks with 3.2.2
Date:   Mon, 20 Sep 2010 08:51:32 +0100
From:   Ton Voon 
Reply-To:   Nagios Developers List 
To:     Nagios Developers List 

On 19 Sep 2010, at 11:51, Stephen Gran wrote:

> On Sat, Sep 18, 2010 at 06:57:54PM +0100, Ton Voon said:
>> The test_events.c passes with this change. However, it would be best
>> if a testcase could be written for the problem this is trying to
>> solve, which fails without the patch and passes with the patch. This
>> will ensure the problem will continue to get visibility in future.
>>
>> If you create that testcase, I'd be more than happy to apply.
>
> So I've been looking at the various test cases you already have, and  
> I'm
> not sure if it's going to be simpler to try to plug it into  
> test_events.c
> or create a new one

I don't mind as long as the test is written in a way that can be rerun  
to confirm success in TAP output.

More docs at http://wiki.nagios.org/index.php/Nagios_Core_Developer_Guidelines#Testing

However, I'm guessing you'll find it easier to add more tests to  
test_events.c than to create a whole new test_events2.c.

> , so I'm going to outline my thoughts and ask for a
> recommendation.
>
> The things that this patch addresses and should be tested are (in  
> pseudo
> code):
>
> if execute_service_checks == 1 and execute_host_checks == 1:
>    failed service_check schedules a host_check and executes it
>    service_check is rescheduled
>
> else if execute_service_checks == 0 and execute_host_checks == 1:
>    service_check is rescheduled instead of executed
>    host_check is scheduled at it's normal interval and executed
>
> else if execute_service_checks == 1 and execute_host_checks == 0:
>    failed service_check schedules a host_check
>    host_check is rescheduled instead of run
>    service_check is rescheduled
>
> else if execute_service_checks == 0 and execute_host_checks == 0:
>    a service_check is rescheduled
>    a host_check is rescheduled
>
> That covers my understanding of how the scheduling should work.

This looks about right though, to be honest, I've never delved deeply  
into this logic so I can't verify.

>
> Any advice on where to put that?  I admit at first glance I am  
> slightly
> daunted by the prospect of learning your test harness for the logic
> above, but I'm willing to try.

The first thing is: can you get the test_events C code to execute  
without failure?

The 2nd is to code your scenarios above. The tricky thing here is that  
you have to setup all the appropriate variables, then execute  
event_execution_loop(). As you success criteria is based on whether  
certain events were executed or not, you could probably work this out  
by checking the last_check and next_check for specific host/services  
afterwards.

I look forward to see those tests!

Ton
icinga-migration commented 14 years ago

Updated by mfriedrich on 2010-09-20 15:32:53 +00:00

i've added a bit more debugoutput especially for this case in order to see when host and servicecheck happen by name (same loop cycle)

and, there's the result with currently, unpatched release:

[1284996509.248159] [008.1] [pid=26787] ** Event Check Loop
[1284996509.248166] [008.1] [pid=26787] Next High Priority Event Time: Mon Sep 20 17:28:34 2010
[1284996509.248171] [008.1] [pid=26787] Next Low Priority Event Time:  Mon Sep 20 17:28:29 2010
[1284996509.248174] [008.1] [pid=26787] Current/Max Service Checks: 0/0
[1284996509.248179] [024.1] [pid=26787] Run a few checks before executing a service check for 'test_ok_16'.
[1284996509.248182] [024.1] [pid=26787] We're not executing service checks right now, so we'll skip this event.
[1284996509.248185] [024.1] [pid=26787] Skip event, removing service 'test_ok_16' from list.
[1284996509.248231] [024.1] [pid=26787] Run a few checks before executing a host check for 'test_host_071'.
[1284996509.248234] [024.1] [pid=26787] Skip event, removing host 'test_host_071' from list.
[1284996509.248307] [008.1] [pid=26787] ** Event Check Loop
[1284996509.248314] [008.1] [pid=26787] Next High Priority Event Time: Mon Sep 20 17:28:34 2010
[1284996509.248319] [008.1] [pid=26787] Next Low Priority Event Time:  Mon Sep 20 17:28:57 2010
[1284996509.248322] [008.1] [pid=26787] Current/Max Service Checks: 0/0

test_ok_16 service is disabled globally, being rescheduled and removed from event queue. afterwards, it should go straight into next looping since nothing more to be done. but it doesn't next in line is test_host_071 which gets skipped too.

icinga-migration commented 14 years ago

Updated by mfriedrich on 2010-09-21 06:16:14 +00:00

with the patch being applied, the event looping is doing just fine.

[1285049507.073382] [008.1] [pid=421] Next High Priority Event Time: Tue Sep 21 08:11:49 2010
[1285049507.073390] [008.1] [pid=421] Next Low Priority Event Time:  Tue Sep 21 08:11:47 2010
[1285049507.073396] [008.1] [pid=421] Current/Max Service Checks: 0/0
[1285049507.073403] [024.1] [pid=421] Run a few checks before executing a service check for 'test_ok_19'.
[1285049507.073409] [024.1] [pid=421] We're not executing service checks right now, so we'll skip this event.
[1285049507.073415] [024.1] [pid=421] Skip event, removing service 'test_ok_19' from list.
[1285049507.073501] [008.1] [pid=421] ** Event Check Loop
[1285049507.073509] [008.1] [pid=421] Next High Priority Event Time: Tue Sep 21 08:11:49 2010
[1285049507.073519] [008.1] [pid=421] Next Low Priority Event Time:  Tue Sep 21 08:11:47 2010
[1285049507.073524] [008.1] [pid=421] Current/Max Service Checks: 0/0
[1285049507.073532] [024.1] [pid=421] Run a few checks before executing a service check for 'test_ok_19'.
[1285049507.073538] [024.1] [pid=421] We're not executing service checks right now, so we'll skip this event.
[1285049507.073544] [024.1] [pid=421] Skip event, removing service 'test_ok_19' from list.
[1285049507.073653] [008.1] [pid=421] ** Event Check Loop
[1285049507.073662] [008.1] [pid=421] Next High Priority Event Time: Tue Sep 21 08:11:49 2010
[1285049507.073671] [008.1] [pid=421] Next Low Priority Event Time:  Tue Sep 21 08:11:49 2010
[1285049507.073677] [008.1] [pid=421] Current/Max Service Checks: 0/0
[1285049507.325743] [008.1] [pid=421] ** Event Check Loop
[1285049507.325816] [008.1] [pid=421] Next High Priority Event Time: Tue Sep 21 08:11:49 2010
[1285049507.325829] [008.1] [pid=421] Next Low Priority Event Time:  Tue Sep 21 08:11:49 2010
[1285049507.325864] [008.1] [pid=421] Current/Max Service Checks: 0/0
[1285049507.577854] [008.1] [pid=421] ** Event Check Loop
[1285049507.577936] [008.1] [pid=421] Next High Priority Event Time: Tue Sep 21 08:11:49 2010
[1285049507.577951] [008.1] [pid=421] Next Low Priority Event Time:  Tue Sep 21 08:11:49 2010
[1285049507.577958] [008.1] [pid=421] Current/Max Service Checks: 0/0
[1285049507.829890] [008.1] [pid=421] ** Event Check Loop
[1285049507.829957] [008.1] [pid=421] Next High Priority Event Time: Tue Sep 21 08:11:49 2010
[1285049507.829969] [008.1] [pid=421] Next Low Priority Event Time:  Tue Sep 21 08:11:49 2010
[1285049507.829976] [008.1] [pid=421] Current/Max Service Checks: 0/0
[1285049508.080888] [008.1] [pid=421] ** Event Check Loop
[1285049508.080961] [008.1] [pid=421] Next High Priority Event Time: Tue Sep 21 08:11:49 2010
[1285049508.080973] [008.1] [pid=421] Next Low Priority Event Time:  Tue Sep 21 08:11:49 2010
[1285049508.080979] [008.1] [pid=421] Current/Max Service Checks: 0/0
[1285049508.331897] [008.1] [pid=421] ** Event Check Loop
[1285049508.331965] [008.1] [pid=421] Next High Priority Event Time: Tue Sep 21 08:11:49 2010
[1285049508.331979] [008.1] [pid=421] Next Low Priority Event Time:  Tue Sep 21 08:11:49 2010
[1285049508.331986] [008.1] [pid=421] Current/Max Service Checks: 0/0
[1285049508.582894] [008.1] [pid=421] ** Event Check Loop
[1285049508.582967] [008.1] [pid=421] Next High Priority Event Time: Tue Sep 21 08:11:49 2010
[1285049508.582980] [008.1] [pid=421] Next Low Priority Event Time:  Tue Sep 21 08:11:49 2010
[1285049508.582986] [008.1] [pid=421] Current/Max Service Checks: 0/0
[1285049508.833931] [008.1] [pid=421] ** Event Check Loop
[1285049508.833993] [008.1] [pid=421] Next High Priority Event Time: Tue Sep 21 08:11:49 2010
[1285049508.834005] [008.1] [pid=421] Next Low Priority Event Time:  Tue Sep 21 08:11:49 2010
[1285049508.834012] [008.1] [pid=421] Current/Max Service Checks: 0/0
[1285049509.085932] [008.1] [pid=421] ** Event Check Loop
[1285049509.086015] [008.1] [pid=421] Next High Priority Event Time: Tue Sep 21 08:11:49 2010
[1285049509.086030] [008.1] [pid=421] Next Low Priority Event Time:  Tue Sep 21 08:11:49 2010
[1285049509.086036] [008.1] [pid=421] Current/Max Service Checks: 0/0
[1285049509.086072] [008.0] [pid=421] ** Timed Event ** Type: 5, Run Time: Tue Sep 21 08:11:49 2010
[1285049509.086079] [008.0] [pid=421] ** Check Result Reaper
[1285049509.095695] [008.1] [pid=421] ** Event Check Loop
[1285049509.095709] [008.1] [pid=421] Next High Priority Event Time: Tue Sep 21 08:11:59 2010
[1285049509.095721] [008.1] [pid=421] Next Low Priority Event Time:  Tue Sep 21 08:11:49 2010
[1285049509.095727] [008.1] [pid=421] Current/Max Service Checks: 0/0
[1285049509.095736] [024.1] [pid=421] Run a few checks before executing a host check for 'test_host_096'.
[1285049509.095743] [008.1] [pid=421] Running event...
[1285049509.095758] [008.0] [pid=421] ** Timed Event ** Type: 12, Run Time: Tue Sep 21 08:11:49 2010
[1285049509.095764] [008.0] [pid=421] ** Host Check Event ==> Host: 'test_host_096', Options: 0, Latency: 0.095000 sec
[1285049509.099893] [008.1] [pid=421] ** Event Check Loop
[1285049509.099949] [008.1] [pid=421] Next High Priority Event Time: Tue Sep 21 08:11:59 2010
[1285049509.099961] [008.1] [pid=421] Next Low Priority Event Time:  Tue Sep 21 08:12:11 2010
[1285049509.099967] [008.1] [pid=421] Current/Max Service Checks: 0/0
icinga-migration commented 14 years ago

Updated by mfriedrich on 2010-09-21 11:29:31 +00:00

test cases from Stephen:

-------- Original Message --------
Subject:    Re: [Nagios-devel] Unable to stop executing checks with 3.2.2
Date:   Tue, 21 Sep 2010 00:22:36 +0100
From:   Stephen Gran 
Reply-To:   Nagios Developers List 
To:     nagios-devel@lists.sourceforge.net

On Mon, Sep 20, 2010 at 08:51:32AM +0100, Ton Voon said:
> The first thing is: can you get the test_events C code to execute  
> without failure?
> 
> The 2nd is to code your scenarios above. The tricky thing here is that  
> you have to setup all the appropriate variables, then execute  
> event_execution_loop(). As you success criteria is based on whether  
> certain events were executed or not, you could probably work this out  
> by checking the last_check and next_check for specific host/services  
> afterwards.
> 
> I look forward to see those tests!

Hi there,

This is my simple minded attempt at matching the originally reported
failure case (host checks being skipped when service checks disabled).
It fails without my patch and passes with my patch.  Hopefully it's
clear enough - please ask for clarification if not.

Cheers,
icinga-migration commented 14 years ago

Updated by mfriedrich on 2010-09-21 13:05:21 +00:00

Applied in changeset commit:"7bfd6520a55a7e7125dd642268ad9cd8b372429a".