Open openglx opened 9 years ago
I'm taking a look at this.
I think this is the expected behavior with the parameter: notification_interval 0
It can be dangerous to let the notification logic restart on a minor state change (you were in a problem state and you are still in one, so why notify?).
Allowing this could change the previously expected behavior, I think, and would increase notifications for some users.
I'm not closing this ticket, as it can be an enhancement if I'm not wrong in my analysis, but we should talk about it before modifying this ^^
I disagree with your comment of 'minor state change'. I don't think that the notification logic should be only between OK/non-OK state but should consider all possible states and their transitions.
In the case above there is a legitimate reason for the on-call group to receive only transitions towards CRITICAL ('c' flag) or a RECOVERY from it ('r' flag combined with 'c'). The other group would receive any message and any change between states.
Again, this was something that previously worked in 1.4 with similar configuration (not using notificationways).
I've read the ticket, and I agree that this is either a regression from 1.4 or at least a documentation issue or omission.
The use-case is quite clear and desirable imo: have some "oncall" contact(s) get only critical (& recoveries from it) notifications.
As #1329
IMO this is a big fix to do. Even if I commit a fix, I won't merge it for 2.4 as it's in RC. Postponing.
Any news on this bug?
This issue is incredibly similar to what was reported in #1329: service notifications are not being sent when a service goes from warning to critical.
Our notification goals are similar: groups receive e-mails for any case (recovery, warning, critical) but on-call mobiles only receive for critical and recovery (of a critical).
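To make the expectation concrete, here is a rough Python sketch of the behaviour we are after. It is not Shinken code: the function names, the two contact names and the exact option sets are assumptions for illustration only, with the flags 'w', 'c' and 'r' standing for warning, critical and recovery.

```python
# Hypothetical model of the expected filtering; not taken from Shinken.
# Maps each problem state to its notification option flag.
STATE_FLAGS = {"WARNING": "w", "CRITICAL": "c", "UNKNOWN": "u"}

# Assumed option sets: "group" wants everything, "oncall" only
# critical and recovery from critical.
CONTACTS = {
    "group": {"w", "c", "r"},
    "oncall": {"c", "r"},
}


def should_notify(options, previous, current):
    """Return True if a contact with these flags should hear about
    the transition previous -> current."""
    if current == "OK":
        # A recovery needs 'r' plus the flag of the state we recover from.
        return "r" in options and STATE_FLAGS.get(previous) in options
    # Any move into a problem state is judged on the new state only,
    # even when the previous state was already a problem.
    return STATE_FLAGS[current] in options


def recipients(previous, current):
    """List the contacts expected to be notified, sorted by name."""
    return sorted(
        name
        for name, options in CONTACTS.items()
        if should_notify(options, previous, current)
    )


if __name__ == "__main__":
    for prev, cur in [("OK", "WARNING"), ("WARNING", "CRITICAL"),
                      ("CRITICAL", "OK"), ("WARNING", "OK")]:
        print(prev, "->", cur, recipients(prev, cur))
```

The point of the sketch is the second branch: a WARNING to CRITICAL transition is evaluated against each contact's flags just like OK to CRITICAL would be.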
Tests on 2.0.3 using one host, one service, one notification command, two contacts in a group. Each contact has a different notification way:
After starting Shinken 2.0.3 (check_service returning 0):
Changing check_service to return 1, only notifies the group [as it is the expected behaviour]:
Changing it to return 2 we would expect both contacts to receive a notification, but only the on-call are receiving [we expect both to receive]:
Clearing it to OK (check_service returning 0) notifies both contacts [as expected, as it should have notified both]:
For the record, warning->ok and ok->warning are working as expected [only the contact named "group" receives those]:
It is worth mentioning that if it transitions from OK->CRITICAL both contacts are receiving the notification, and eventual recovery:
I do not recall these issues happening on 1.4. I've tested the same config against the latest 2.2 from git and can confirm it still happens there.