mpellegrin / nagios-eventhandler-cachet

A Nagios event handler to push Nagios notifications to Cachet API
MIT License
50 stars, 17 forks

fixed post/incident bug #4

Closed dpina000 closed 7 years ago

dpina000 commented 8 years ago

This fixed issue #3: https://github.com/mpellegrin/nagios-eventhandler-cachet/issues/3

dpina000 commented 8 years ago

and fixed a typo

mpellegrin commented 8 years ago

Thank you!

Yes, it fixes the typo #2

I will try it today and merge if it works :)

mpellegrin commented 8 years ago

It works for #3, but another problem occurs for me when trying to update the incident:

Can't find incident "[Nagios] TEST"

The incident is in the database but is not returned by the API; I don't know what is wrong. By the way, reading all existing incidents to find the right one by its name is a bad approach. A solution may be to store the "opened-but-not-closed" incidents in a state file, as I suggested in https://github.com/mpellegrin/nagios-eventhandler-cachet/issues/1#issuecomment-135406341
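
The state-file idea above could look roughly like the following. This is only a sketch, not code from the repository: the function names, the JSON format, and the `component/service` key are all assumptions made for illustration. It records the incident ID when an incident is opened, so it can be closed later without searching the API by name.

```php
<?php
// Hypothetical helpers (not part of the project): remember which incident ID
// was opened for each key so it can be closed later without a name lookup.

function load_open_incidents(string $path): array
{
    if (!is_file($path)) {
        return [];
    }
    $data = json_decode((string) file_get_contents($path), true);
    // A missing, empty, or corrupt file is treated as "no open incidents".
    return is_array($data) ? $data : [];
}

function save_open_incidents(string $path, array $incidents): void
{
    // LOCK_EX keeps two concurrent writes from interleaving; it does not
    // make the whole read-modify-write sequence atomic.
    file_put_contents($path, json_encode($incidents), LOCK_EX);
}

// Call after an incident has been created, with the ID the API returned.
function remember_incident(string $path, string $key, int $incidentId): void
{
    $incidents = load_open_incidents($path);
    $incidents[$key] = $incidentId;
    save_open_incidents($path, $incidents);
}

// Call on recovery: returns the stored incident ID, or null if none is open.
function forget_incident(string $path, string $key): ?int
{
    $incidents = load_open_incidents($path);
    if (!isset($incidents[$key])) {
        return null;
    }
    $id = $incidents[$key];
    unset($incidents[$key]);
    save_open_incidents($path, $incidents);
    return $id;
}
```

On a CRITICAL + HARD event the handler would call `remember_incident()` with the ID from the create response; on recovery it would call `forget_incident()` and update that ID directly.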

I am using Cachet 2.0.3.

I will branch and try something...

dpina000 commented 8 years ago

Can you show your test case? I've been raising 'OK' 'HARD' incidents and then closing them without issues, so maybe a particular incident or component_status is causing it.

ankush-grover-3pg commented 8 years ago

Hi dpina000,

I tried your code, but it seems I am seeing the same issue that mpellegrin is talking about. I am actually using the code against Icinga2, which is compatible with Nagios/Icinga1 plugins. With your code, the incidents are opened on the Cachet status page but are not closed after the recovery, and there is no change in the Component Status, even in the Ok + Soft state.

For Icinga2, the argument numbers change, but the rest of the code remains the same.

$cachet_component = $argv[2];
$service_name = $argv[4];
$service_status = $argv[8];
$service_status_type = $argv[10];
$service_output = $argv[6];
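
Since the Icinga2 invocation passes each value preceded by its name, one alternative to hard-coded positions is to read the arguments as name/value pairs. The sketch below is an assumption for illustration only; `parse_named_args` is a hypothetical helper and is not part of the project.

```php
<?php
// Hypothetical helper: turn ['script', 'name1', 'value1', 'name2', 'value2', ...]
// into ['name1' => 'value1', 'name2' => 'value2', ...] so that lookups do not
// depend on argument position.
function parse_named_args(array $argv): array
{
    $args = [];
    // $argv[0] is the script name, so the pairs start at index 1.
    for ($i = 1; $i + 1 < count($argv); $i += 2) {
        $args[$argv[$i]] = $argv[$i + 1];
    }
    return $args;
}

// Example argument list in the same shape as the Icinga2 command line.
$example = [
    './cachet-notify-new',
    'cachet_component', 'newtest',
    'service_name', 'JIRA-test',
    'service-output', 'notok',
    'service_state', 'CRITICAL',
    'service_state_type', 'HARD',
];

$args = parse_named_args($example);
$cachet_component    = $args['cachet_component'];
$service_name        = $args['service_name'];
$service_output      = $args['service-output'];
$service_status      = $args['service_state'];
$service_status_type = $args['service_state_type'];
```

In the real handler, `$argv` would be passed in place of `$example`. A trailing name without a value is silently ignored by this sketch.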

ankush-grover-3pg commented 8 years ago

This is how I am trying to create/update/close the incidents:

Critical + Hard state:
./cachet-notify-new 'cachet_component' 'newtest' 'service_name' 'JIRA-test' 'service-output' 'notok' 'service_state' 'CRITICAL' 'service_state_type' 'HARD'
KO HARD: creating incident

Ok + Soft state:
./cachet-notify-new 'cachet_component' 'newtest' 'service_name' 'JIRA-test' 'service-output' 'notok' 'service_state' 'OK' 'service_state_type' 'SOFT'
OK SOFT: updating incident
Can't find incident " JIRA-test"

Ok + Hard state:
./cachet-notify-new 'cachet_component' 'newtest' 'service_name' 'JIRA-test' 'service-output' 'notok' 'service_state' 'OK' 'service_state_type' 'HARD'
OK HARD: updating incident
Can't find incident " JIRA-test"

2Belette commented 8 years ago

Hi, thanks for the commit. I am now able to see the component status change. It works for Operational -> Major Outage and for Major Outage -> Operational. For "Partial Outage", however, it only works for Operational -> Partial Outage: the status is not cleared when the situation comes back to Operational in Nagios (Partial Outage -> Operational). Any idea why?