v-zhuravlev / zabbix-notify

Notify alarms from Zabbix to Slack Hipchat and PagerDuty
GNU General Public License v3.0

alarm-no-delete race condition ? 2 api_token for distinct workspace and slack channel #46

Open libralinux opened 4 years ago

libralinux commented 4 years ago

Hi,

I am using the `--slack_mode=alarm-no-delete` option so that alarm messages in the Slack channel are automatically updated to "Resolved". Sometimes it works and sometimes it does not. I do not know whether this is because I have two distinct Slack media configurations: I duplicated the Slack media type in order to post to several distinct workspaces and channels (i.e. with different api_token values).

Below is a screenshot of one channel:

screen1

Another channel, in a different workspace and with a different bot api_token, but with the same eventid:

screen2

Could it be a race condition on the same eventid, so that only the first "New alarm" message gets resolved in Slack?

Any ideas? I use the same message configuration and the same JSON post message for each Slack action.
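To make the suspicion concrete, here is a minimal Python sketch of the kind of collision I have in mind. It is only a model: I have not checked how zbx-notify actually stores posted messages, and the `MessageStore` class, the `simulate` function, the tokens, the channel IDs and the timestamps are all hypothetical. It assumes the script remembers where the "New alarm" message was posted so it can update it on recovery, and compares keying that record by eventid alone with keying it by api_token plus eventid.

```python
from typing import Dict, Hashable, Optional, Tuple

# (channel_id, message_ts) of the message that must be updated on recovery.
Message = Tuple[str, str]


class MessageStore:
    """Hypothetical stand-in for whatever state the notifier keeps per alarm."""

    def __init__(self, scoped: bool) -> None:
        # scoped=False: one record per eventid (the suspected behaviour).
        # scoped=True: one record per (api_token, eventid) pair.
        self.scoped = scoped
        self._messages: Dict[Hashable, Message] = {}

    def _key(self, api_token: str, eventid: str) -> Hashable:
        return (api_token, eventid) if self.scoped else eventid

    def remember(self, api_token: str, eventid: str, channel_id: str, ts: str) -> None:
        self._messages[self._key(api_token, eventid)] = (channel_id, ts)

    def lookup(self, api_token: str, eventid: str) -> Optional[Message]:
        return self._messages.get(self._key(api_token, eventid))


def simulate(scoped: bool) -> Dict[str, Optional[Message]]:
    """Two media types post the same event, then each looks up its own message."""
    store = MessageStore(scoped)
    # Made-up tokens, channel IDs and timestamps, one per workspace.
    store.remember("xoxb-A", "7790286", "C_WORKSPACE_A", "1584923252.000100")
    store.remember("xoxb-B", "7790286", "C_WORKSPACE_B", "1584923252.000200")
    return {token: store.lookup(token, "7790286") for token in ("xoxb-A", "xoxb-B")}


if __name__ == "__main__":
    print("keyed by eventid only:     ", simulate(scoped=False))
    print("keyed by api_token+eventid:", simulate(scoped=True))
```

With the eventid-only key, the second media type overwrites the first record, so workspace A's recovery looks up a channel that belongs to workspace B. If the real script behaves like this, the update would target a channel that the other workspace's token cannot see, and which message gets resolved would depend on which media type ran last.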

libralinux commented 4 years ago

Sometimes it works and sometimes it does not. I found intermittent error messages in zabbix_server.log when zbx-notify is executed, with the Slack error channel_not_found:

  21511:20200323:030737.265 Failed to execute command "/usr/lib/zabbix/alertscripts/zbx-notify '#zabbix' 'OK: ALERT CPU Idle too low' '{^M
            "fallback": "[[dev-bck1.XXXXXX.fr:ALERT CPU Idle too low:OK]]",^M
            "pretext": "Resolved",^M
            "color": "#027a5a",^M
            "author_name": "[[dev-bck1.cor-e.fr]]",^M
            "title": "[[ALERT CPU Idle too low]]",^M
            "title_link": "https://XXXXXXXX/tr_events.php?triggerid=26733&eventid=7790286",^M
            "text": "[[ALERT : Avergage CPU idle time < 10% on the last 15min]]",^M
            "fields": [^M
                {^M
                    "title": "Status",^M
                    "value": "OK",^M
                    "short": true^M
                },^M
                {^M
                    "title": "Severity",^M
                    "value": "High",^M
                    "short": true^M
                },^M
                {^M
                    "title": "Time",^M
                    "value": "2020.03.23 00:27:32",^M
                    "short": true^M
                },^M
                {^M
                    "title": "EventID",^M
                    "value": "eventid: 7790286",^M
                    "short": true^M
                },^M
                {^M
                    "title": "Recovery Time",^M
                    "value": "2020.03.23 00:49:32",^M
                    "short": true^M
                }^M
            ],^M
            "actions": [^M
                {^M
                    "name" : "url",^M
                    "text": "Open in Zabbix",^M
                    "type": "button",^M
                    "url" : "https://XXXXXXXXX/tr_events.php?triggerid=26733&eventid=7790286",^M
                    "style" : "primary"^M
                }           ^M
        ],^M
        "text": "Latest Value : [[system.cpu.util[,idle]: 19.93 %]]"^M
        }' '--api_token=xoxb-XXXXXXX' '--slack' '--no-fork' '--slack_mode=alarm-no-delete'": Error channel_not_found

and then:

[[dev-bck1.XXXXXXXX.fr:ALERT CPU Idle too low:OK]]
[[dev-bck1.XXXXXXXX.fr]]
[[ALERT CPU Idle too low]]
[[ALERT : Avergage CPU idle time < 10% on the last 15min]]
[[system.cpu.util[,idle]: 19.93 %]]
{^M
            "fallback": "dev-bck1.XXXXXXXX.fr:ALERT CPU Idle too low:OK",^M
            "pretext": "Resolved",^M
            "color": "#027a5a",^M
            "author_name": "dev-bck1.XXXXXXX.fr",^M
            "title": "ALERT CPU Idle too low",^M
            "title_link": "https://XXXXXXXXX/tr_events.php?triggerid=26733&eventid=7790286",^M
            "text": "ALERT : Avergage CPU idle time < 10% on the last 15min",^M
            "fields": [^M
                {^M
                    "title": "Status",^M
                    "value": "OK",^M
                    "short": true^M
                },^M
                {^M
                    "title": "Severity",^M
                    "value": "High",^M
                    "short": true^M
                },^M
                {^M
                    "title": "Time",^M
                    "value": "2020.03.23 00:27:32",^M
                    "short": true^M
                },^M
                {^M
                    "title": "EventID",^M
                    "value": "eventid: 7790286",^M
                    "short": true^M
                },^M
                {^M
                    "title": "Recovery Time",^M
                    "value": "2020.03.23 00:49:32",^M
                    "short": true^M
                }^M
            ],^M
            "actions": [^M
                {^M
                    "name" : "url",^M
                    "text": "Open in Zabbix",^M
                    "type": "button",^M
                    "url" : "https://XXXXXXXXXX/tr_events.php?triggerid=26733&eventid=7790286",^M
                    "style" : "primary"^M
                }           ^M
        ],^M
        "text": "Latest Value : system.cpu.util[,idle]: 19.93 %"^M
        }
Alarm recovery message!