qk4l / zabbix-cachet

Python script that syncs Zabbix IT Services with Cachet
MIT License

Zabbix connection is not restored automatically #68

Open backaf opened 3 years ago

backaf commented 3 years ago

Environment:

Cachet: 2.3.18
Zabbix-cachet: 1.3.7
Zabbix: 4.0.29

Cachet and Zabbix-cachet run inside containers, deployed in a Kubernetes cluster. Zabbix runs on a VM.

Problem

We have an automated patch process that stops the Zabbix server before upgrading our MySQL cluster. It runs monthly. After the cluster has been upgraded and Zabbix has been started again, Zabbix-cachet does not restore its connection to Zabbix. As a result, no new incidents are created until the zabbix-cachet container is restarted.

The following message is logged every 2 minutes:

2021-04-09 07:18:49 DEBUG: (Trigger Watcher) Resetting dropped connection: <hostname>

Full log:

2021-04-09 07:47:33  DEBUG: (Trigger Watcher) check Zabbix triggers
2021-04-09 07:47:33  DEBUG: (Trigger Watcher) Sending: {
    "jsonrpc": "2.0",
    "params": {},
    "id": 395757,
    "method": "apiinfo.version"
}
2021-04-09 07:47:33  DEBUG: (Trigger Watcher) Resetting dropped connection: <hostname>
2021-04-09 07:47:34  DEBUG: (Trigger Watcher) https://<hostname>:443 "POST /api_jsonrpc.php HTTP/1.1" 200 47
2021-04-09 07:47:34  DEBUG: (Trigger Watcher) Response Code: 200
2021-04-09 07:47:34  DEBUG: (Trigger Watcher) Response Body: {
    "jsonrpc": "2.0",
    "id": 395757,
    "result": "4.0.29"
}
2021-04-09 07:47:34  DEBUG: (Trigger Watcher) Sending: {
    "auth": "<>",
    "jsonrpc": "2.0",
    "params": {
        "triggerids": "18129",
        "expandDescription": "true",
        "expandComment": "true"
    },
    "id": 395758,
    "method": "trigger.get"
}
2021-04-09 07:47:34  DEBUG: (Trigger Watcher) https://<hostname>:443 "POST /api_jsonrpc.php HTTP/1.1" 200 409
2021-04-09 07:47:34  DEBUG: (Trigger Watcher) Response Code: 200
2021-04-09 07:47:34  DEBUG: (Trigger Watcher) Response Body: {
    "jsonrpc": "2.0",
    "id": 395758,
    "result": [
        {
            "recovery_expression": "",
            "correlation_tag": "",
            "triggerid": "18129",
            "status": "0",
            "priority": "3",
            "type": "0",
            "comments": "",
            "correlation_mode": "0",
            "expression": "{19114}<>0 or {19115}<>200",
            "value": "0",
            "flags": "0",
            "templateid": "18128",
            "lastchange": "1616589197",
            "recovery_mode": "0",
            "url": "",
            "description": "Confluence not reachable",
            "manual_close": "0",
            "error": "",
            "state": "0"
        }
    ]
}

Once the zabbix-cachet container is restarted, past incidents are updated in Cachet.
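
For discussion, here is a minimal sketch of one possible mitigation: retry a failed API call after logging in again. It assumes the stale state is the stored auth token or session, which the log above does not confirm, since the HTTP requests shown there return 200. The names `call_with_relogin`, `SessionExpired`, `call`, and `login` are hypothetical and are not part of zabbix-cachet or any Zabbix client library.

```python
import time


class SessionExpired(Exception):
    """Hypothetical error for an API response that rejects the stored auth token."""


def call_with_relogin(call, login, retries=3, delay=0.0):
    """Run call(); on a connection or session error, log in again and retry.

    `call` and `login` are caller-supplied callables (hypothetical names).
    `retries` is assumed to be at least 1.
    """
    last_error = None
    for _ in range(retries):
        try:
            return call()
        except (ConnectionError, SessionExpired) as exc:
            last_error = exc
            time.sleep(delay)  # give the server time to come back
            try:
                login()  # obtain a fresh auth token before the next attempt
            except ConnectionError as login_exc:
                last_error = login_exc  # server still down; try again
    # All attempts failed: surface the most recent error to the caller.
    raise last_error
```

In a long-running watcher loop, each API request could be passed in as `call` with a non-zero `delay`, so that a monthly Zabbix restart would be survived without restarting the container. Whether this addresses the behaviour reported here depends on the actual root cause.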