meerkat-dashboard / meerkat

Drag-and-drop dashboards for Icinga
https://meerkat.run
GNU Affero General Public License v3.0

Elements that return zero results need to check on a schedule or icinga client deployment #178

Closed sol1-matt closed 1 year ago

sol1-matt commented 1 year ago

The current element update implementation relies on events from Icinga (for individual services and hosts). Each event is matched against the results of an element's previous API call, and a match triggers the element to redo its API call.

If the previous API call returned no results, events can never match, because there is nothing to match against.

This means that any element which returns zero results needs to recheck on a reasonably quick schedule, e.g. every 1 minute.
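A minimal sketch of that scheduling rule, assuming a hypothetical `Element` type that caches the object names from its last API call. The type, its field names, and the `emptyRecheckInterval` constant are illustrative and not taken from the meerkat source.

```go
package main

import "time"

// Element is a hypothetical stand-in for a dashboard element's cached state.
type Element struct {
	Name        string
	Matched     []string  // object names returned by the previous API call
	LastChecked time.Time // when that API call was made
}

// emptyRecheckInterval is an assumed value based on the "1 minute" suggestion.
const emptyRecheckInterval = time.Minute

// needsScheduledRecheck reports whether an element has to be re-queried on a
// timer because no Icinga event can ever match its empty result set.
func needsScheduledRecheck(e Element, now time.Time) bool {
	if len(e.Matched) > 0 {
		// Events for the matched objects will trigger the refresh instead.
		return false
	}
	return now.Sub(e.LastChecked) >= emptyRecheckInterval
}
```

A periodic loop (for example one driven by `time.Ticker`) could call `needsScheduledRecheck` for each element and redo the API call only for those that return true, so elements with results keep relying on events.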

Status-based elements could also fail if they cover a group of services (i.e. not an individual service or host), that group changes in Icinga, and the state of the newly added member changes.

e.g. Element 1 is Service A: OK, Service B: OK

Icinga changes and adds Service C: Critical, which would match Element 1, but meerkat doesn't know until Service A or B changes.
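The gap in the example can be sketched as follows. This assumes events are matched by object name against the names cached from the element's previous API call. The function name `eventTriggersRefresh` is hypothetical, and meerkat's actual matching logic may differ.

```go
package main

// eventTriggersRefresh reports whether an event for objectName matches any
// object cached from the element's previous API call.
func eventTriggersRefresh(cached []string, objectName string) bool {
	for _, name := range cached {
		if name == objectName {
			return true
		}
	}
	return false
}
```

With a cache of `Service A` and `Service B`, an event for `Service C` returns false, so the element is not refreshed until A or B produces an event. An empty cache returns false for every event, which is the zero-results case described earlier.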

Another solution, instead of a schedule, would be to redo all these API calls after Icinga deploys new configuration.

sol1-matt commented 1 year ago

A larger question here is what happens to the subscribed event streams on reload, restart, and stop/start.

Instances where we need to recheck everything are

sol1-matt commented 1 year ago

A possible solution to both would be to watch the icinga2 status for changes in program start.

/v1/status/IcingaApplication returns a dict like the one shown below. The value of results[0].status.icingaapplication.app.program_start changes when icinga2 reloads or restarts. A scheduled meerkat check every 30 seconds would detect most problems and could trigger a re-initialization of the event streams and mark elements as stale, needing API updates.

{
    "results": [
        {
            "name": "IcingaApplication",
            "perfdata": [],
            "status": {
                "icingaapplication": {
                    "app": {
                        "enable_event_handlers": true,
                        "enable_flapping": true,
                        "enable_host_checks": true,
                        "enable_notifications": true,
                        "enable_perfdata": true,
                        "enable_service_checks": true,
                        "environment": "",
                        "node_name": "example.com",
                        "pid": 11111,
                        "program_start": 1685555555.319866,
                        "version": "r2.13.7-20"
                    }
                }
            }
        }
    ]
}

The big advantage here is that this is an active check from meerkat to Icinga to detect when things need to be poked with a stick.
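A rough sketch of the program_start check, decoding only the fields shown in the response above. The names `statusResponse`, `parseProgramStart`, and `restartDetector` are hypothetical. Fetching the body over HTTP and the 30-second polling loop are left out.

```go
package main

import (
	"encoding/json"
	"errors"
)

// statusResponse mirrors only the parts of the /v1/status/IcingaApplication
// response that are needed to read program_start.
type statusResponse struct {
	Results []struct {
		Status struct {
			IcingaApplication struct {
				App struct {
					ProgramStart float64 `json:"program_start"`
				} `json:"app"`
			} `json:"icingaapplication"`
		} `json:"status"`
	} `json:"results"`
}

// parseProgramStart extracts results[0].status.icingaapplication.app.program_start.
func parseProgramStart(body []byte) (float64, error) {
	var resp statusResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return 0, err
	}
	if len(resp.Results) == 0 {
		return 0, errors.New("status response contained no results")
	}
	return resp.Results[0].Status.IcingaApplication.App.ProgramStart, nil
}

// restartDetector remembers the last program_start value it was given.
type restartDetector struct {
	last float64
	seen bool
}

// Observe returns true when programStart differs from the previous
// observation. The first observation only records a baseline.
func (d *restartDetector) Observe(programStart float64) bool {
	changed := d.seen && programStart != d.last
	d.last = programStart
	d.seen = true
	return changed
}
```

When `Observe` returns true, the caller would re-initialize the event streams and mark every element as stale. Treating the first observation as a baseline avoids a spurious "restart" when meerkat itself has just started.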

sol1-matt commented 1 year ago

The <dashboard>/update functionality can be used for system restart/reload.