Icinga / icinga2

The core of our monitoring platform with a powerful configuration language and REST API.
https://icinga.com/docs/icinga2/latest
GNU General Public License v2.0

InfluxDB2 Writer (and quite possibly all other Data Outputs) may inhibit core functionality #10159

Open RincewindsHat opened 2 weeks ago

RincewindsHat commented 2 weeks ago

Describe the bug

When an Icinga 2 setup is configured to write performance data to InfluxDB 2 (Influxdb2Writer) and the communication partner in question (an influxdb2 instance in most cases) is very slow to answer or not responding at all, the internal buffer of the Influxdb2Writer feature will just keep on growing. While this in itself looks like a feature at first glance, it comes with consequences, especially when trying to reload/restart icinga2.

When it receives the signal to shut down, icinga2 will try to flush the cache and write out all the cached data. This fails if the influxdb2 instance is currently out of order, and icinga2 will be trapped in the process of shutting down.

To Reproduce

  1. Configure icinga2 with something that generates perfdata and an Influxdb2Writer
  2. Stop the influxdb2 instance during normal operation (or drop the packets, or do anything else that prevents data from being transferred successfully).
  3. Trigger a reload of icinga2
  4. Wait forever
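
For step 2, one way to simulate an instance that is "not really responding at all" without touching a real influxdb2 is a listener that accepts connections and then never answers. The following is only a minimal sketch: `start_blackhole` is a hypothetical helper name, and whether the writer reacts to it exactly as it does to a real stalled instance is an assumption, not something verified here.

```python
import socket
import threading


def start_blackhole(host="127.0.0.1", port=0):
    """Accept TCP connections on host:port and never send a reply.

    Returns the listening socket and the port actually bound
    (port=0 lets the OS pick a free one).
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen()

    held = []  # keep accepted sockets referenced so they stay open

    def accept_loop():
        while True:
            try:
                conn, _addr = server.accept()
            except OSError:
                return  # listening socket was closed
            held.append(conn)  # accept, then stay silent

    threading.Thread(target=accept_loop, daemon=True).start()
    return server, server.getsockname()[1]
```

Calling `start_blackhole(port=8086)` in a script that then sleeps, and pointing the Influxdb2Writer `host`/`port` at it, should leave every write request hanging; 8086 is the usual InfluxDB port, adjust it to the setup under test.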

Expected behavior

Since in most cases (IMHO) the core functionality of executing checks and sending notifications is more important than writing performance data, killing off all "non-essential" or "secondary" features after a timeout and accepting data loss to maintain the "essential" or "primary" functionality should be the behaviour of icinga2. Or this should at least be configurable.
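
To illustrate the kind of behaviour meant here, this is a rough sketch of a shutdown flush with a deadline, written in Python rather than taken from the icinga2 code base. `flush_with_deadline`, `send` and `timeout_s` are hypothetical names, and the sketch assumes that `send` honours the timeout it is given, since a call that blocks forever cannot be interrupted from here.

```python
import time
from collections import deque


def flush_with_deadline(buffer, send, timeout_s):
    """Send queued items until the buffer is empty or the deadline passes.

    `send(item, timeout)` is assumed to raise TimeoutError or OSError
    when the backend does not answer in time.
    Returns the number of items that were dropped.
    """
    deadline = time.monotonic() + timeout_s
    while buffer:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # out of time: give up on the rest
        try:
            send(buffer[0], remaining)
        except (TimeoutError, OSError):
            # Backend unavailable: wait briefly, then retry until the deadline.
            time.sleep(max(0.0, min(0.05, deadline - time.monotonic())))
            continue
        buffer.popleft()  # only discard an item once it was sent

    dropped = len(buffer)
    buffer.clear()  # accept the data loss so shutdown can proceed
    return dropped
```

With a backend that never answers, `flush_with_deadline(deque(points), send, 5.0)` returns after roughly five seconds and reports how many points were lost, instead of blocking the reload indefinitely.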