Icinga / icinga2

The core of our monitoring platform with a powerful configuration language and REST API.
https://icinga.com/docs/icinga2/latest
GNU General Public License v2.0

CIB: active_service_checks_(1min|5min) returns invalid values in large environments #9096

Status: Open (opened by TheFireMike 2 years ago)

TheFireMike commented 2 years ago

Describe the bug

In large environments, the active_service_checks_1min and active_service_checks_5min counters of the CIB return invalid values.

Examples:

Result of REST API endpoint /status/CIB:

...
"active_service_checks_15min": 130324,
"active_service_checks_1min": 5,
"active_service_checks_5min": 18920,
...

Same request 10 seconds later:

...
"active_service_checks_15min": 130158,
"active_service_checks_1min": 4,
"active_service_checks_5min": 25,
...

Again 10 seconds later:

...
"active_service_checks_15min": 126361,
"active_service_checks_1min": 2,
"active_service_checks_5min": 1598,
...
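
One way to sanity-check the three samples above is sketched below. It assumes that each counter is a sliding-window sum ending at query time and that the queries really were about 10 seconds apart; neither is confirmed in this report. Under those assumptions, a window can only grow by checks that finished since the previous sample, and those checks must also fall inside the current 1-minute window. The helper names are hypothetical and not part of Icinga.

```python
# Values copied from the three /status/CIB responses above (about 10 s apart).
samples = [
    {"1min": 5, "5min": 18920, "15min": 130324},
    {"1min": 4, "5min": 25, "15min": 130158},
    {"1min": 2, "5min": 1598, "15min": 126361},
]


def window_violations(sample):
    """Return the nesting rules a single sample breaks.

    Assumption: the 1-minute window is contained in the 5-minute window,
    which is contained in the 15-minute window.
    """
    problems = []
    if sample["1min"] > sample["5min"]:
        problems.append("1min > 5min")
    if sample["5min"] > sample["15min"]:
        problems.append("5min > 15min")
    return problems


def growth_violations(prev, curr, window="5min"):
    """Compare two samples taken less than a minute apart.

    Assumption: growth of a sliding-window sum equals new checks minus
    expired checks, so the growth can never exceed the number of checks
    seen in the current 1-minute window.
    """
    growth = curr[window] - prev[window]
    if growth > curr["1min"]:
        return [f"{window} grew by {growth} but 1min is only {curr['1min']}"]
    return []


for index, sample in enumerate(samples):
    print(index, window_violations(sample))

for prev, curr in zip(samples, samples[1:]):
    print(growth_violations(prev, curr))
```

If the assumptions hold, the jump of the 5-minute counter from 25 to 1598 between the second and third sample cannot be explained by a spike, because the 1-minute counter taken at the same moment reports only 2 checks.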

To Reproduce

  1. Create a large Icinga environment
  2. Query the REST API endpoint /status/CIB
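
A minimal sketch of step 2 in Python, using only the standard library. Several details are assumptions not stated in this report: the `/v1` path prefix, HTTP basic authentication, and a response shaped like `{"results": [{"name": "CIB", "status": {...}}]}`. The function names, credentials, and base URL are hypothetical placeholders.

```python
import base64
import json
import ssl
import urllib.request

COUNTERS = (
    "active_service_checks_1min",
    "active_service_checks_5min",
    "active_service_checks_15min",
)


def extract_counters(payload):
    """Pick the three counters out of a decoded /status/CIB response.

    Assumption: the counters live under results[0]["status"].
    """
    status = payload["results"][0]["status"]
    return {name: status[name] for name in COUNTERS}


def fetch_cib(base_url, user, password, cafile=None):
    """Fetch and decode /status/CIB (hypothetical helper).

    Assumptions: the API is served under /v1 and accepts basic auth;
    cafile points at the CA certificate that signed the API certificate.
    """
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"{base_url}/v1/status/CIB",
        headers={
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
    )
    context = ssl.create_default_context(cafile=cafile)
    with urllib.request.urlopen(request, timeout=10, context=context) as response:
        return json.load(response)


# Example call (placeholder host and credentials, not run here):
# payload = fetch_cib("https://icinga.example.org:5665", "apiuser", "secret",
#                     cafile="/path/to/ca.crt")
# print(extract_counters(payload))
```

Calling `fetch_cib` a few times about 10 seconds apart and printing `extract_counters` for each response should reproduce the kind of samples shown under Examples.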

Expected behavior

A response that reports how many checks were actually active during the last minute and the last 5 minutes.

Your Environment

cmaile commented 2 years ago

Same behavior with version 2.13.4.

carraroj commented 5 months ago

ref/NC/815287

Al2Klimov commented 4 months ago

Icinga doesn't interpolate anything here, as the statistics buffers are large enough. Are you sure you don't just have check spikes like this?