Icinga / icinga2

The core of our monitoring platform with a powerful configuration language and REST API.
https://icinga.com/docs/icinga2/latest
GNU General Public License v2.0

Gracefully stop/restart/reload icinga2 #9286

Open manfredw opened 2 years ago

manfredw commented 2 years ago

Is your feature request related to a problem? Please describe.

Sometimes it is necessary to stop or restart/reload icinga2 (configuration changes, OS updates, crashes caused by communication problems between icinga2 nodes, ...). These actions are usually triggered by Director or on the CLI.

This kills all currently running processes with signal 15 (SIGTERM): checks, notifications and icinga2 itself. It seems that killed checks and notifications are marked as timed out and that this state is stored persistently (internally and in IDO?).

After a restart, all of these abruptly interrupted checks are shown as UNKNOWN, and you have to wait for the next regularly scheduled check to get a "real" status. This sometimes leads to confusing states and dashboards with hundreds of apparent problems.

Describe the solution you'd like

Implement a graceful shutdown of running check and notification scripts: wait a (configurable) time for them to complete, and prevent new scripts from starting by disabling the scheduler.
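
A minimal sketch of the requested behaviour, written in Python for brevity rather than in Icinga 2's own C++. The `CheckRunner` class, its `grace_seconds` parameter and its method names are hypothetical and only illustrate the idea: stop accepting new scripts, give running ones a bounded time to finish, and send SIGTERM only to those still running afterwards.

```python
import signal
import subprocess
import time


class CheckRunner:
    """Hypothetical script runner; not Icinga 2's actual scheduler."""

    def __init__(self, grace_seconds=5.0):
        self.grace_seconds = grace_seconds  # configurable wait for running scripts
        self.accepting = True               # set to False once shutdown begins
        self.running = []                   # Popen handles of started scripts

    def start(self, argv):
        """Start a script unless shutdown has begun; return the handle or None."""
        if not self.accepting:
            return None
        proc = subprocess.Popen(argv)
        self.running.append(proc)
        return proc

    def shutdown(self):
        """Stop scheduling, wait up to grace_seconds, then SIGTERM the rest.

        Returns two lists of Popen handles: (finished, killed).
        """
        self.accepting = False
        # One shared deadline, so the total wait never exceeds grace_seconds.
        deadline = time.monotonic() + self.grace_seconds
        finished, killed = [], []
        for proc in self.running:
            remaining = max(0.0, deadline - time.monotonic())
            try:
                proc.wait(timeout=remaining)
                finished.append(proc)
            except subprocess.TimeoutExpired:
                proc.send_signal(signal.SIGTERM)
                proc.wait()
                killed.append(proc)
        self.running = []
        return finished, killed
```

With this shape, scripts that complete within the grace period report their real result, and only the `killed` list needs special handling.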

Describe alternatives you've considered

Do not store state information of scripts that were forcibly killed during the shutdown process.
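
This alternative can be sketched as a small filter, again in Python and not taken from Icinga 2's code. The function name `should_store_result` and the `shutting_down` flag are hypothetical. The sketch relies on the convention of Python's `subprocess` module on POSIX, where a negative return code `-N` means the child was terminated by signal `N`.

```python
import signal


def should_store_result(returncode, shutting_down):
    """Decide whether a finished script's result should be persisted.

    A result is discarded only when the daemon is shutting down AND the
    script was terminated by SIGTERM, so the last real state is kept.
    """
    killed_by_sigterm = returncode == -signal.SIGTERM
    return not (shutting_down and killed_by_sigterm)
```

A script killed by SIGTERM outside of a shutdown is still stored, since in that case the kill is not a side effect of stopping the daemon.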

Additional context

Medium to large deployment with (redundant) satellites; all systems run the latest releases on Linux.

Al2Klimov commented 1 year ago

Do not store state information of scripts that were forcibly killed during the shutdown process.

Good idea!