Icinga / icinga-powershell-framework

This PowerShell module will allow to fetch data from Windows hosts and use them for inventory and monitoring solutions. Together with the Icinga Web 2 module, a detailed overview of your Windows infrastructure will be drawn.
MIT License

DateTime Threshold Bug including possible Solution #698

Open audiocoach opened 5 months ago

audiocoach commented 5 months ago

I already opened an issue in the icinga-powershell-plugins repository, but while troubleshooting I noticed that this is a bug in the framework and not in the plugins. That's why I am copying it here. For reference, please see https://github.com/Icinga/icinga-powershell-plugins/issues/380

Here is my original post:

Hello,

I think I have found a bug in Invoke-IcingaCheckScheduledTask. Let's say you have a scheduled task that should run every 24 hours, and you want a warning if the next run time is, for whatever reason, 48 hours or more in the future and a critical if it is 72 hours or more in the future. If you use these values for "WarningNextRunTime" (48 hours or 2 days) and "CriticalNextRunTime" (72 hours or 3 days), you always get a critical, because the scheduled next run time is always lower than the current time + 3 days.

Even with lower threshold values there will be a point in time at which you get a critical. Let's say the scheduled task runs every 24 hours at 06:00 and you use 1 hour as the critical next run time threshold. At 05:01 the check will return a critical because 06:00 is lower than 05:01 (current time) + 1 hour (critical next run time threshold).

I think it can only work the other way around: you should get a warning if the next run time is higher than the current time + the warning threshold value, and a critical if the next run time is higher than the current time + the critical threshold value.
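The 06:00 / 05:01 example can be replayed in a few lines of PowerShell. This is only a sketch of the two comparisons being discussed, not the framework's code; the dates and variable names are made up for illustration.

```powershell
# Illustrative values only: the task runs daily at 06:00, the check executes at 05:01.
$CurrentDate = [datetime]'2024-01-01T05:01:00'
$NextRunTime = [datetime]'2024-01-01T06:00:00'
$Threshold   = [timespan]::FromHours(1)    # stands in for CriticalNextRunTime = 1h

# The effective limit is the current time plus the threshold (06:01 here).
$Limit = $CurrentDate + $Threshold

# Behaviour reported in this issue: alert when the value is lower than the limit.
$CurrentLogic = $NextRunTime -lt $Limit

# Behaviour suggested above: alert only when the next run lies beyond the limit.
$ProposedLogic = $NextRunTime -gt $Limit

"Current logic raises critical:  $CurrentLogic"    # True, 06:00 is lower than 06:01
"Proposed logic raises critical: $ProposedLogic"   # False, 06:00 is not beyond 06:01
```

With the "lower than" comparison, any next run that falls inside the threshold window alerts, which is the normal case for a healthy task.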

Maybe it is easier to understand with some screenshots:

[Two screenshots in the original issue, not reproduced here.]

My environment: Icinga 2.14.1, Icinga Web 2.12.1, Icinga Director 1.11, Icinga for Windows 1.11.1

audiocoach commented 5 months ago

And here is my solution post:

I think I found a solution for this bug. You have to modify line 224 of Compare-IcingaPluginThresholds.psm1 in the PowerShell framework (located at C:\Program Files\WindowsPowerShell\Modules\icinga-powershell-framework\lib\icinga\plugin) as follows:

    if (([string]::IsNullOrEmpty($ThresholdValue) -eq $FALSE) -And (($TimeThreshold) -le '0') -And ($DateTimeValue -eq 0 -Or $DateTimeValue -lt $CurrentDate)) {

and add the following at line 234:

    if (([string]::IsNullOrEmpty($ThresholdValue) -eq $FALSE) -And (($TimeThreshold) -gt '0') -And ($DateTimeValue -gt $CurrentDate)) {
        $IcingaThresholds.InRange = $FALSE;
        $IcingaThresholds.Message = 'is greater than';
        $IcingaThresholds.Range   = [string]::Format(
            '{0} ({1}{2})',
            ((Get-Date).ToString('yyyy\/MM\/dd HH:mm:ss')),
            ( $( if ($TimeThreshold -gt 0) { '+'; } else { ''; } )),
            $Threshold
        );
    }

With that, if you enter positive values for the "WarningNextRunTime" and "CriticalNextRunTime" thresholds (e.g. 1d or 1h), it checks whether the value (= Next Run Time) is greater than the current date + threshold and, if so, raises a warning or critical.

LordHepipud commented 4 weeks ago

It took a while to wrap my head around this issue but now I get why the output confused me so much.

Invoke-IcingaCheckScheduledTask -TaskName 'Renew Certificate' -CriticalLastRunTime '-2d' -CriticalNextRunTime '3d' -State Running, Ready, Queued -Verbosity 3 -WarningLastRunTime '-1d' -WarningNextRunTime '2d'
[CRITICAL] Scheduled Tasks [CRITICAL] \Icinga\Icinga for Windows\ (All must be [OK])
\_ [CRITICAL] \Icinga\Icinga for Windows\ (All must be [OK])
   \_ [CRITICAL] Renew Certificate (All must be [OK])
      \_ [OK] Last Run Time: 2024/08/16 15:42:42
      \_ [OK] Last Task Result: 0
      \_ [OK] Missed Runs: 0
      \_ [CRITICAL] Next Run Time: 2024/08/17 01:00:00 is lower than 2024/08/16 17:04:40 (+3d)
      \_ [OK] State: Ready

Now I finally get it - by default the plugin will compare anything like input value < threshold. The actual threshold is then the current date + (or -) the input.

Now this basically means that all TaskNextRun thresholds are simply wrong, because they will mostly be larger than the next scheduled run.

I'm not sure if simply changing the global logic will resolve this issue. It is more an underlying issue to properly compare date time values.

The DateTime handler should itself also follow the guidelines for plugin thresholds and allow a more dynamic approach for this.

To make this work properly, the argument should not be 3d, but ~:3d for example.

But this needs to be implemented for the DateTime comparison, like for all regular thresholds.
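As a rough sketch of what a range-style DateTime threshold such as ~:3d could mean, the snippet below treats it as "anything up to now + 3 days is OK". Both helper functions are hypothetical and are not part of the Icinga PowerShell Framework. The snippet also assumes that only the start:end form is handled, with ~ as an open lower bound and an empty end as an open upper bound.

```powershell
# Hypothetical helper: converts values such as '3d', '-2h' or '30m' to seconds.
function ConvertTo-OffsetSeconds {
    param ([string]$Value)

    if ($Value -notmatch '^([+-]?\d+)([smhd])$') {
        throw "Unsupported time value: $Value"
    }
    $Factors = @{ 's' = 1; 'm' = 60; 'h' = 3600; 'd' = 86400 }
    return ([int]::Parse($Matches[1]) * $Factors[$Matches[2]])
}

# Hypothetical helper: returns $true when $Value lies inside the range,
# where both bounds are offsets relative to $Now.
function Test-DateTimeInRange {
    param (
        [datetime]$Value,
        [string]$Range,
        [datetime]$Now = (Get-Date)
    )

    if ($Range -notmatch ':') {
        throw "This sketch only handles the 'start:end' form, got: $Range"
    }
    $Start, $End = $Range -split ':', 2

    # '~' means no lower bound, an empty end means no upper bound.
    $Lower = if ($Start -eq '~') { [datetime]::MinValue } else { $Now.AddSeconds((ConvertTo-OffsetSeconds $Start)) }
    $Upper = if ([string]::IsNullOrEmpty($End)) { [datetime]::MaxValue } else { $Now.AddSeconds((ConvertTo-OffsetSeconds $End)) }

    return (($Value -ge $Lower) -and ($Value -le $Upper))
}

# Timestamps taken from the plugin output above.
$Now     = [datetime]'2024-08-16T17:04:40'
$NextRun = [datetime]'2024-08-17T01:00:00'

Test-DateTimeInRange -Value $NextRun -Range '~:3d' -Now $Now    # True, inside the range, so OK
```

Under this reading, a next run four days away (for example 2024-08-20 18:00:00) falls outside ~:3d and would be the case that raises the alert.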