sensu-plugins / sensu-plugins-windows

Sensu Windows Plugins
http://sensu-plugins.io
MIT License
22 stars 46 forks source link

perfhelper.ps1 should check if perf counters started before creating perfhash.hsh #83

Open robertkaucher opened 5 years ago

robertkaucher commented 5 years ago

We had a few new systems added this week and the "Server Manager Performance Monitor" collector set had not been started on these and the metric plugins were sending strange/no data. I realized the counters were not started and started them on all the servers but this did not have any impact on the metrics. I found that Get-PerformanceCounterByID in perfhelper.ps1 was creating a cache file but was not checking if the performance counters were started and if they were not most of the metrics would not function correctly. I'm not sure if the scripts should start the counters or simply not create the $hashFile unless the counters are started.

$datacollectorset = New-Object -COM Pla.DataCollectorSet
$datacollectorset.Query("Server Manager Performance Monitor",$null)
if ($datacollectorset.Status -ne 0) {
    Export-Clixml -InputObject $perfHash -Path $hashfile
}

Or perhaps the counters should be started since anyone attempting to use the plugins will need this to be the case to use them any way.

function Get-PerformanceCounterByID
{
    param
    (
        [Parameter(Mandatory=$true)]
        $Name
    )
    $hashfile = (Join-Path $PSScriptRoot perfhash.hsh)
   $datacollectorset = New-Object -COM Pla.DataCollectorSet
   $datacollectorset.Query("Server Manager Performance Monitor",$null)
   if ($datacollectorset.Status -eq 0) {
    $datacollectorset.Start($true)
        if((Test-Path $hashFile)){
             Remove-Item $hashFile
        }
   }
    if ([System.IO.File]::Exists($hashfile)) {       
        $perfHash = Import-Clixml -Path $hashfile
    }
    if ($perfHash -eq $null)
    {
        $key = 'Registry::HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Perflib\009'
        $counters = (Get-ItemProperty -Path $key -Name Counter).Counter
        $perfHash = @{}
        $all = $counters.Count
         for($i = 0; $i -lt $all; $i+=2)
        {
           $perfHash.$($counters[$i+1]) = $counters[$i]
        }
        Export-Clixml -InputObject $perfHash -Path $hashfile
    }
    $perfHash.$Name
}

It's probably wise to perform a synchronous start on the data collector set and delete the $hashFile if it has been previously created. https://docs.microsoft.com/en-us/windows/desktop/api/pla/nf-pla-idatacollectorset-start

robertkaucher commented 5 years ago

I just tested this and it looks like starting the data collector set will not work. At the least, the $hashFile probably should not be created.

The operator or administrator has refused the request. (Exception from 
HRESULT: 0x800710E0)
At C:\sensu-go\plugins\PowerShell\perfhelper.ps1:44 char:2
+     $datacollectorset.Start($true)
+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : OperationStopped: (:) [], COMException
    + FullyQualifiedErrorId : System.Runtime.InteropServices.COMException
majormoses commented 5 years ago

@csabo any thoughts? I don't really know what I am talking about with windows...

csabo commented 5 years ago

@majormoses I haven’t personally seen this issue, but it looks legit, I’ll try to squeeze some testing in tomorrow morning, and then work with @robertkaucher on a fix as needed.

Sent with GitHawk

csabo commented 5 years ago

@robertkaucher So I agree that dropping the hashFile will be the way to do this. Rather than using the Get-* helper functions, I did a quick test just using Get-Counter directly, and I was seeing about a 1 second per Get-Counter call on total execution time, which is quicker than I was seeing using the original helper functions. Below is my test with metrics-windows-system.ps1.

If starting the counters doesn't work, would wrapping in some conditional to return a message stating the counter isn't started, or something of the like meet your needs?

param(
    [switch]$UseFullyQualifiedHostname
    )

$ThisProcess = Get-Process -Id $pid
$ThisProcess.PriorityClass = "BelowNormal"

. (Join-Path $PSScriptRoot perfhelper.ps1)

if ($UseFullyQualifiedHostname -eq $false) {
    $Path = ($env:computername).ToLower()
}else {
    $Path = [System.Net.Dns]::GetHostEntry([string]"localhost").HostName.toLower()
}

$count_interrupt = [System.Math]::Round((Get-Counter "\Processor Information(_total)\Interrupts/sec" -SampleInterval 1 -MaxSamples 1).CounterSamples.CookedValue)
$count_context = [System.Math]::Round((Get-Counter "\System\Context Switches/sec" -SampleInterval 1 -MaxSamples 1).CounterSamples.CookedValue)
$Time = DateTimeToUnixTimestamp -DateTime (Get-Date)

Write-Host "$Path.system.irq_per_second $count_interrupt $Time"
Write-Host "$Path.system.context_switches_per_second $count_context $Time"
robertkaucher commented 5 years ago

@csabo I think what should be done is more of a philosophical question about how plugin errors should be communicated to Sensu users. Using write-host may not be ideal because then someone is going to need to be running the scripts manually to see it. If there is already a defined approach for what to do when plugins start throwing errors, I'd just follow that and maybe use write-host as a courtesy. It was by manually running the plugins that we discovered the issues after all. As long as the cache file is not created before the counters are started, I'll be happy with the solution.

I suppose another concern will be if additional metrics are added at a later point and new counters need to be included that were not available when the system was initially added to Sensu. How necessary is the hash file? If it doesn't need to be created and this would not impact the performance of the metric plugins, then that would be the best option as not generating it fixes all of the problems in one go.