opserver / Opserver

Stack Exchange's Monitoring System
https://opserver.github.io/Opserver/
MIT License

WMI Configuration with AD Credentials #191

Open mmillican opened 8 years ago

mmillican commented 8 years ago

I'm attempting to use WMI monitoring in Opserver on a domain environment. Our sys-admin created a domain user which has permissions for WMI. At first, I tried setting the app pool to run as the specified domain user, but still wasn't seeing any data.

After poking around in the config classes, I updated my DashboardSettings.json WMI section to

"wmi": {
    "nodes": [ "web01", "web02", "ops1", "mssql1", "web01-dev" ],
    "staticDataTimeoutSeconds": 300,
    "dynamicDataTimeoutSeconds": 5,
    "historyHours": 2,
    "username": "<user>",
    "password": "<pass>"
}

but still can't get data to show. As a last effort, I updated PollingSettings.json with credentials as well. Is there something else I'm missing, or does auth for WMI not work the way I'm thinking it does?

NickCraver commented 8 years ago

WMI permissions short of admin are very tricky - are you sure they are set up correctly? There are several WMI browsing tools out there - try the credentials in them remotely as a sanity check, since that's what Opserver will be doing. No special config is needed; it'll be running as the application pool account.

mmillican commented 8 years ago

Thanks @NickCraver! Looks like that fixed most of it. Still having issues with some of the metrics (CPU/Memory and Network). Making progress though.

afscrome commented 8 years ago

Digging into this some more, it looks like WMI requires two separate permissions:

  1. Grant the user account permission to connect through WMI / DCOM
  2. Grant the user permissions to use the root/cimv2 namespace

For 1, I just added the user to the Performance Log Users group on the machines to be monitored.

The second step was a lot more involved - I ended up using the following script to grant permissions on the WMI namespace (it is loosely based on https://github.com/PowerShell/WmiNamespaceSecurity/). Make sure to replace the $domain and $username values with your own, and change the computer names passed to Invoke-Command to the servers you actually want to set permissions on.

Invoke-Command -ComputerName web1,web2,web3,web4 -ScriptBlock {
    $domain = 'PERF'
    $username = 'Monitor'

    # __SystemSecurity exposes the security descriptor of the namespace
    $systemSecurity = Get-CimInstance -Namespace 'root/cimv2' -ClassName __SystemSecurity

    # Build a Win32_Trustee for the account, including its SID
    $ntAccount = New-Object System.Security.Principal.NTAccount($domain, $username)
    $trustee = New-CimInstance -Namespace root/cimv2 -ClassName Win32_Trustee -ClientOnly -Property @{
        Domain=$domain
        Name=$username
        SidString=$ntAccount.Translate([System.Security.Principal.SecurityIdentifier]).Value
    }

    $ace = New-CimInstance -Namespace root/cimv2 -ClassName Win32_Ace -ClientOnly -Property @{
        AceType=[uint32]0 #Allow
        Trustee=$trustee
        AccessMask=[uint32]0x22 #MethodExecute (0x2) + RemoteAccess (0x20)
        AceFlags=[uint32]0
    }

    $sd = Invoke-CimMethod -InputObject $systemSecurity -MethodName GetSecurityDescriptor | Select-Object -ExpandProperty Descriptor

    # Drop existing ACEs for this exact account only (domain AND name must both match),
    # so ACEs for other accounts in the same domain are left alone
    $newDacls = @($sd.DACL | Where-Object {
        -not ($_.Trustee.Domain -eq $domain -and $_.Trustee.Name -eq $username)
    })

    $sd.DACL = $newDacls + $ace

    # Write the updated descriptor back to the namespace
    $retVal = Invoke-CimMethod -InputObject $systemSecurity -MethodName SetSecurityDescriptor -Arguments @{
        Descriptor = $sd
    }

    if ($retVal.ReturnValue -ne 0) {
        throw "SetSecurityDescriptor failed with $($retVal.ReturnValue)"
    }
}
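
For reference, the 0x22 passed as AccessMask in the script is a bit mask of WMI namespace access rights. The Python sketch below is not part of the script; it only decodes such a mask. The bit values are the namespace access right constants as I understand them from the WMI documentation, and the names WmiNamespaceRight and describe are made up for illustration.

```python
from enum import IntFlag


class WmiNamespaceRight(IntFlag):
    """WMI namespace access rights (illustrative names, documented bit values)."""
    ENABLE_ACCOUNT = 0x1
    METHOD_EXECUTE = 0x2
    FULL_WRITE = 0x4
    PARTIAL_WRITE = 0x8
    PROVIDER_WRITE = 0x10
    REMOTE_ENABLE = 0x20
    READ_SECURITY = 0x20000
    EDIT_SECURITY = 0x40000


def describe(mask):
    """Return the names of all rights whose bit is set in mask."""
    return [right.name for right in WmiNamespaceRight if mask & right]


if __name__ == "__main__":
    # The mask used in the script above
    print(describe(0x22))
```

Read this way, 0x22 covers Execute Methods and Remote Enable but not Enable Account (0x1); a mask that also included Enable Account would be 0x23.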

NickCraver commented 8 years ago

@mmillican Did the above permissions get you going?

mmillican commented 8 years ago

@NickCraver I seem to have missed the last two comments on this. We do have some parts working, but I'll have to pass the above post on to our sys admin to see if it helps. I'll try to get an answer for you later this week.

IsaackRasmussen commented 8 years ago

@mmillican thanks!

Your two steps with the powershell script made it work for me.

blyry commented 7 years ago

Adding these lines to the above script will also take care of the local Performance Log Users group.

    $GroupObj = [ADSI]"WinNT://$env:COMPUTERNAME/Performance Log Users"
    $GroupObj.Add("WinNT://$domain/$username")

mmillican commented 7 years ago

@NickCraver Quick update: I manually added my ops user to Performance Log Users and Performance Monitor Users on a few of our servers and am seeing more info such as CPU usage and history. Still not sure I'm seeing everything (as I don't have anything to compare against), but there's some progress.

NickCraver commented 7 years ago

@mmillican Did you ever get this working? I'll be adding GitHub Pages to the repo for docs with the v2 release (overhaul -> master merge), and documenting what's needed here would be awesome. Of course, you'll be able to PR those docs too.

mmillican commented 7 years ago

@NickCraver I had a little more success with granting more permissions. I'll look at my notes at work tomorrow and get them over to you. Thanks!

mmillican commented 7 years ago

@NickCraver Sorry for the delay, again.

I was able to do number 1 of afscrome's post above manually, but for some reason I wasn't able to run number two (due to permissions IIRC). I will try again tomorrow on one of my dev servers. (I made a reminder in my calendar).

mmillican commented 7 years ago

@NickCraver I'm not able to run the script due to Execution Policy and I don't have permission to change it. Will have to ask a sys-admin to give me permissions and/or run it today and will get back to you then.

afscrome commented 7 years ago

You should be able to change the execution policy for your own user without admin rights. (By default, Set-ExecutionPolicy changes the machine-level policy, which does require admin rights.)

 Set-ExecutionPolicy Unrestricted -Scope CurrentUser

mmillican commented 7 years ago

@afscrome Thanks! That worked. When running the script, I got this message though:

[server_name] Connecting to remote server server_name failed with the following error message : Access is denied.

afscrome commented 7 years ago

Try running the contents of the Invoke-Command block directly on each machine, rather than using PowerShell remoting to do it.

mmillican commented 7 years ago

Assuming (I'm somewhat unfamiliar with PS) I just comment out the 1st and last lines (Invoke-Command... and the closing }), I'm still getting an access denied error.

afscrome commented 7 years ago

Oh sorry, you'll definitely need admin rights to do those.

mmillican commented 7 years ago

No problem, thanks for the help! I'll ask a sysadmin to do it and let you know what happens.

mmillican commented 7 years ago

@NickCraver @afscrome I was finally able to run the script on a dev server and I don't see any additional information from when I added the user to the "Performance Log Users" group.

To clarify, here's what I see: image

mmillican commented 7 years ago

Just a quick update, I'm trying to debug this locally to figure out where the disconnect may be from what I think I should be seeing based on what's in the Razor view.

afscrome commented 7 years ago

So I think something may have changed recently. I've just upgraded to the latest build, and now I'm no longer getting CPU stats, and I'm seeing a wall of errors (which continually flash in and out).

image

Even more interestingly, when I run under my account rather than the service account I get different errors ("access denied" and "invalid class"). The access denied is very interesting since I'm a domain admin...

Stack traces of the errors are below; I'm looking into this some more.

web3-Static: Unable to fetch from WMI: @
    at System.Runtime.InteropServices.Marshal.ThrowExceptionForHRInternal(Int32 errorCode, IntPtr errorInfo)
    at System.Management.ManagementObjectCollection.ManagementObjectEnumerator.MoveNext()
    at System.Linq.Enumerable.d__94`1.MoveNext()
    at System.Linq.Enumerable.FirstOrDefault[TSource](IEnumerable`1 source)
    at StackExchange.Opserver.Monitoring.Wmi.WmiQuery.d__10.MoveNext() in C:\git\Opserver\Opserver.Core\Monitoring\Wmi.cs:line 130 --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.CompilerServices.ConfiguredTaskAwaitable`1.ConfiguredTaskAwaiter.GetResult()
    at StackExchange.Opserver.Data.Dashboard.Providers.WmiDataProvider.WmiNode.d__42.MoveNext() in C:\git\Opserver\Opserver.Core\Data\Dashboard\Providers\WmiDataProvider.Polling.cs:line 492 --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.CompilerServices.ConfiguredTaskAwaitable`1.ConfiguredTaskAwaiter.GetResult()
    at StackExchange.Opserver.Data.Dashboard.Providers.WmiDataProvider.WmiNode.d__33.MoveNext() in C:\git\Opserver\Opserver.Core\Data\Dashboard\Providers\WmiDataProvider.Polling.cs:line 124 --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.CompilerServices.ConfiguredTaskAwaitable.ConfiguredTaskAwaiter.GetResult()
    at StackExchange.Opserver.Data.Dashboard.Providers.WmiDataProvider.WmiNode.d__31.MoveNext() in C:\git\Opserver\Opserver.Core\Data\Dashboard\Providers\WmiDataProvider.Polling.cs:line 37 --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
    at StackExchange.Opserver.Data.Cache`1.<>c__DisplayClass26_0.<<-ctor>b__0>d.MoveNext() in C:\git\Opserver\Opserver.Core\Data\Cache.cs:line 174
sql2014-Static: Unable to fetch from WMI: Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED)) @
    at System.Runtime.InteropServices.Marshal.ThrowExceptionForHRInternal(Int32 errorCode, IntPtr errorInfo)
    at System.Management.ManagementScope.InitializeGuts(Object o)
    at System.Management.ManagementScope.Initialize()
    at System.Management.ManagementObjectSearcher.Initialize()
    at System.Management.ManagementObjectSearcher.Get()
    at StackExchange.Opserver.Monitoring.Wmi.WmiQuery.b__8_0() in C:\git\Opserver\Opserver.Core\Monitoring\Wmi.cs:line 116
    at System.Threading.Tasks.Task`1.InnerInvoke()
    at System.Threading.Tasks.Task.Execute() --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.CompilerServices.ConfiguredTaskAwaitable`1.ConfiguredTaskAwaiter.GetResult()
    at StackExchange.Opserver.Monitoring.Wmi.WmiQuery.d__10.MoveNext() in C:\git\Opserver\Opserver.Core\Monitoring\Wmi.cs:line 130 --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.CompilerServices.ConfiguredTaskAwaitable`1.ConfiguredTaskAwaiter.GetResult()
    at StackExchange.Opserver.Data.Dashboard.Providers.WmiDataProvider.WmiNode.d__33.MoveNext() in C:\git\Opserver\Opserver.Core\Data\Dashboard\Providers\WmiDataProvider.Polling.cs:line 86 --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.CompilerServices.ConfiguredTaskAwaitable.ConfiguredTaskAwaiter.GetResult()
    at StackExchange.Opserver.Data.Dashboard.Providers.WmiDataProvider.WmiNode.d__31.MoveNext() in C:\git\Opserver\Opserver.Core\Data\Dashboard\Providers\WmiDataProvider.Polling.cs:line 37 --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
    at StackExchange.Opserver.Data.Cache`1.<>c__DisplayClass26_0.<<-ctor>b__0>d.MoveNext() in C:\git\Opserver\Opserver.Core\Data\Cache.cs:line 174

and

dc1-Static: Unable to fetch from WMI: Invalid class @
    at System.Management.ManagementException.ThrowWithExtendedInfo(ManagementStatus errorCode)
    at System.Management.ManagementObjectCollection.ManagementObjectEnumerator.MoveNext()
    at System.Linq.Enumerable.d__94`1.MoveNext()
    at System.Linq.Enumerable.FirstOrDefault[TSource](IEnumerable`1 source)
    at StackExchange.Opserver.Monitoring.Wmi.WmiQuery.d__10.MoveNext() in C:\git\Opserver\Opserver.Core\Monitoring\Wmi.cs:line 130 --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.CompilerServices.ConfiguredTaskAwaitable`1.ConfiguredTaskAwaiter.GetResult()
    at StackExchange.Opserver.Data.Dashboard.Providers.WmiDataProvider.WmiNode.d__36.MoveNext() in C:\git\Opserver\Opserver.Core\Data\Dashboard\Providers\WmiDataProvider.Polling.cs:line 315 --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.CompilerServices.ConfiguredTaskAwaitable.ConfiguredTaskAwaiter.GetResult()
    at StackExchange.Opserver.Data.Dashboard.Providers.WmiDataProvider.WmiNode.d__32.MoveNext() in C:\git\Opserver\Opserver.Core\Data\Dashboard\Providers\WmiDataProvider.Polling.cs:line 65 --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.CompilerServices.ConfiguredTaskAwaitable`1.ConfiguredTaskAwaiter.GetResult()
    at StackExchange.Opserver.Data.Dashboard.Providers.WmiDataProvider.WmiNode.d__31.MoveNext() in C:\git\Opserver\Opserver.Core\Data\Dashboard\Providers\WmiDataProvider.Polling.cs:line 44 --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
    at StackExchange.Opserver.Data.Cache`1.<>c__DisplayClass26_0.<<-ctor>b__0>d.MoveNext() in C:\git\Opserver\Opserver.Core\Data\Cache.cs:line 174
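
One thing that may help when reading these: the HRESULT in the access denied trace (0x80070005) can be split into a facility and an error code. The Python sketch below does that split; decode_hresult and FACILITY_NAMES are illustrative names of my own. My understanding, which I have not verified against these servers, is that facility 7 (Win32) with code 5 points at a denial at the DCOM / OS level, whereas a denial from the WMI namespace itself would normally surface as 0x80041003.

```python
def decode_hresult(hr):
    """Split a 32-bit HRESULT into (failed, facility, code)."""
    hr &= 0xFFFFFFFF               # also accept negative signed values
    failed = bool(hr >> 31)        # top bit is the severity (1 = failure)
    facility = (hr >> 16) & 0x7FF  # 11-bit facility field
    code = hr & 0xFFFF             # low 16 bits are the error code
    return failed, facility, code


# Only the two facilities relevant here; others fall back to the raw number
FACILITY_NAMES = {4: "FACILITY_ITF", 7: "FACILITY_WIN32"}

if __name__ == "__main__":
    for hr in (0x80070005, 0x80041003):
        failed, facility, code = decode_hresult(hr)
        print(hex(hr), FACILITY_NAMES.get(facility, facility), hex(code))
```

If that reading is right, the sql2014 error would be about the connection permission (step 1 in my earlier comment) rather than the namespace permission (step 2).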

mmillican commented 7 years ago

I'm late to the error party (admission, I should have found this sooner), but I found some Access Denied errors from the WMI provider. Unfortunately I don't know what node(s) these are for, so I will continue debugging this.

afscrome commented 7 years ago

I've got this working in my local environment now and fixed a bug that was causing CPU metrics to not display. Can you try #252? At the very least, that will update exceptions to include the machine name of the server.

mmillican commented 7 years ago

You beat me to the machine name logging! I'll give it a try later tonight. Thanks!

mmillican commented 7 years ago

It appears my current errors are Access Denied across a variety of our servers. Will start poking shortly.

mcsaunders commented 6 years ago

I just wanted to add my experience in trying to get non-admin permissions configured in case it's of any help (spoiler: I couldn't get it working). The steps taken were:

  1. Added the service account to the Performance Log Users and Performance Monitor Users local user groups (Performance Log Users grants required DCOM permissions)
  2. Granted WMI permissions using the script above
  3. Extended the WMI permissions to grant Enable Account and also apply to This namespace and subnamespaces.
  4. Per several different articles pointing at KB907460, ran sc sdset SCMANAGER D:(A;;CCLCRPRC;;;AU)(A;;CCLCRPWPRC;;;SY)(A;;KA;;;BA)S:(AU;FA;KA;;;WD)(AU;OIIOFA;GA;;;WD) to grant permissions on the Windows Service Control Manager (without this no service info was returned).
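
To make the sc sdset string in step 4 easier to read, here is a rough Python sketch that splits the D: (DACL) part into its ACEs. The mapping from two-letter SDDL right codes to Service Control Manager rights is my own reading of the bit values, and parse_dacl and both lookup tables are illustrative names, not anything from Opserver or Windows tooling. It only handles two-letter right codes, not hex masks.

```python
import re

# Assumed mapping of SDDL right codes to SCM access rights (by bit value)
SCM_RIGHTS = {
    "CC": "SC_MANAGER_CONNECT",
    "DC": "SC_MANAGER_CREATE_SERVICE",
    "LC": "SC_MANAGER_ENUMERATE_SERVICE",
    "SW": "SC_MANAGER_LOCK",
    "RP": "SC_MANAGER_QUERY_LOCK_STATUS",
    "WP": "SC_MANAGER_MODIFY_BOOT_CONFIG",
    "RC": "READ_CONTROL",
    "KA": "ALL_ACCESS",
}

# Well-known SID abbreviations used in the string above
TRUSTEES = {
    "AU": "Authenticated Users",
    "SY": "Local System",
    "BA": "Built-in Administrators",
    "WD": "Everyone",
}


def parse_dacl(sddl):
    """Return (ace_type, trustee, [rights]) for each ACE in the D: section."""
    match = re.search(r"D:(.*?)(?:S:|$)", sddl)
    dacl = match.group(1) if match else ""
    aces = []
    for body in re.findall(r"\(([^)]*)\)", dacl):
        # ACE layout: type;flags;rights;object_guid;inherit_object_guid;sid
        ace_type, _flags, rights, _obj, _inherit_obj, sid = body.split(";")
        codes = [rights[i:i + 2] for i in range(0, len(rights), 2)]
        aces.append((ace_type,
                     TRUSTEES.get(sid, sid),
                     [SCM_RIGHTS.get(code, code) for code in codes]))
    return aces


if __name__ == "__main__":
    sddl = ("D:(A;;CCLCRPRC;;;AU)(A;;CCLCRPWPRC;;;SY)(A;;KA;;;BA)"
            "S:(AU;FA;KA;;;WD)(AU;OIIOFA;GA;;;WD)")
    for ace in parse_dacl(sddl):
        print(ace)
```

The first ACE is the relevant one: as far as I can tell, it lets Authenticated Users connect to the Service Control Manager and enumerate services, which matches service info only appearing after the change.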

I was then able to use both WBEM and WMI Explorer with the service account credentials to remotely query WMI, in effect proving the permissions worked, and yet Opserver is still not showing everything it should. It's slightly confusing, as some of the info is populated, and some isn't.

image

Making the service account local admin resolves these issues so that's how I'm going to proceed. Thanks.

pkunze commented 5 years ago

I am experiencing the same behaviour as @mcsaunders. I am able and willing to provide more info if that helps resolve the issue. Is there anything I can do to help?