lordmilko / PrtgAPI

C#/PowerShell interface for PRTG Network Monitor
MIT License

Windows update KB4345418 on Windows 10 breaks Get-Channel #30

Closed Gladiator10864 closed 6 years ago

Gladiator10864 commented 6 years ago

I had been running a script for a while without issues until I patched my workstation with KB4345418. Since that patch, anything using Get-Channel has produced this error:

Get-Channel : An error occurred while attempting to deserialize XML element 'injected_showchart' to property 'ShowInGraph': cannot assign 'null' to value type 'Boolean'.
At line:1 char:51
+ ... -Device servertest1 | Get-Sensor "SNMP CPU*" | Get-Channel * -Verbose
+                                                    ~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [Get-Channel], XmlDeserializationException
    + FullyQualifiedErrorId : PrtgAPI.XmlDeserializationException,PrtgAPI.PowerShell.Cmdlets.GetChannel

Microsoft has acknowledged that this patch breaks some COM components (https://support.microsoft.com/en-us/help/4345418/windows-10-update-kb4345418). However, I'm unsure how this actually affects the PrtgAPI.

I can confirm that rolling back this patch resolved my issues, though. I have not yet attempted their workaround fix with KB4346877, which was released on July 30th (https://support.microsoft.com/en-us/help/4346877/july-30-2018-kb4346877-os-build-14393-2396)

Are you able to confirm that my issues are indeed from this Windows patch, or is it some other odd coincidence? I am running the latest PrtgAPI version.

lordmilko commented 6 years ago

Hi @Gladiator10864,

The implication here is either that you have a sensor that somehow does not have a Show In Graphs setting in one of its channels, or that the Windows update somehow broke something in the .NET Framework

Looking at the output from Get-Channel -Verbose in your script you should be able to see the sensor ID and channel ID of the channel that is causing PrtgAPI to crash

Can you please open the settings dialog of this channel and confirm

a. Whether it has a Graph Rendering setting
b. What its value is
c. Whether this affects Get-Channel on any sensor, or just SNMP CPU ones

image

I will install this update myself and see what happens

lordmilko commented 6 years ago

I tried very hard to install KB4345418 on several Windows 10 1607 x64 VMs with various prior rollups installed, however could not get it to install properly - it kept failing and rolling back

I was able to install KB4340917 on Windows 10 1803 x64 which is supposed to contain the same issue, however was able to do Get-Channel on all sensors on my server

While installing the fix for this update may resolve the issue, I am interested in confirming whether there is in fact anything wrong with PrtgAPI

Step 1

Confirm whether the issue affects all sensors (or a random sampling thereof)

Step 2

On a given sensor known to cause issues with Get-Channel, confirm whether it a. has a Graph Rendering property and b. what its value is (as per the screenshot above)

Step 3

Manually invoke the internal GetChannelProperties method on the PrtgClient that returns the XML calculated from executing and parsing the web request

First, verify that it affects the first sensor returned by PRTG (since this is what all the code below uses)

Get-Sensor -count 1 | Get-Channel # Confirm this crashes

Then execute the following

$sensor = Get-Sensor -count 1

$client = get-prtgclient

$method = $client.GetType().GetMethod("GetChannelProperties", [Reflection.BindingFlags]::Instance -bor [Reflection.BindingFlags]::NonPublic)

$method.Invoke($client, [object[]]@($sensor.Id, 0)).ToString()

An expected response is as follows

<properties>
  <injected_showchart>1</injected_showchart>
  <injected_show>1</injected_show>
  <injected_colmode>0</injected_colmode>
  <injected_color></injected_color>
  <injected_linewidth>1</injected_linewidth>
  <injected_percent>0</injected_percent>
  <injected_ref100percent></injected_ref100percent>
  <injected_ref100percent_factor>8E-6</injected_ref100percent_factor>
  <injected_decimalmode>0</injected_decimalmode>
  <injected_decimaldigits>2</injected_decimaldigits>
  <injected_spikemode>0</injected_spikemode>
  <injected_spikemax></injected_spikemax>
  <injected_spikemax_factor>8E-6</injected_spikemax_factor>
  <injected_spikemin></injected_spikemin>
  <injected_spikemin_factor>8E-6</injected_spikemin_factor>
  <injected_axismode>0</injected_axismode>
  <injected_axismax></injected_axismax>
  <injected_axismax_factor>8E-6</injected_axismax_factor>
  <injected_axismin></injected_axismin>
  <injected_axismin_factor>8E-6</injected_axismin_factor>
  <injected_limitmode>0</injected_limitmode>
  <injected_limitmaxerror></injected_limitmaxerror>
  <injected_limitmaxerror_factor>8E-6</injected_limitmaxerror_factor>
  <injected_limitmaxwarning></injected_limitmaxwarning>
  <injected_limitmaxwarning_factor>8E-6</injected_limitmaxwarning_factor>
  <injected_limitminwarning></injected_limitminwarning>
  <injected_limitminwarning_factor>8E-6</injected_limitminwarning_factor>
  <injected_limitminerror></injected_limitminerror>
  <injected_limitminerror_factor>8E-6</injected_limitminerror_factor>
  <injected_limiterrormsg></injected_limiterrormsg>
  <injected_limitwarningmsg></injected_limitwarningmsg>
</properties>

While PowerShell seems to mess up the values of some of the elements (all those "8E-6" values are "1") we can see that many fields have values of some kind, and that the injected_showchart field is at the top of the list...meaning if they are all empty, it will be the first to fail! Do any elements have values for you, or are they all empty?
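A quick way to answer that question is to list only the elements that carry a value. The following is a minimal sketch, assuming the XML string returned by the Step 3 command has been saved to a variable; the function name Get-NonEmptyProperty is hypothetical and not part of PrtgAPI.

```powershell
# Hypothetical helper (not part of PrtgAPI): returns the names of all child
# elements of the root element that have a non-empty value.
function Get-NonEmptyProperty {
    param([string]$Xml)

    $doc = [xml]$Xml

    foreach ($node in $doc.DocumentElement.ChildNodes) {
        # Empty elements such as <injected_color></injected_color> are skipped
        if ($node.InnerText -ne '') {
            $node.LocalName
        }
    }
}

# Usage, assuming $sensor, $client and $method were set up as in Step 3:
# $xml = $method.Invoke($client, [object[]]@($sensor.Id, 0)).ToString()
# Get-NonEmptyProperty $xml
```

If nothing is printed, every field is empty (or the response contained no fields at all), which is the case Step 4 deals with.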

Step 4

If all of the fields are empty, this implies either the request is returning nothing, or the response is formatted slightly differently, causing the regular expressions that extract the properties to fail to match anything
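To illustrate how a response without the expected tags ends up as an empty property list, here is a simplified sketch. It is not PrtgAPI's actual parsing code: the function name ConvertTo-PropertyXml, the regular expression and the sample HTML are all assumptions made for illustration, and the real page markup is more involved.

```powershell
# Hypothetical, simplified stand-in for the parsing step; the regular
# expressions PrtgAPI really uses are not shown in this thread.
function ConvertTo-PropertyXml {
    param([string]$Html)

    $doc = [xml]'<properties />'

    # Assumes each <input> tag has name="..." followed later by value="..."
    $pattern = '<input\b[^>]*?\bname="(?<name>[^"]+)"[^>]*?\bvalue="(?<value>[^"]*)"'

    foreach ($m in [regex]::Matches($Html, $pattern)) {
        $el = $doc.CreateElement('injected_' + $m.Groups['name'].Value)
        $el.InnerText = $m.Groups['value'].Value
        [void]$doc.DocumentElement.AppendChild($el)
    }

    $doc.OuterXml
}

# A page containing an <input> tag yields an element for it
ConvertTo-PropertyXml '<input type="radio" name="showchart" value="1" checked>'

# A page with no <input> tags yields an empty root element
ConvertTo-PropertyXml 'Limit settings rendered as plain text'
```

With no matches the root element stays empty, so a later attempt to read injected_showchart as a Boolean has nothing to read.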

In this case, running the following code will generate a file C:\output.htm that contains the raw response that was returned from the API, as seen by the PrtgAPI engine (as opposed to executing the API request with Invoke-WebRequest, which may format things differently)

$client = get-prtgclient

$sensor = get-sensor -count 1

$privateFlags = [Reflection.BindingFlags]::Instance -bor [Reflection.BindingFlags]::NonPublic

# Get the method
$requestEngine = $client.GetType().GetField("requestEngine", $privateFlags).GetValue($client)
$methods = $requestEngine.GetType().GetMethods($privateFlags)|where { $_.returntype -eq [string] }
$method = $methods|where { $_.GetParameters() | where { $_.ParameterType.ToString() -eq "PrtgAPI.HtmlFunction" } }

# Get the function
$enumType = $client.GetType().Assembly.GetType("PrtgAPI.HtmlFunction")
$values = [enum]::GetValues($enumType)
$function = $values|where { $_.ToString() -eq "ChannelEdit" }

# Get the parameters
$parameterType = $client.GetType().Assembly.GetType("PrtgAPI.Parameters.ChannelPropertiesParameters")
$parameterObj = [activator]::CreateInstance($parameterType, [object[]]@($sensor.Id, 0))

# Get the response
[IO.File]::WriteAllText("C:\output.htm", $method.Invoke($requestEngine, [object[]]@($function.PSObject.BaseObject, $parameterObj, $null)))

If you can verify this file does not contain any sensitive information (it doesn't on my system) and attach the file to this issue, I can get PrtgAPI to try and parse this file on my system and see whether the .NET HttpClient class potentially formatted the response differently (which is the only thing I can think of that would use a COM component internally), causing the effects that we are seeing

Gladiator10864 commented 6 years ago

Thanks for the detailed response.

To summarize a few of my findings from yesterday after posting:

Windows 10 Build 14393 before KB4345418: worked fine
Windows 10 Build 14393 after KB4345418: broken, produced the above error
Windows 10 Build 14393 after removal of KB4345418: worked fine again
Windows 10 Build 14393 after installing KB4346877 on top of KB4345418: still broken
Server 2016, fully patched just before testing: works fine

All tests used the same command, "Get-Device servertest1.* | Get-Sensor "Ping" | Get-Channel *". I've also tested many other sensors piped to Get-Channel, all of which produce the same error.

My process from the failing workstation with KB4345418:

PS H:\> Get-Device servertest1.* -Verbose
VERBOSE: Get-Device: Synchronously executing request https://myurl.edu/api/table.xml?content=devices&columns=location,host,group,probe,favorite,condition,upsens,downsens,downacksens,partialdownsens,warnsens,pausedsens,unusualsens,undefinedsens,totalsens,schedule,basetype,baselink,parentid,notifiesx,interval,intervalx,access,dependency,position,status,comments,priority,message,type,tags,active,objid,name&count=*&filter_name=@sub(servertest1.)&username=myuser&passhash=mypasshash

Name                        Id     Status      Host            Sensors    Group                     Probe                    
----                        --     ------      ----            -------    -----                     -----                    
servertest1                 42608  Up          servertest1.... 12         testing                    PRTGPRODEV01             

PS H:\> Get-Device servertest1.* | Get-Sensor "Ping" -Verbose
VERBOSE: Get-Sensor: Synchronously executing request https://myurl.edu/api/table.xml?content=sensors&columns=probe,group,favorite,lastvalue,device,downtime,downtimetime,downtimesince,uptime,uptimetime,uptimesince,knowntime,cumsince,lastcheck,lastup,lastdown,minigraph,schedule,basetype,baselink,parentid,notifiesx,interval,intervalx,access,dependency,position,status,comments,priority,message,type,tags,active,objid,name&count=*&filter_name=Ping&filter_parentid=42608&username=myuser&passhash=mypasshash

Name                         Id     Device               Group                     Probe                 Status        
----                         --     ------               -----                     -----                 ------        
Ping                         42609  servertest1          testing                    PRTGPRODEV01          Up            

PS H:\> Get-Device servertest1.* | Get-Sensor "Ping" | Get-Channel * -Verbose
VERBOSE: Get-Channel: Synchronously executing request https://myurl.edu/api/table.xml?content=channels&columns=lastvalue,objid,name&count=*&id=42609&username=myuser&passhash=mypasshash
VERBOSE: Get-Channel: Synchronously executing request https://myurl.edu/controls/channeledit.htm?id=42609&channel=2&username=myuser&passhash=mypasshash
VERBOSE: Get-Channel: Synchronously executing request https://myurl.edu/controls/channeledit.htm?id=42609&channel=3&username=myuser&passhash=mypasshash
VERBOSE: Get-Channel: Synchronously executing request https://myurl.edu/controls/channeledit.htm?id=42609&channel=0&username=myuser&passhash=mypasshash
VERBOSE: Get-Channel: Synchronously executing request https://myurl.edu/controls/channeledit.htm?id=42609&channel=1&username=myuser&passhash=mypasshash
Get-Channel : An error occurred while attempting to deserialize XML element 'injected_showchart' to property 'ShowInGraph': cannot assign 'null' to value type 'Boolean'.
At line:1 char:48
+ Get-Device servertest1.* | Get-Sensor "Ping" | Get-Channel * -Verbose
+                                                ~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [Get-Channel], XmlDeserializationException
    + FullyQualifiedErrorId : PrtgAPI.XmlDeserializationException,PrtgAPI.PowerShell.Cmdlets.GetChannel

Step 1

I first discovered this error by attempting to run a script against almost 600 devices/sensors. The Get-Channel failed on every single call.

Step 2

I can confirm that the "Ping Time" channel in the above example has Graph Rendering set to "Show in Graphs".

Step 3

Here's what I get:

PS H:\> Get-Sensor -Count 1 | Get-Channel
Get-Channel : An error occurred while attempting to deserialize XML element 'injected_showchart' to property 'ShowInGraph': cannot assign 'null' to value type 'Boolean'.
At line:1 char:23
+ Get-Sensor -Count 1 | Get-Channel
+                       ~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [Get-Channel], XmlDeserializationException
    + FullyQualifiedErrorId : PrtgAPI.XmlDeserializationException,PrtgAPI.PowerShell.Cmdlets.GetChannel

PS H:\> $sensor = Get-Sensor -count 1

PS H:\> $client = get-prtgclient

PS H:\> $method = $client.GetType().GetMethod("GetChannelProperties", [Reflection.BindingFlags]::Instance -bor [Reflection.BindingFlags]::NonPublic)

PS H:\> $method.Invoke($client, [object[]]@($sensor.Id, 0)).ToString()
<properties />

Step 4

GitHub doesn't support .htm file formats, so I've converted it to .txt for uploading: output.txt

Please let me know if there's anything more you'd like to see.

lordmilko commented 6 years ago

The response included in your file output.txt is totally incorrect on multiple levels. There is not a single normal <input tag included in this response, and the order of the items is wrong - the first item on the page should be Graph Rendering, not all the Limit properties

Can you please confirm what version of PRTG you're running?

Also are you running a 32-bit or 64-bit version of Windows?

Are you potentially able to confirm whether installing KB4346877 without prior installing KB4345418 causes the issue?

Can you then perform the following tests

Step 1

Copy and paste the following URL into your web browser (after adjusting the server, username and passhash)

https://myurl.edu/controls/channeledit.htm?id=42609&channel=0&username=myuser&passhash=mypasshash

Do you get a response with radio buttons and text fields as follows?

image

or do you just get a bunch of text

image

Step 2

Execute the command (after adjusting server, username and passhash)

Invoke-WebRequest "https://myurl.edu/controls/channeledit.htm?id=42609&channel=0&username=myuser&passhash=mypasshash"

Can you visibly see any <input fields with type="text" in the response? There should be one near the bottom

image

Step 3

Execute the command (after adjusting server, username and passhash)

(new-object System.Net.WebClient).DownloadString("https://myurl.edu/controls/channeledit.htm?id=42609&channel=0&username=myuser&passhash=mypasshash")

Can you visibly see any <input fields with type="text" in the response?

Step 4

Execute the command (after adjusting server, username and passhash)

(new-object System.Net.Http.HttpClient).getasync("https://myurl.edu/controls/channeledit.htm?id=42609&channel=0&username=myuser&passhash=mypasshash").Result.content.readasstringasync().result

Can you visibly see any <input fields with type="text" in the response? HttpClient is the method PrtgAPI uses internally to invoke web requests. Therefore I am interested to know whether there is any difference between the three methods of executing requests
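Rather than eyeballing three responses, the comparison from Steps 2 to 4 could be scripted. This is a rough sketch only: the function names Get-InputTagCount and Compare-ChannelEditResponse are hypothetical, and the URL has to be adjusted in the same way as in the steps above.

```powershell
# Needed for HttpClient on Windows PowerShell 5.1; assumed harmless elsewhere
Add-Type -AssemblyName System.Net.Http

# Hypothetical helper: counts opening <input tags, ignoring case
function Get-InputTagCount {
    param([string]$Html)

    ([regex]::Matches($Html, '<input\b', 'IgnoreCase')).Count
}

# Hypothetical helper: fetches the same URL with the three clients used in
# Steps 2 to 4 and reports how many <input tags each response contains
function Compare-ChannelEditResponse {
    param([string]$Url)

    $viaInvokeWebRequest = (Invoke-WebRequest -Uri $Url -UseBasicParsing).Content
    $viaWebClient = (New-Object System.Net.WebClient).DownloadString($Url)
    $viaHttpClient = (New-Object System.Net.Http.HttpClient).GetAsync($Url).Result.Content.ReadAsStringAsync().Result

    [pscustomobject]@{
        InvokeWebRequest = Get-InputTagCount $viaInvokeWebRequest
        WebClient        = Get-InputTagCount $viaWebClient
        HttpClient       = Get-InputTagCount $viaHttpClient
    }
}

# Usage (adjust server, sensor ID, username and passhash first):
# Compare-ChannelEditResponse "https://myurl.edu/controls/channeledit.htm?id=42609&channel=0&username=myuser&passhash=mypasshash"
```

If the three counts differ, the difference lies in how each client requests or decodes the page rather than in what the channel settings contain.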

Step 5

If possible, can you disable HTTPS on your PRTG server and then install and run Fiddler, then execute a failing Get-Channel request? This will allow you to inspect all HTTP traffic on your system, and allow us to inspect the raw response before it potentially gets mangled by the .NET Framework. Simply click on the request to channeledit.htm?id=42609&channel=0 on the left, then select the TextView in the middle of the screen on the right to view the response. Once again, I want to know whether there are any <input tags

Alternatively, you can potentially attempt to run Fiddler without disabling HTTPS on your PRTG server; you should be told upon clicking on an encrypted request you need to enable Fiddler's decryption capabilities, and then you can re-run the request. If that doesn't work however it's generally simpler to just target an unencrypted request

Gladiator10864 commented 6 years ago

@lordmilko My apologies but I got caught up with some other things at work today and didn't have time to dig into this any further. Unfortunately, I won't be back into the office until Monday either.

In the meantime I did get things working properly with the install of KB4346877 without KB4345418 at least.

Our current production version of PRTG is 18.2.41.1652, however the same issue was witnessed in our dev environment which I updated to latest yesterday.

All devices (Clients and servers) are running 64 bit Windows environments.

We've had so many other issues with .NET patches breaking applications lately that it's a joke. I'm convinced that it has to be something with this patch but I'd be happy to work with you on this further next week if you want to determine the root cause. I'll check back in a few days. Thanks for your assistance so far and enjoy your weekend! ;)

lordmilko commented 6 years ago

Thanks @Gladiator10864

Can you also confirm how exactly you're installing these updates?

I have performed multiple tests of installing Windows 10 1607, on both physical and virtual machines, both 32-bit and 64-bit. The furthest rollup I can possibly install appears to be the April 17 rollup, which was achieved by hiding the update to Windows 10 1803 and letting Windows Update automatically install every update that was available

Attempting to manually install the May 8 rollup or anything newer fails. As such, the furthest version of Windows 10 1607 I have been able to achieve is 14393.2214. The fact updates from May on aren't being offered over Windows Update to me indicates Microsoft knows there is some kind of compatibility issue with them; attempting to identify the reason these updates were reverting proved too frustrating

In addition to the tests above, I would also be interested to know if Get-Sensor -count 1 | Get-ObjectProperty -Raw returns any properties with values, as Get-ObjectProperty uses the exact same <input tag parsing technique Get-Channel does, albeit on a different web page.

From my research, the only explanation I've been able to find for the behavior we're seeing is that there might be something wrong with the content type of the request. Fiddler will allow us to inspect any anomalous headers/content types that may be getting injected here, with the end goal being an update to PrtgAPI to explicitly state the required parameters so a user does not experience issues even if they do have a dodgy update installed
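If Fiddler is not an option, the content type that HttpClient itself reports for the response can also be checked from PowerShell. This is only a sketch: the function name Get-ContentTypeInfo is hypothetical, and it shows what .NET parsed from the response headers, not the raw bytes on the wire that Fiddler would show.

```powershell
# Needed for HttpClient on Windows PowerShell 5.1; assumed harmless elsewhere
Add-Type -AssemblyName System.Net.Http

# Hypothetical helper: summarizes the status code and Content-Type header
# of a response returned by HttpClient
function Get-ContentTypeInfo {
    param([System.Net.Http.HttpResponseMessage]$Response)

    $contentType = $Response.Content.Headers.ContentType

    [pscustomobject]@{
        StatusCode = [int]$Response.StatusCode
        MediaType  = $contentType.MediaType
        CharSet    = $contentType.CharSet
    }
}

# Usage (adjust server, sensor ID, username and passhash first):
# $url = "https://myurl.edu/controls/channeledit.htm?id=42609&channel=0&username=myuser&passhash=mypasshash"
# $response = (New-Object System.Net.Http.HttpClient).GetAsync($url).Result
# Get-ContentTypeInfo $response
```

Running this on both a working and a failing machine and comparing the output would show whether the media type or character set differs between them.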

lordmilko commented 6 years ago

Hi @Gladiator10864,

Just following up on this. Have you had any progress with the above items?

If you are doing these tests on your primary work computer, this testing could potentially be too onerous. In that case, given we have successfully resolved your initial issue, we can close this issue until someone else reports the same problem in the future

lordmilko commented 6 years ago

Hi @Gladiator10864,

As I haven't heard back from you and this issue appears to have been caused by a Windows Update, I will close this issue for now. If I hear back from anyone in the future I will look into potentially reopening this issue to identify the root cause

Regards, lordmilko