dsccommunity / SChannelDsc

MIT License
Cipher: Consuming a lot of CPU #22

Closed gwendolinemaksic closed 1 year ago

gwendolinemaksic commented 3 years ago

Details of the scenario you tried and the problem that is occurring

On some computers, CPU usage climbs to 100% after a few days whenever the Cipher resource is used.

This issue occurs on about a dozen servers.

We opened a ticket with Microsoft Premier. After investigation, we concluded that the CPU consumption is related to the SChannelDsc module and that we should contact the team in charge of this module.

Verbose logs showing the problem

Please see the attached document containing our exchanges with the Microsoft support team. sChannelDSC.docx

Suggested solution to the issue

The DSC configuration that is used to reproduce the issue (as detailed as possible)

Below is the DSC configuration in YAML format (as used by the Datum module):

SChannelDsc:
  Cipher:
    - Cipher: "Null"
      State: Disabled
    - Cipher: "RC4 40/128"
      State: Disabled
    - Cipher: "RC4 56/128"
      State: Disabled
    - Cipher: "RC4 64/128"
      State: Disabled
    - Cipher: "RC4 128/128"
      State: Disabled
    - Cipher: "DES 56/56"
      State: Disabled
    - Cipher: "Triple DES 168"
      State: Disabled
    - Cipher: "AES 128/128"
      State: Enabled
    - Cipher: "AES 256/256"
      State: Enabled
  Hash:
    - Hash: MD5
      State: Disabled
    - Hash: SHA
      State: Enabled
    - Hash: SHA256
      State: Enabled
    - Hash: SHA384
      State: Enabled
    - Hash: SHA512
      State: Enabled
  KeyExchangeAlgorithm:
    - KeyExchangeAlgorithm: Diffie-Hellman
      State: Default
    - KeyExchangeAlgorithm: ECDH
      State: Default
    - KeyExchangeAlgorithm: PKCS
      State: Default
  Protocol:
    - Protocol: "SSL 3.0"
      IncludeClientSide: True
      State: Disabled
    - Protocol: "TLS 1.0"
      IncludeClientSide: True
      State: Disabled
    - Protocol: "TLS 1.1"
      IncludeClientSide: True
      State: Disabled
    - Protocol: "TLS 1.2"
      IncludeClientSide: True
      State: Enabled
  SChannelSettings:
    - IsSingleInstance: Yes
      TLS12State: Enabled
      DiffieHellmanMinClientKeySize: 1024 
      DiffieHellmanMinServerKeySize: 1024 
      KerberosSupportedEncryptionType: 
        - AES128-HMAC-SHA1
        - AES256-HMAC-SHA1
      WinHttpDefaultSecureProtocols: TLS1.2
      EnableFIPSAlgorithmPolicy: False

The operating system the target node is running

OsName : Microsoft Windows Server 2019 Datacenter
OsOperatingSystemSKU : DatacenterServerEdition
OsArchitecture : 64-bit
WindowsVersion : 1809
WindowsBuildLabEx : 17763.1.amd64fre.rs5_release.180914-1434
OsLanguage : en-US
OsMuiLanguages : {en-US}

Version of Windows that is used (e.g. Windows Server 2016)

Microsoft Windows Server 2019 Datacenter with graphical user interface

Version and build of PowerShell the target node is running

PSVersion 5.1.17763.2183
PSEdition Desktop
PSCompatibleVersions {1.0, 2.0, 3.0, 4.0...}
BuildVersion 10.0.17763.2183
CLRVersion 4.0.30319.42000
WSManStackVersion 3.0
PSRemotingProtocolVersion 2.3
SerializationVersion 1.1.0.1

Version of the DSC module that was used

1.2.2

gaelcolas commented 3 years ago

Hi Gwendoline, sounds like it's not necessarily the resource but it could be the DSC process just doing its job (I've only had a quick glance at the exchange you had with MSFT).

The DSC service (the LCM) runs within the WMI process (sort of), so that's normal (and the people you talked to don't seem to know much about DSC). Can you check whether your DSC configuration is compliant (Get-DscConfigurationStatus, or look for the JSON file on your machine)? I suspect it may just take too much CPU to run it every 15 minutes, and you may want to configure the LCM to run every hour or so.

gaelcolas commented 3 years ago

By that I meant the DSC resource just calls the underlying Windows command, which might be too resource-intensive to be run every 15 minutes as per the LCM's default configuration. So if the DSC resource does what it's supposed to do but consumes too much CPU, my guess is it's not the fault of DSC or the resource, and the right approach might be to run those checks less frequently.
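
For illustration, here is a minimal sketch of an LCM meta-configuration that lowers the consistency-check frequency to once an hour. The configuration name `LcmHourlyCheck`, the output path, and the value 60 are assumptions chosen for the example, not something taken from this environment. On a node that already has a meta-configuration (for example, one registered with a pull server), this setting would probably need to be merged into the existing meta-configuration rather than applied on its own.

```powershell
# Hypothetical meta-configuration: name, path and the 60-minute value are illustrative.
[DSCLocalConfigurationManager()]
configuration LcmHourlyCheck
{
    Node 'localhost'
    {
        Settings
        {
            # The LCM default is 15 minutes between consistency checks.
            ConfigurationModeFrequencyMins = 60
        }
    }
}

# Compile the meta-MOF and apply it to the local LCM (requires an elevated session).
LcmHourlyCheck -OutputPath 'C:\Temp\LcmHourlyCheck'
Set-DscLocalConfigurationManager -Path 'C:\Temp\LcmHourlyCheck' -Verbose

# Confirm what the LCM is now using.
Get-DscLocalConfigurationManager |
    Select-Object ConfigurationMode, ConfigurationModeFrequencyMins, RefreshFrequencyMins
```

If the CPU spikes line up with the consistency-check interval, a longer interval should at least make them less frequent. It would not explain or fix a run that never finishes.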

gwendolinemaksic commented 3 years ago

Hello Gael,

The Local Configuration Manager usually gets stuck shortly after the last reboot. Today, it had been stuck since the 24th. The Get-DscConfigurationStatus command returns the following message:

Error Message: Cannot invoke the Get-DscConfigurationStatus cmdlet. The Consistency Check or Pull cmdlet is in progress and must return before Get-DscConfigurationStatus can be invoked. Use -Force option if that is available to cancel the current operation.

We always have to kill the WMI Provider Host to regain control of the Local Configuration Manager. We can provide the JSON file, but we will not attach it to this issue because it contains sensitive information.

From our point of view, the DSC configuration is not just taking longer than expected; it is stuck indefinitely. In the present case, the latest JSON file is dated Friday, September 24th.
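
For reference, a sketch of how the stuck run can be interrupted without stopping every WMI Provider Host on the machine. It first tries `Stop-DscConfiguration -Force`, then falls back to the process lookup described in the DSC troubleshooting documentation. Whether the first step is enough in this particular hang is an assumption. Both steps need an elevated session.

```powershell
# Step 1: ask the LCM to cancel the running operation.
Stop-DscConfiguration -Force -Verbose

# Step 2 (fallback): find the WmiPrvSE process that hosts the DSC providers
# and stop only that process.
$cimParameters = @{
    ClassName = 'Msft_Providers'
    Filter    = "provider='dsctimer' OR provider='dsccore'"
}
$dscProcessId = Get-CimInstance @cimParameters |
    Select-Object -ExpandProperty HostProcessIdentifier -Unique

if ($dscProcessId) {
    Get-Process -Id $dscProcessId | Stop-Process -Force
}
```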

As soon as we disable the SChannelDSC module, everything is working fine again.

ykuijs commented 2 years ago

Hi @gwendolinemaksic, after reviewing the case notes you shared, I am not entirely sure what is causing this issue. The Cipher resource only updates a registry key and isn’t doing anything fancy, just like the other resources of SChannelDsc. I have tried to reproduce the issue, but unfortunately wasn’t able to do so. That is why I am looking for a little more information.
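
To check on an affected node whether the ciphers are already in the desired state, the sketch below reads the values directly from the standard SCHANNEL registry location. It does not reproduce the module's own code, and the list of cipher names is just a sample from the configuration above. The .NET registry API is used because the key names contain a forward slash, which the PowerShell registry provider treats as a path separator.

```powershell
# Read-only check of the SCHANNEL cipher keys; the cipher list is an illustrative sample.
$basePath = 'SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Ciphers'
$ciphers  = 'RC4 128/128', 'Triple DES 168', 'AES 128/128', 'AES 256/256'

foreach ($cipher in $ciphers) {
    # OpenSubKey only splits on backslashes, so "AES 128/128" stays one key name.
    $key = [Microsoft.Win32.Registry]::LocalMachine.OpenSubKey("$basePath\$cipher")
    if ($null -eq $key) {
        '{0}: key not present (OS default applies)' -f $cipher
        continue
    }
    try {
        # 'Enabled' is a DWORD; 0 means the cipher is disabled.
        '{0}: Enabled = {1}' -f $cipher, $key.GetValue('Enabled')
    }
    finally {
        $key.Dispose()
    }
}
```

If the values already match the configuration, the consistency check should have nothing to change on each run.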

You mentioned that you are also using Datum. Can you share a little more information about the Datum configuration, like the Composite resources you are using and the contents of the datum.yml and datum structure for the SChannel resources? And maybe also parts of the created MOF files that are relevant for SChannelDsc?

johlju commented 1 year ago

Closing this as per previous comment and no more information was provided. Feel free to reopen if it is possible to reproduce it.