dsccommunity / StorageDsc

DSC resource module is used to manage storage on Windows Servers.
https://dsccommunity.org
MIT License
70 stars 51 forks source link

Simple Formatting Drive Failing via StorageDsc 4.1.0.0 #172

Closed ronhowe closed 5 years ago

ronhowe commented 6 years ago

#### Details of the scenario you tried and the problem that is occurring

We have had the same implementation in DSC configuration for a long time and it has worked on many servers. In this last version of the DSC Resource Kit including StorageDsc 4.1.0.0 provisioning a new server, we're now getting errors formatting the E$ drive. The Disk 1 gets mounted and a volume is created. The drive actually does get formatted and later configuration is creating files and folders there. However, Test-* is failing for some reason so overall the configuration is deemed to be failing.

#### Verbose logs showing the problem

VERBOSE: [SAASMANAGE01]: LCM: [ Start Resource ] [[Disk]FormatEDrive] VERBOSE: [SAASMANAGE01]: LCM: [ Start Test ] [[Disk]FormatEDrive] VERBOSE: [SAASMANAGE01]: [[Disk]FormatEDrive] Test-TargetResource: Testing disk with Number '1' status for drive letter 'E'. VERBOSE: [SAASMANAGE01]: [[Disk]FormatEDrive] Test-TargetResource: Checking if disk with Number '1' is initialized. Failed Activity ID: {88a8fe9c-f006-47f4-b9d4-8b7516fb9d95}

VERBOSE: [SAASMANAGE01]: [[Disk]FormatEDrive] Perform operation 'Query CimInstances' with following parameters, ''queryExpression' = SELECT BlockSize from Win32_Volume WHERE DriveLetter = 'E:','queryDialect' = WQL,'namespaceName' = root\cimv2'. VERBOSE: [SAASMANAGE01]: [[Disk]FormatEDrive] Operation 'Query CimInstances' complete. VERBOSE: [SAASMANAGE01]: LCM: [ End Test ] [[Disk]FormatEDrive] in 1.0620 seconds. The PowerShell DSC resource '[Disk]FormatEDrive' with SourceInfo 'C:\VisualStudio\IDIBilling\IDI_IDI\Infrastructure\DesiredStateConfiguration\Implementations\IDI\Configuration\Configuration.psm1::1500::13::Disk' threw one or more non-terminating errors while running the Test-TargetResource functionality. These errors are logged to the ETW channel called Microsoft-Windows-DSC/Operational. Refer to this channel for more details.

#### The DSC configuration that is used to reproduce the issue (as detailed as possible)

        # Mounts hard disk 1, which is always expected to become E$.
        WaitforDisk "MountEDrive" {
            DiskId           = Resolve-DscConfigurationProperty -Node $Node -PropertyName "EDriveDiskId"
            RetryIntervalSec = Resolve-DscConfigurationProperty -Node $Node -PropertyName "WaitForDiskRetryIntervalSec"
            RetryCount       = Resolve-DscConfigurationProperty -Node $Node -PropertyName "WaitForDiskRetryCount"
            DependsOn        = @("[Script]DisableFloppyDisks", "[Script]DisableOpticalDisks")
        }

        # Formats the E$ drive.
        Disk "FormatEDrive" {
            DiskId             = Resolve-DscConfigurationProperty -Node $Node -PropertyName "EDriveDiskId"
            DriveLetter        = "E"
            FSLabel            = Resolve-DscConfigurationProperty -Node $Node -PropertyName "BaseSqlServerSettings\EDriveFileSystemLabel"
            AllocationUnitSize = Resolve-DscConfigurationProperty -Node $Node -PropertyName "BaseSqlServerSettings\EDriveAllocationUnitSize"
            DependsOn          = "[WaitforDisk]MountEDrive"
        }

        # PowerShell can error if new drives are added within the current session.
        # This resource implements a workaround for that to ensure the drive is available before future configuration.
        WaitForVolume "WaitForEDrive" {
            DriveLetter      = "E"
            RetryIntervalSec = Resolve-DscConfigurationProperty -Node $Node -PropertyName "WaitForDiskRetryIntervalSec"
            RetryCount       = Resolve-DscConfigurationProperty -Node $Node -PropertyName "WaitForDiskRetryCount"
            DependsOn        = "[Disk]FormatEDrive"
        }

#### The operating system the target node is running

OsName : Microsoft Windows Server 2016 Datacenter OsOperatingSystemSKU : DatacenterServerEdition OsArchitecture : 64-bit WindowsBuildLabEx : 14393.447.amd64fre.rs1_release_inmarket.161102-0100 OsLanguage : en-US OsMuiLanguages : {en-US}

#### Version and build of PowerShell the target node is running

Name Value


PSVersion 5.1.14393.206 PSEdition Desktop PSCompatibleVersions {1.0, 2.0, 3.0, 4.0...} BuildVersion 10.0.14393.206 CLRVersion 4.0.30319.42000 WSManStackVersion 3.0 PSRemotingProtocolVersion 2.3 SerializationVersion 1.1.0.1

#### Version of the DSC module that was used ('dev' if using current dev branch)

StorageDsc 4.1.0.0

PlagueHO commented 6 years ago

Hi @ronhowe - it looks like there is some problem with Get-PartitionSupportedSize that gets the supported partition size on the disk using CIM.

What is the result if you run the following PowerShell command on the DSC target machine:

Get-Partition -DiskNumber 1 | Get-PartitionSupportedSize
ronhowe commented 6 years ago
PS C:\Windows\system32> Get-Partition -DiskNumber 1 | Get-PartitionSupportedSize

  SizeMin   SizeMax
  -------   -------
134217728 134217728
Get-PartitionSupportedSize : Failed
Activity ID: {f301d1c0-32b0-47d3-9843-25a53e5cf0eb}
At line:1 char:31
+ Get-Partition -DiskNumber 1 | Get-PartitionSupportedSize
+                               ~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (StorageWMI:ROOT/Microsoft/.../MSFT_Partition) [Get-PartitionSupportedSize
   ], CimException
    + FullyQualifiedErrorId : StorageWMI 4,Get-PartitionSupportedSize
ronhowe commented 6 years ago

The same commands works on our QA box. `

PS C:\Windows\system32> Get-Partition -DiskNumber 1 | Get-PartitionSupportedSize

   SizeMin    SizeMax
   -------    -------
  33554432   33554432
3273650176 8555314688

They're the same OS, same PowerShell, same DSC configuration.

The only difference is the size of the E$. In QA it is ~8 GB. In Production it is ~50 GB.

ronhowe commented 6 years ago

More debugging.

I deleted the volume and offlined the disk.

I re-ran the current configuration and it completed successfully.

I re-ran the current configuration again and it now fails on that test of the drive.

To reiterate, it's actually implementing correctly. It appears to just be failing consistency tests after the fact.

ronhowe commented 6 years ago

Relevant DSC Operations entries in the event log.

Log Name: Microsoft-Windows-DSC/Operational Source: Microsoft-Windows-DSC Date: 9/18/2018 8:23:52 AM Event ID: 4252 Task Category: None Level: Error Keywords:
User: SYSTEM Computer: SAASMANAGE01.idibill.com Description: Job {9BC6FBD9-BB3D-11E8-A818-005056A2F90F} : MIResult: 1 Error Message: The SendConfigurationApply function did not succeed. LCM failed to start desired state configuration manually. Message ID: MI RESULT 1 Error Category: 0 Error Code: 1 Error Type: MI Event Xml:

4252 0 2 0 0 0x4000000000000000 21006 Microsoft-Windows-DSC/Operational SAASMANAGE01.idibill.com {9BC6FBD9-BB3D-11E8-A818-005056A2F90F} 1 The SendConfigurationApply function did not succeed. LCM failed to start desired state configuration manually. MI RESULT 1 0 1 MI Log Name: Microsoft-Windows-DSC/Operational Source: Microsoft-Windows-DSC Date: 9/18/2018 8:23:32 AM Event ID: 4252 Task Category: None Level: Error Keywords: User: SYSTEM Computer: SAASMANAGE01.idibill.com Description: Job {9BC6FBD9-BB3D-11E8-A818-005056A2F90F} : MIResult: 1 Error Message: The PowerShell DSC resource '[Disk]FormatEDrive' with SourceInfo 'C:\VisualStudio\IDIBilling\IDI\_IDI\Infrastructure\DesiredStateConfiguration\Implementations\IDI\Configuration\Configuration.psm1::1500::13::Disk' threw one or more non-terminating errors while running the Test-TargetResource functionality. These errors are logged to the ETW channel called Microsoft-Windows-DSC/Operational. Refer to this channel for more details. Message ID: NonTerminatingErrorFromProvider Error Category: 7 Error Code: 1 Error Type: MI Event Xml: 4252 0 2 0 0 0x4000000000000000 20928 Microsoft-Windows-DSC/Operational SAASMANAGE01.idibill.com {9BC6FBD9-BB3D-11E8-A818-005056A2F90F} 1 The PowerShell DSC resource '[Disk]FormatEDrive' with SourceInfo 'C:\VisualStudio\IDIBilling\IDI\_IDI\Infrastructure\DesiredStateConfiguration\Implementations\IDI\Configuration\Configuration.psm1::1500::13::Disk' threw one or more non-terminating errors while running the Test-TargetResource functionality. These errors are logged to the ETW channel called Microsoft-Windows-DSC/Operational. Refer to this channel for more details. NonTerminatingErrorFromProvider 7 1 MI Log Name: Microsoft-Windows-DSC/Operational Source: Microsoft-Windows-DSC Date: 9/18/2018 8:23:32 AM Event ID: 4097 Task Category: Send configuration scenario for Local configuration manager Level: Error Keywords: User: SYSTEM Computer: SAASMANAGE01.idibill.com Description: Job {9BC6FBD9-BB3D-11E8-A818-005056A2F90F} : This event indicates that failure happens when LCM is processing the configuration. Error Id is 0x1. Error Detail is The SendConfigurationApply function did not succeed.. Resource Id is [Disk]FormatEDrive and Source Info is C:\VisualStudio\IDIBilling\IDI\_IDI\Infrastructure\DesiredStateConfiguration\Implementations\IDI\Configuration\Configuration.psm1::1500::13::Disk. Error Message is The PowerShell DSC resource '[Disk]FormatEDrive' with SourceInfo 'C:\VisualStudio\IDIBilling\IDI\_IDI\Infrastructure\DesiredStateConfiguration\Implementations\IDI\Configuration\Configuration.psm1::1500::13::Disk' threw one or more non-terminating errors while running the Test-TargetResource functionality. These errors are logged to the ETW channel called Microsoft-Windows-DSC/Operational. Refer to this channel for more details.. Event Xml: 4097 0 2 1 0 0x4000000000000000 20927 Microsoft-Windows-DSC/Operational SAASMANAGE01.idibill.com {9BC6FBD9-BB3D-11E8-A818-005056A2F90F} LCM 0x1 The SendConfigurationApply function did not succeed. [Disk]FormatEDrive C:\VisualStudio\IDIBilling\IDI\_IDI\Infrastructure\DesiredStateConfiguration\Implementations\IDI\Configuration\Configuration.psm1::1500::13::Disk The PowerShell DSC resource '[Disk]FormatEDrive' with SourceInfo 'C:\VisualStudio\IDIBilling\IDI\_IDI\Infrastructure\DesiredStateConfiguration\Implementations\IDI\Configuration\Configuration.psm1::1500::13::Disk' threw one or more non-terminating errors while running the Test-TargetResource functionality. These errors are logged to the ETW channel called Microsoft-Windows-DSC/Operational. Refer to this channel for more details. Log Name: Microsoft-Windows-DSC/Operational Source: Microsoft-Windows-DSC Date: 9/18/2018 8:23:31 AM Event ID: 4252 Task Category: None Level: Error Keywords: User: SYSTEM Computer: SAASMANAGE01.idibill.com Description: Job {9BC6FBD9-BB3D-11E8-A818-005056A2F90F} : MIResult: 1 Error Message: Failed Activity ID: {76fa075f-e886-45ec-bfcd-527039ba5542} Message ID: StorageWMI 4,Get-PartitionSupportedSize Error Category: 0 Error Code: 1 Error Type: MI Event Xml: 4252 0 2 0 0 0x4000000000000000 20920 Microsoft-Windows-DSC/Operational SAASMANAGE01.idibill.com {9BC6FBD9-BB3D-11E8-A818-005056A2F90F} 1 Failed Activity ID: {76fa075f-e886-45ec-bfcd-527039ba5542} StorageWMI 4,Get-PartitionSupportedSize 0 1 MI Log Name: Microsoft-Windows-DSC/Operational Source: Microsoft-Windows-DSC Date: 9/18/2018 8:23:31 AM Event ID: 4103 Task Category: None Level: Error Keywords: User: SYSTEM Computer: SAASMANAGE01.idibill.com Description: Job {9BC6FBD9-BB3D-11E8-A818-005056A2F90F} : This event indicates that a non-terminating error was thrown when DSCEngine was executing Test-TargetResource on MSFT_Disk DSC resource. FullyQualifiedErrorId is StorageWMI 4,Get-PartitionSupportedSize. Error Message is Failed Activity ID: {76fa075f-e886-45ec-bfcd-527039ba5542}. Event Xml: 4103 0 2 0 0 0x4000000000000000 20919 Microsoft-Windows-DSC/Operational SAASMANAGE01.idibill.com {9BC6FBD9-BB3D-11E8-A818-005056A2F90F} DSCEngine Test-TargetResource MSFT_Disk StorageWMI 4,Get-PartitionSupportedSize Failed Activity ID: {76fa075f-e886-45ec-bfcd-527039ba5542}
ronhowe commented 6 years ago

LOL

WOW

I found this article in Japanese and used Bing to translate it to English.

https://urabe8kaneyosi.wordpress.com/2018/01/25/get-partitionsupportedsize%E3%81%AE%E5%AE%9F%E8%A1%8C%E3%81%AB%E5%A4%B1%E6%95%97%E3%81%99%E3%82%8B/

In essence, the "Optimize drives" service (defragsvc) can not be set to Startup = Disabled as this server was. It has to be set to Manual (or "higher").

I did this and it worked.

Freaking crazy.

Time to add DSC to ensure this service is set correctly before implementing storage configurations.

johlju commented 6 years ago

@ronhowe That is some awesome detective work! Sounds like the documentation for the resource should be updated with this as a requirement. Not sure if it serves any purpose, but it could also be added as a regression test to the integration tests, to see that it fails. Maybe one day it doesn't, then the requirement could be removed. 🙂

PlagueHO commented 5 years ago

Wow! Good job detecting that one @ronhowe - that is some great information. @johlju - I'll submit a PR to add some documentation for this as it is definitely an obscure "gotcha"!