microsoft / Microsoft365DSC

Manages, configures, extracts and monitors Microsoft 365 tenant configurations
https://aka.ms/M365DSC
MIT License

Equivalent of terraform plan when running CI/CD deployment #4488

Open MattWhite-personal opened 7 months ago

MattWhite-personal commented 7 months ago

I'm revisiting M365 DSC after a while spent on other priorities, and getting my head back around the DevOps whitepaper from @ykuijs.

The logic runs to

Build

Deploy

Validate (optional recurring step)

This logic works to monitor for the drift away from known good but doesn't allow for verification of what will change when the deploy stage runs.

When I have been working with our IaC tooling (namely Terraform, over the past few months), our CI/CD pipeline runs terraform plan against the configured code and outlines which items will be created net new, changed from their current state, redeployed because the volume of change necessitates it, or destroyed because the resource should no longer exist.

The most valuable step in the lifecycle is the review, which allows the team to validate that only the resources they expect to change will be changed.

Reading https://learn.microsoft.com/en-us/powershell/dsc/resources/get-test-set?view=dsc-1.1 it appears that PowerShell DSC has a similar construct of Test and then Set, but I can't work out whether this would work against an M365 DSC config and its generated MOF file.

Ideal outcome - as part of a deployment of a compiled MOF file there is a validation stage where an approved admin can 👍 the changes knowing what resources will be modified

Is this something that the team are looking at, or that already exists in the product set more broadly?

andikrueger commented 7 months ago

Just to make sure I understand your question correctly:

You want to have an approval step before any modification, or a single modification, would happen?

Is this correct?

MattWhite-personal commented 7 months ago

Essentially yes.

Looking back at Terraform again, the default behaviour of terraform apply is to prompt for confirmation that you want to change the various resources. In CI/CD this would normally have the --auto-approve flag added, because the validation happened earlier in the release pipeline.

DSC seems to do something similar when it tests each resource, so that for efficiency it only changes what should be modified, but having a full list of "resources to change" would help validate that the changes will work as expected.

andikrueger commented 7 months ago

For this kind of scenario you need to run Test-DSCConfiguration. This will test your current configuration and return false if drift is detected. Additionally, you will have entries in the Windows event log. These entries hold the desired and current values for all resources with drift.

You can use this information either to approve the whole configuration (equivalent to running Start-DSCConfiguration) or, if you just want to apply changes to one resource, to run Invoke-DSCResource with your parameter set.

MattWhite-personal commented 7 months ago

Thanks. I've done some testing, and it looks like Test-DSCConfiguration doesn't provide enough output to see what needs to change when the configuration is applied.

It does show the resources that are not in a compliant state, but not what is missing.

I checked back through some really old deploy runs that I did and noticed that the deploy logic does the following:

Get-CurrentValues

Get-TargetValues

Test-Current vs Target

If Test-TargetResource returns False, then Set-TargetResource

2021-11-13T15:07:42.4695857Z VERBOSE: [vm-agent-pool]:                            [[SPOTenantSettings]TenantSettings] Current Values: 
2021-11-13T15:07:42.4793805Z ApplyAppEnforcedRestrictionsToAdHocRecipients=True; Credential=***; Ensure=Absent; 
2021-11-13T15:07:42.4800259Z FilePickerExternalImageSearchEnabled=True; HideDefaultThemes=False; IsSingleInstance=Yes; 
2021-11-13T15:07:42.4801347Z LegacyAuthProtocolsEnabled=True; MarkNewFilesSensitiveByDefault=AllowExternalSharing; MaxCompatibilityLevel=15; 
2021-11-13T15:07:42.4802586Z MinCompatibilityLevel=15; NotificationsInSharePointEnabled=True; OfficeClientADALDisabled=False; 
2021-11-13T15:07:42.4809457Z OwnerAnonymousNotification=True; PublicCdnAllowedFileTypes=CSS,EOT,GIF,ICO,JPEG,JPG,JS,MAP,PNG,SVG,TTF,WOFF; 
2021-11-13T15:07:42.4821818Z PublicCdnEnabled=False; SearchResolveExactEmailOrUPN=False; SignInAccelerationDomain=; 
2021-11-13T15:07:42.4829513Z UseFindPeopleInPeoplePicker=False; UsePersistentCookiesForExplorerView=False; UserVoiceForFeedbackEnabled=True; 
2021-11-13T15:07:42.4838333Z Verbose=True
2021-11-13T15:07:42.4855841Z VERBOSE: [vm-agent-pool]:                            [[SPOTenantSettings]TenantSettings] Target Values: 
2021-11-13T15:07:42.4869082Z ApplyAppEnforcedRestrictionsToAdHocRecipients=True; Credential=***; FilePickerExternalImageSearchEnabled=True; 
2021-11-13T15:07:42.4875715Z HideDefaultThemes=False; IsSingleInstance=Yes; LegacyAuthProtocolsEnabled=True; 
2021-11-13T15:07:42.4888550Z MarkNewFilesSensitiveByDefault=AllowExternalSharing; MaxCompatibilityLevel=15; MinCompatibilityLevel=15; 
2021-11-13T15:07:42.4895321Z NotificationsInSharePointEnabled=True; OfficeClientADALDisabled=False; OwnerAnonymousNotification=True; 
2021-11-13T15:07:42.4906687Z PublicCdnAllowedFileTypes=CSS,EOT,GIF,ICO,JPEG,JPG,JS,MAP,PNG,SVG,TTF,WOFF; PublicCdnEnabled=False; 
2021-11-13T15:07:42.4912660Z SearchResolveExactEmailOrUPN=False; SignInAccelerationDomain=; UseFindPeopleInPeoplePicker=False; 
2021-11-13T15:07:42.4920277Z UsePersistentCookiesForExplorerView=False; UserVoiceForFeedbackEnabled=True; Verbose=True
2021-11-13T15:07:42.9516210Z VERBOSE: [vm-agent-pool]:                            [[SPOTenantSettings]TenantSettings] Test-TargetResource returned 
2021-11-13T15:07:42.9524032Z True

I think the logic that would work is:

If Test-TargetResource returns False:

Diff the Current vs Target resource values

Store the gaps in the Current vs Target in a variable

When all resources are tested:

Write an output list of the Test-Resource == False entities and the changes that would be made

Essentially don't run the Set-TargetResource that's in the current Deploy logic and instead output the collected diff of Current VS target in the output and exit the pipeline
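The diff step described above could be sketched roughly as follows. This is only an illustration, not part of Microsoft365DSC: the helper names ConvertFrom-DscValueString and Compare-DscValues are hypothetical, and it assumes the values arrive in the "Key=Value; " form shown in the verbose log above (values that themselves contain ";" would need sturdier parsing).

```powershell
# Hypothetical helpers (not part of Microsoft365DSC): parse the "Key=Value; "
# strings seen in the verbose log and report the keys whose values differ.
function ConvertFrom-DscValueString {
    param([Parameter(Mandatory)][string]$Text)

    $values = [ordered]@{}
    foreach ($pair in ($Text -split ';')) {
        $pair = $pair.Trim()
        if (-not $pair) { continue }
        # Split on the first '=' only, so a value may be empty (e.g. "SignInAccelerationDomain=")
        $key, $value = $pair -split '=', 2
        $values[$key.Trim()] = "$value".Trim()
    }
    return $values
}

function Compare-DscValues {
    param(
        [Parameter(Mandatory)][System.Collections.IDictionary]$Current,
        [Parameter(Mandatory)][System.Collections.IDictionary]$Target
    )

    # Only keys present in the target are compared; extra keys that appear
    # solely in the current values are ignored in this sketch.
    foreach ($key in $Target.Keys) {
        if ("$($Current[$key])" -ne "$($Target[$key])") {
            [pscustomobject]@{
                Name         = $key
                CurrentValue = $Current[$key]
                DesiredValue = $Target[$key]
            }
        }
    }
}

$current = ConvertFrom-DscValueString 'HideDefaultThemes=False; MaxCompatibilityLevel=15; SignInAccelerationDomain='
$target  = ConvertFrom-DscValueString 'HideDefaultThemes=True; MaxCompatibilityLevel=15; SignInAccelerationDomain='
$gaps    = @(Compare-DscValues -Current $current -Target $target)
$gaps | Format-Table
```

Collecting the returned objects per resource, instead of calling Set-TargetResource, would give the list of gaps to print at the end of the run.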

andikrueger commented 7 months ago

Please review the event log or the verbose output. Both should contain this level of information, including which parameter is out of its desired state.

ykuijs commented 7 months ago

If you run Test-DscConfiguration with the Detailed parameter, instead of outputting True/False it will output a detailed overview of all resources and whether they are or aren't in the desired state. However, it does not provide insights into what value inside of the resource is causing the fact that the resource is not in the desired state. For that you can use the M365DSC event log, like Andi suggests.

The Test/Set method you are describing in your first post is the way DSC operates: When applying a new config (using Start-DscConfiguration), DSC will always run the Test-TargetResource function first which will determine if the specific resource is in the desired state. When that is not the case, DSC will run the Set-TargetResource function to bring the resource into the desired state. It will never apply any configurations that are already in the desired state.

adhodgson1 commented 7 months ago

We have been working on different approaches for just this type of scenario for a while now. We have all come from a Terraform background, and for now we're exporting the configuration from the tenant and using the delta reports feature. This provides the data in HTML format (which we save as an artifact in Azure DevOps) and also in JSON format, on which we have started doing some very simple manipulation in the pipeline to get some quick output. This is OK for components where we only have a few resources, but is a bit unwieldy for components with large numbers of deployments. One thing I was potentially looking at was a function that runs the export command with a filter scoped to the resources in our configurations.

ricmestre commented 7 months ago

@adhodgson1 You can already achieve that running something similar to this

$Components = $(ConvertTo-DSCObject -Path $Path_To_Blueprint).ResourceName | Sort-Object -Unique
Export-M365DSCConfiguration -Components $Components ...

adhodgson1 commented 7 months ago

@ricmestre I think I got the terminology wrong here. We typically only have one set of components per configuration document. For example, I have a configuration document for the AAD groups we care about, a document for conditional access policies, etc. Taking the AAD groups as an example, an export of all groups in a tenant can take over an hour; I just want to export the groups that are listed in the configuration and identify the changes in those, using the filter option within the export cmdlet.

ricmestre commented 7 months ago

AADGroup supports filtering and so does the cmdlet Export-M365DSCConfiguration
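As a rough sketch of what such a scoped export might look like, the helper below builds an OData filter from a list of group display names. The helper name New-GroupNameFilter and the sample group names are hypothetical, and the commented-out call assumes that Export-M365DSCConfiguration accepts a -Filters hashtable keyed by resource name and that AADGroup filters on displayName; both are assumptions to check against the installed module version.

```powershell
# Hypothetical helper: build an OData filter that matches a list of group display names.
function New-GroupNameFilter {
    param([Parameter(Mandatory)][string[]]$DisplayName)

    $clauses = foreach ($name in $DisplayName) {
        # OData string literals escape a single quote by doubling it
        $escaped = $name.Replace("'", "''")
        "displayName eq '$escaped'"
    }
    return ($clauses -join ' or ')
}

# Sample names for illustration; in practice they could come from the
# configuration itself, e.g. via ConvertTo-DSCObject as in the snippet above.
$filter = New-GroupNameFilter -DisplayName 'Sales Team', "Finance's Group"
$filter

# Assumed usage (parameter name and filter syntax are assumptions, not verified here):
# Export-M365DSCConfiguration -Components @('AADGroup') -Filters @{ AADGroup = $filter } ...
```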

MatthewWhiteMoJ commented 7 months ago

@ykuijs - I tried using -Detailed as a parameter, but with an existing MOF I got a lot of "can't find a valid parameter set that accepts this parameter in this position" messages.

I think the Test/Set method that would be good to see is an option to just Test, and not Set without further approval.

MatthewWhiteMoJ commented 7 months ago

@adhodgson1 it would be great to see some logic around this, as it would help in getting to what we are changing from our live state (not just what the local MOF says it should be) and sharing this back in a human-readable format for "yes, I should approve this change into production".

If M365 DSC were adopted in a larger organisation and at scale, then the volume of change across different components and resources would need to be tracked, so that what ends up in a deployment pipeline is what is expected, given that other things may have changed between a build and a deploy.

I know that PowerShell DSC != Terraform, in the same way that neither is the same as Ansible or other configuration-as-code toolsets. What M365 DSC does really nicely is bring together the IaC / CaC capabilities that are more widely used in the digital product team space and relate them to Microsoft 365. There is a really good azuread provider for Terraform, but the rest of the M365 workloads don't have the same love, so M365 DSC somewhat stands out for those that want a more "as-code" approach to managing and applying configuration to multiple tenants.

All in, I really like the idea of M365 DSC, but getting some more validation before making a change is what this thread is all about.

MatthewWhiteMoJ commented 7 months ago

Just looked and need to test this but Start-DSCConfiguration has a -WhatIf parameter

Does this output what changes would be made and carry out the test-targetresource but without actually performing the set?

MattWhite-personal commented 7 months ago

I've done a load more digging on this, and the -WhatIf parameter doesn't work in Start-DscConfiguration.

The -Detailed parameter in Test-DscConfiguration does return the resources that are not in a good state, and the -Verbose flag on Test-DscConfiguration does write what is going on to the console, but I'm not sure if or how that can be parsed into a variable for comparison.

The data that goes into the M365DSC event log is actually the output that would be ideal to return at the end of Test-DscConfiguration:

<M365DSCEvent>
    <ConfigurationDrift Source="MSFT_EXOMailTips" TenantId="xxx.onmicrosoft.com">
        <ParametersNotInDesiredState>
            <Param Name="MailTipsLargeAudienceThreshold"><CurrentValue>25</CurrentValue><DesiredValue>26</DesiredValue></Param>
        </ParametersNotInDesiredState>
    </ConfigurationDrift>
    <DesiredValues>
        <Param Name ="Organization">xxx.onmicrosoft.com</Param>
        <Param Name ="MailTipsAllTipsEnabled">True</Param>
        <Param Name ="MailTipsGroupMetricsEnabled">True</Param>
        <Param Name ="MailTipsLargeAudienceThreshold">26</Param>
        <Param Name ="MailTipsMailboxSourcedTipsEnabled">True</Param>
        <Param Name ="MailTipsExternalRecipientsTipsEnabled">False</Param>
        <Param Name ="Ensure">Present</Param>
        <Param Name ="ApplicationId">48e3eb7b-44b8-4ab0-a0ff-82c8cd427131</Param>
        <Param Name ="TenantId">xxx.onmicrosoft.com</Param>
        <Param Name ="CertificateThumbprint">xxx</Param>
        <Param Name ="Verbose">True</Param>
    </DesiredValues>
    <CurrentValues>
        <Param Name ="CertificatePassword">$null</Param>
        <Param Name ="ApplicationId">48e3eb7b-44b8-4ab0-a0ff-82c8cd427131</Param>
        <Param Name ="Organization">xxx.onmicrosoft.com</Param>
        <Param Name ="MailTipsAllTipsEnabled">True</Param>
        <Param Name ="CertificateThumbprint">xxx</Param>
        <Param Name ="Credential">$null</Param>
        <Param Name ="Managedidentity">False</Param>
        <Param Name ="TenantId">xxx.onmicrosoft.com</Param>
        <Param Name ="Ensure">Present</Param>
        <Param Name ="CertificatePath">$null</Param>
        <Param Name ="MailTipsLargeAudienceThreshold">25</Param>
        <Param Name ="MailTipsMailboxSourcedTipsEnabled">True</Param>
        <Param Name ="MailTipsGroupMetricsEnabled">True</Param>
        <Param Name ="MailTipsExternalRecipientsTipsEnabled">False</Param>
    </CurrentValues>
</M365DSCEvent>

Ideally, an M365DSC cmdlet that performs the test and writes to screen what goes to the event log would be awesome. As a workaround, a script that calls Test-DscConfiguration, then gets the latest event log entry, parses the XML for ParametersNotInDesiredState and writes the output to screen would work.
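The XML-parsing half of that workaround could look something like the sketch below. The function name Get-M365DSCDriftFromMessage is hypothetical (it is not a cmdlet in the module), and the sample message is a trimmed copy of the event above standing in for the Message property of a real event log entry.

```powershell
# Hypothetical helper: extract the drifted parameters from one M365DSC event message.
function Get-M365DSCDriftFromMessage {
    param([Parameter(Mandatory)][string]$Message)

    $xml   = [xml]$Message
    $drift = $xml.M365DSCEvent.ConfigurationDrift
    foreach ($entry in $drift.ParametersNotInDesiredState.Param) {
        [pscustomobject]@{
            Source       = $drift.Source
            Name         = $entry.Name
            CurrentValue = $entry.CurrentValue
            DesiredValue = $entry.DesiredValue
        }
    }
}

# Trimmed copy of the event shown above, used here instead of a live event log entry
$sample = @'
<M365DSCEvent>
    <ConfigurationDrift Source="MSFT_EXOMailTips" TenantId="xxx.onmicrosoft.com">
        <ParametersNotInDesiredState>
            <Param Name="MailTipsLargeAudienceThreshold"><CurrentValue>25</CurrentValue><DesiredValue>26</DesiredValue></Param>
        </ParametersNotInDesiredState>
    </ConfigurationDrift>
</M365DSCEvent>
'@

$drift = @(Get-M365DSCDriftFromMessage -Message $sample)
$drift | Format-Table
```

Returning objects rather than formatted text keeps the result usable for later comparison or for writing a summary at the end of the pipeline stage.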

MattWhite-personal commented 7 months ago

I've done a bit more looking at this. The code below really isn't pretty, but as a bit of a hack across the various pieces of code it outputs a gap analysis for the resources that are not in the valid state.

# Write-Log is assumed to be defined elsewhere in the pipeline script
try
{
    Write-Log -Message "Running test deployment of MOF file for environment '$Environment'"
    $test = Test-DscConfiguration -Path $envPath -Verbose -Wait
}
catch
{
    Write-Log -Message 'MOF test failed!'
    Write-Log -Message "  Error occurred during test: $($_.Exception.Message)"
    # Stop here: without a test result there is nothing to compare
    exit 1
}

if ($test.InDesiredState) {
    Write-Log -Message ' '
    Write-Log -Message '*********************************************************'
    Write-Log -Message '*       No changes detected in current State file.      *'
    Write-Log -Message '*********************************************************'
    Write-Log -Message ' '
    exit 0
}
else {
    # Group the resources that are not in the desired state by resource type
    $notInDesiredState = $test.ResourcesNotInDesiredState.ResourceName | Group-Object
    foreach ($resource in $notInDesiredState) {
        # Concatenate the instance names that are not in the desired state for log output
        $configurations = ($test.ResourcesNotInDesiredState | Where-Object ResourceName -eq $resource.Name).InstanceName -join ', '
        Write-Log -Message "$($resource.Count) instances of $($resource.Name) not in Desired State: $configurations"
        # Get the newest n event log entries for this resource type to output the configuration drift
        $logs = Get-EventLog -LogName M365DSC -Source "MSFT_$($resource.Name)" -Newest $resource.Count
        foreach ($log in $logs) {
            # Convert the event log message to an XML document
            $xml = [xml]$log.Message
            # Write the parameters that are not in the desired state to the console
            $xml.M365DSCEvent.ConfigurationDrift.ParametersNotInDesiredState.Param | Format-Table
        }
    }
}

Running this against the sample configuration template I get the following output

[2024-03-27 20:22:41] - 1 instances of EXOMailTips not in Desired State: EXOMailTips

Name                           CurrentValue DesiredValue
----                           ------------ ------------
MailTipsLargeAudienceThreshold 26           25

[2024-03-27 20:22:41] - 1 instances of EXOOrganizationConfig not in Desired State: EXOOrganizationConfig

Name                               CurrentValue DesiredValue
----                               ------------ ------------
DefaultMinutesToReduceLongEventsBy 10           15
MailTipsLargeAudienceThreshold     26           25

[2024-03-27 20:22:42] - 2 instances of AADUser not in Desired State: ConfigureJohnSMith, ConfigureJohnSMith2

Name              CurrentValue DesiredValue
----              ------------ ------------
City                           Gatineau
Ensure            Absent       Present
FirstName                      John
LastName                       Smith2
UserPrincipalName              John.Smith2@xxx.onmicrosoft.com
UsageLocation                  US
Office                         Ottawa - Queen
Country                        Canada
DisplayName                    John L. Smith

Name              CurrentValue DesiredValue
----              ------------ ------------
City                           Gatineau
Ensure            Absent       Present
FirstName                      John
LastName                       Smith
UserPrincipalName              John.Smith@xxx.onmicrosoft.com
UsageLocation                  US
Office                         Ottawa - Queen
Country                        Canada
DisplayName                    John J. Smith

Observations:

  1. The sequencing of the resources in ResourcesNotInDesiredState and the event logs is not in sync (unsure whether they will always be in reverse order, but I am only testing at this stage).
  2. Because the log entries don't contain the ResourceId or InstanceName of the resource, it's not possible to filter the event logs and better output the content to the logs.
  3. There is a lot of post-processing (capture, format and output) that must be happening within the M365DSC module to write to the event log.
  4. Having a module function like Test-M365DSCConfiguration -Path $envPath could capture this in what is returned from the function, so the output can be formatted better without having that logic in the pipeline script.

@ykuijs - do you think this could be a backlog item to include in a future release? @adhodgson1 - does the above logic work more efficiently than your delta report and refactoring those outputs?

MattWhite-personal commented 7 months ago

trying to not make this terraform but having an output like:

[EXOMailTips]EXOMailTips will be updated
  MailTipsLargeAudienceThreshold:  26 --> 25

[EXOOrganizationConfig]EXOOrganizationConfig will be updated
  DefaultMinutesToReduceLongEventsBy: 10 --> 15
  MailTipsLargeAudienceThreshold:     26 --> 25

[AADUser]ConfigureJohnSMith will be created
  City: Gatineau
  FirstName: John
  LastName: Smith
  UserPrincipalName: John.Smith@xxx.onmicrosoft.com
  UsageLocation: US
  Office: Ottawa - Queen
  Country: Canada
  DisplayName: John J. Smith

[AADUser]ConfigureJohnSMith2 will be created
  City: Gatineau
  FirstName: John
  LastName: Smith2
  UserPrincipalName: John.Smith2@xxx.onmicrosoft.com
  UsageLocation: US
  Office: Ottawa - Queen
  Country: Canada
  DisplayName: John J. Smith

An output like this would be clean and easy to see the expected changes to make a call on whether to then run the deploy
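A formatter producing roughly that layout could be sketched as below. Format-DriftPlan is a hypothetical helper, not something in the module; it assumes the drift entries are objects with Name, CurrentValue and DesiredValue properties (as in the event log tables earlier in the thread) and treats an Ensure change to Present as a create and to Absent as a removal.

```powershell
# Hypothetical helper, not part of Microsoft365DSC: render drift entries as plan-style text.
function Format-DriftPlan {
    param(
        [Parameter(Mandatory)][string]$ResourceName,
        [Parameter(Mandatory)][string]$InstanceName,
        [Parameter(Mandatory)][object[]]$Drift
    )

    # An Ensure entry decides whether this is a create, a removal or an update
    $ensure = $Drift | Where-Object Name -eq 'Ensure' | Select-Object -First 1
    $action = 'updated'
    if ($ensure -and $ensure.DesiredValue -eq 'Present') { $action = 'created' }
    if ($ensure -and $ensure.DesiredValue -eq 'Absent')  { $action = 'removed' }

    "[$ResourceName]$InstanceName will be $action"
    foreach ($entry in ($Drift | Where-Object Name -ne 'Ensure')) {
        if ($action -eq 'created') {
            # Nothing exists yet, so only the desired value is shown
            "  $($entry.Name): $($entry.DesiredValue)"
        }
        else {
            "  $($entry.Name): $($entry.CurrentValue) --> $($entry.DesiredValue)"
        }
    }
}

$drift = @(
    [pscustomobject]@{ Name = 'MailTipsLargeAudienceThreshold'; CurrentValue = '26'; DesiredValue = '25' }
)
$plan = @(Format-DriftPlan -ResourceName 'EXOMailTips' -InstanceName 'EXOMailTips' -Drift $drift)
$plan
```

The sketch does not align the values in columns; that would be a cosmetic addition on top of the same data.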

adhodgson1 commented 7 months ago

@MatthewWhiteMoJ thanks for the above, I've been playing around with a similar script over the past few days and think it's heading in the right direction.

Message : Error retrieving data:

          { Duplicate AzureAD Groups named [] exist in tenant } \ at Get-TargetResource, C:\Program Files\WindowsPowerShell\Modules\Microsoft365DSC\1.24.403.1\DscResources\MSFT_AADGroup\MSFT_AADGroup.psm1: line 178 \ at Test-TargetResource, C:\Program Files\WindowsPowerShell\Modules\Microsoft365DSC\1.24.403.1\DscResources\MSFT_AADGroup\MSFT_AADGroup.psm1: line 956

          TenantId: []

I need to investigate those issues, especially as we aren't seeing those errors when we run Start-DSCConfiguration from within our Azure DevOps build.

I think there is a use case for developing a cmdlet within the module that can either get the relevant raw entries out of the event log, or run a test on the MOF and show a diff of the resources not in desired state to cover the use cases of people coming from a Terraform background.