Azure / AzOps

AzOps is a PowerShell module which deploys (Push) ARM Resource Templates & Bicep files at all Azure scope levels and exports (Pull) ARM resource hierarchy.
https://aka.ms/AzOps
MIT License

Duplicate conflicting deployments performed when enabling DeployAllMultipleTemplateParameterFiles and ParallelDeployMultipleTemplateParameterFiles #886

Closed Xitric closed 3 months ago

Xitric commented 4 months ago

Describe the bug

In our AzOps repository, we have enabled all three of AllowMultipleTemplateParameterFiles, DeployAllMultipleTemplateParameterFiles and ParallelDeployMultipleTemplateParameterFiles. This works great the majority of the time, but sometimes it causes deployments to fail.

Firstly, we noticed that the validation pipeline would report the What-If results on our deployments twice:

WhatIf Results for main.json with main.instance01.parameters.json:
...

WhatIf Results for main.json with main.instance01.parameters.json:
...

In the pipeline output we can also see that it is processing a separate deployment for both the template and the parameter file:

[14:20:15][New-AzOpsDeployment] Processing deployment AzOps-main.instance01-B9EE for template /__w/1/s/root/mg (e3e8e9e9-ee3c-4b13-abd2-12efc9f08498)/sub (839f6ab6-4e09-4489-a2ff-b5ce962a600c)/main.json with parameter "/__w/1/s/root/mg (e3e8e9e9-ee3c-4b13-abd2-12efc9f08498)/sub (839f6ab6-4e09-4489-a2ff-b5ce962a600c)/main.instance01.parameters.json" in mode Incremental
[14:20:15][New-AzOpsDeployment] Processing deployment AzOps-main.instance01.parameters-B9EE for template /__w/1/s/root/mg (e3e8e9e9-ee3c-4b13-abd2-12efc9f08498)/sub (839f6ab6-4e09-4489-a2ff-b5ce962a600c)/main.json with parameter "/__w/1/s/root/mg (e3e8e9e9-ee3c-4b13-abd2-12efc9f08498)/sub (839f6ab6-4e09-4489-a2ff-b5ce962a600c)/main.instance01.parameters.json" in mode Incremental
Getting the latest status of all resources...Getting the latest status of all resources...

Sometimes this succeeds, and other times it fails with the error:

Get-AzDeploymentWhatIfResult: The process cannot access the file '/__w/1/s/root/mg (e3e8e9e9-ee3c-4b13-abd2-12efc9f08498)/sub (839f6ab6-4e09-4489-a2ff-b5ce962a600c)/main.instance01.parameters.json' because it is being used by another process.

In the case described above, only the Bicep template was modified and not the corresponding parameter file.

Importantly, it should be noted that so far, we have only observed this behaviour when we have a 1-1 relationship between Bicep templates and parameter files. We perform a large number of deployments where we have numerous parameter files per Bicep template, and in these cases we haven't observed duplicate deployments.

TL;DR: If we change a Bicep template that is associated with multiple parameter files, deployments are only processed for the parameter files and not the Bicep template itself. If we change a Bicep template that is associated with a single parameter file, deployments are processed for both the parameter file and the Bicep template itself, which duplicates the deployment.

Steps to reproduce

  1. Enable AllowMultipleTemplateParameterFiles, DeployAllMultipleTemplateParameterFiles and ParallelDeployMultipleTemplateParameterFiles
  2. Ensure that ThrottleLimit is at least 2
  3. Create a Bicep template along with a single associated parameter file
  4. Create a PR that modifies the Bicep template, and observe how the validation pipeline processes the same deployment twice, in parallel

Jefajers commented 4 months ago

Hi @Xitric, thanks for reporting this.

I initially had challenges trying to reproduce this result. However, I have now managed to consistently get the same outcome: "duplicate jobs/deployments".

I am, however, only able to reproduce this given a condition that the module does not expect, because it would indicate other unexpected environment conditions.

Let me explain what I had to do to get here with the settings mentioned in the issue:

  1. I used two files initially that existed on main branch:

    1. main.bicep
    2. main.x1.bicepparam
  2. Branched out, made a change to main.bicep, and created a PR. The validate pipeline operates as expected, with one suggested deployment involving the files main.bicep and main.x1.bicepparam.

  3. Since I was not able to reproduce the described behavior and outcome, I tried to introduce an unexpected situation.

    1. I made sure that a lingering conversion file from main.x1.bicepparam existed on the branch at pipeline runtime.
    2. My branch looked like this at runtime:
      1. main.bicep
      2. main.x1.bicepparam
      3. main.x1.parameters.json (this file should not exist at initial runtime; it should only be generated temporarily by the module and then discarded)
    3. Given all the conditions from step 3, I was able to get into the situation described in the issue consistently every time.
      1. This happens because, when the module initially examines the filesystem for associated template/parameter files, it finds both main.x1.bicepparam and main.x1.parameters.json and queues them both up for deployment. In the end the module overwrites the parameters.json file with the bicepparam, so it really is a duplicate deployment. Whatever is inside the lingering file is overwritten.
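
The discovery behaviour described in step 3 can be sketched roughly as follows. This is an illustration in Python, not the module's PowerShell code: the function names `queue_parameter_files` and `converted_target`, and the matching rule (same base name as the template, ending in `.bicepparam` or `.parameters.json`), are assumptions made for the example. It only shows why a lingering `main.x1.parameters.json` next to `main.x1.bicepparam` would yield two queued entries that resolve to the same converted file.

```python
from pathlib import PurePosixPath


def converted_target(param_file: str) -> str:
    """Return the .parameters.json name a parameter file ends up as.

    Assumption: a .bicepparam file is converted to a sibling
    <name>.parameters.json, as described in the steps above.
    """
    if param_file.endswith(".bicepparam"):
        return param_file[: -len(".bicepparam")] + ".parameters.json"
    return param_file


def queue_parameter_files(template: str, files: list[str]) -> list[str]:
    """Queue every parameter file that shares the template's base name (hypothetical rule)."""
    base = PurePosixPath(template).stem  # "main.bicep" -> "main"
    return [
        f for f in files
        if f.startswith(base + ".")
        and (f.endswith(".bicepparam") or f.endswith(".parameters.json"))
    ]


# Branch contents at runtime, including the lingering conversion file.
files = ["main.bicep", "main.x1.bicepparam", "main.x1.parameters.json"]
queued = queue_parameter_files("main.bicep", files)
targets = [converted_target(f) for f in queued]

print(queued)   # both parameter files are queued
print(targets)  # both entries point at the same converted file
```

Under these assumptions, two entries are queued even though only one distinct parameter set exists, which matches the duplicate described above.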

From what I am able to deduce at this time, it is not a bug. However, I am interested in finding out why this happens from time to time in your environment; it would be great if we could figure that out.

One thing I can think of as a small compensating measure, which would be simple to implement in the module and would overcome the scenario I tested every time, is to change how we construct the deployment job name.

What I discovered during my testing is that, due to a small difference in how a bicepparam file and a parameters file are treated during construction of the deployment name, the module perceives the overlapping jobs as different because of their unique deployment names. If we changed that, the module's current job logic would overcome the situation and prevent duplicate deployments even if this unexpected environment condition occurred.

What do you think, @Xitric? Should we go for the deployment name logic improvement?
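
To make the naming difference concrete, here is a minimal Python sketch (not the module's PowerShell implementation). Both functions are hypothetical: `name_from_last_extension` assumes the name is built by dropping only the final file extension, which would be consistent with the two names seen in the logs, and `normalized_name` is one possible way to make both file kinds produce the same name. The `B9EE` suffix is copied from the logs as a fixed placeholder; how it is actually derived is not covered here.

```python
def name_from_last_extension(param_file: str, suffix: str = "B9EE") -> str:
    """Hypothetical current rule: drop only the final extension."""
    stem = param_file.rsplit(".", 1)[0]
    return f"AzOps-{stem}-{suffix}"


def normalized_name(param_file: str, suffix: str = "B9EE") -> str:
    """Hypothetical fix: strip the whole parameter-file extension first."""
    for ext in (".parameters.json", ".bicepparam"):
        if param_file.endswith(ext):
            param_file = param_file[: -len(ext)]
            break
    return f"AzOps-{param_file}-{suffix}"


queued = ["main.x1.bicepparam", "main.x1.parameters.json"]

# Keying jobs by name: differing names mean the duplicate is not detected.
before = {name_from_last_extension(f) for f in queued}
after = {normalized_name(f) for f in queued}

print(sorted(before))  # two distinct names, so two jobs
print(sorted(after))   # one name, so the second job collapses into the first
```

With a normalized name, any job logic that keys on the deployment name would treat the two queued entries as the same job.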

Xitric commented 4 months ago

@Jefajers I will try to create a minimal reproducible example from the accelerator repo using our configurations, to hopefully let you reproduce this consistently. Perhaps we have yet another configuration that is causing this, which I have overlooked.

I can assure you that we do not have any lingering parameter conversion files in git.

Jefajers commented 4 months ago

Thanks, it would be very valuable to understand the given situation. I appreciate your assistance 🤓.

Xitric commented 4 months ago

@Jefajers I got a lot closer to what is causing this issue for us. Here is what I did:

  1. Enable
    • Core.AllowMultipleTemplateParameterFiles
    • Core.DeployAllMultipleTemplateParameterFiles
    • Core.ParallelDeployMultipleTemplateParameterFiles
  2. Ensure ThrottleLimit is at least 2
  3. Create the following files on main
    • a.westeurope.bicep
    • a.westeurope.x1.bicepparam
    • b.westeurope.bicep
  4. Create a PR that modifies a.westeurope.bicep and b.westeurope.bicep
  5. Observe that AzOps performs two deployments of a.westeurope.json in parallel:

    [Invoke-AzOpsPush] Deployment required
    [Invoke-AzOpsPush] Adding or modifying:
    [Invoke-AzOpsPush]   root/sub (839f6ab6-4e09-4489-a2ff-b5ce962a600c)/my-resource-group/a.westeurope.bicep
    [Invoke-AzOpsPush]   root/sub (839f6ab6-4e09-4489-a2ff-b5ce962a600c)/my-resource-group/b.westeurope.bicep
    [New-AzOpsDeployment] Processing deployment AzOps-a.westeurope.x1.parameters-B9EE for template /home/vsts/work/1/s/root/sub (839f6ab6-4e09-4489-a2ff-b5ce962a600c)/my-resource-group/a.westeurope.json with parameter "/home/vsts/work/1/s/root/sub (839f6ab6-4e09-4489-a2ff-b5ce962a600c)/my-resource-group/a.westeurope.x1.parameters.json" in mode Incremental
    Getting the latest status of all resources...[13:39:58][New-AzOpsDeployment] Processing deployment AzOps-a.westeurope.x1-B9EE for template /home/vsts/work/1/s/root/sub (839f6ab6-4e09-4489-a2ff-b5ce962a600c)/my-resource-group/a.westeurope.json with parameter "/home/vsts/work/1/s/root/sub (839f6ab6-4e09-4489-a2ff-b5ce962a600c)/my-resource-group/a.westeurope.x1.parameters.json" in mode Incremental
    Getting the latest status of all resources...
    
    New-AzResourceGroupDeployment: The process cannot access the file '/home/vsts/work/1/s/root/sub (839f6ab6-4e09-4489-a2ff-b5ce962a600c)/my-resource-group/a.westeurope.x1.parameters.json' because it is being used by another process.

If I disable Core.DeployAllMultipleTemplateParameterFiles, the issue ceases to occur.
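
For anyone checking their own pipeline output for the same symptom, the duplicate can be spotted mechanically. The following is a rough Python sketch, not part of AzOps: the regular expression is written against the log line format shown above, `find_duplicate_deployments` is a hypothetical helper name, and the paths in `sample_log` are shortened from the real output for readability.

```python
import re
from collections import defaultdict

# Matches the "Processing deployment ..." lines emitted by New-AzOpsDeployment.
LINE = re.compile(
    r'Processing deployment (?P<name>\S+) for template (?P<template>.+?) '
    r'with parameter "(?P<parameter>[^"]+)"'
)


def find_duplicate_deployments(log_text: str) -> dict:
    """Group deployment names by (template, parameter); keep pairs seen more than once."""
    groups = defaultdict(list)
    for match in LINE.finditer(log_text):
        groups[(match["template"], match["parameter"])].append(match["name"])
    return {pair: names for pair, names in groups.items() if len(names) > 1}


# Paths shortened for readability; the structure follows the log above.
sample_log = (
    '[New-AzOpsDeployment] Processing deployment AzOps-a.westeurope.x1.parameters-B9EE '
    'for template my-resource-group/a.westeurope.json '
    'with parameter "my-resource-group/a.westeurope.x1.parameters.json" in mode Incremental\n'
    'Getting the latest status of all resources...[13:39:58][New-AzOpsDeployment] '
    'Processing deployment AzOps-a.westeurope.x1-B9EE '
    'for template my-resource-group/a.westeurope.json '
    'with parameter "my-resource-group/a.westeurope.x1.parameters.json" in mode Incremental\n'
)

duplicates = find_duplicate_deployments(sample_log)
for (template, parameter), names in duplicates.items():
    print(template, parameter, names)
```

Grouping on the template and parameter pair rather than on the deployment name is deliberate, since the two names differ while the pair is identical.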

If I rename the files to something like this, the issue also ceases to occur and only one deployment is performed:

So it has something to do with how we name our templates and parameter files, and how AzOps interprets that.

Creating a file b.westeurope.x1.bicepparam does not solve the issue, so it is not caused by having a lingering template with no parameter file.

Jefajers commented 4 months ago

Hi again @Xitric, thanks. Given the above scenario, I am now able to reproduce the result without any unexpected manipulation :relieved:.

I have an idea of how to combine the previous suggestion of streamlining the deployment name with further validations, so that we get the expected number of deployments. Even more importantly, I think we might have an additional issue where template pairs that were not even changed in the PR could be queued up for deployment, which is never expected to happen.

Will update this issue with associated PR once I have a more complete suggestion of a fix :ambulance:.