Azure / deployment-stacks

Contains Deployment Stacks CLI scripts and releases
MIT License

DeploymentStackDeleteResourcesFailed when role assignment is deleted outside deployment stack #194

Open rsegers opened 2 weeks ago

rsegers commented 2 weeks ago

Describe the bug

When a role assignment managed by a deployment stack is first deleted outside the stack (e.g. by hand) and is then removed through the stack, the deployment stack fails with state DeploymentStackDeleteResourcesFailed after ~3 hours.

I've been able to reproduce this with deployments on management groups and subscriptions. I haven't tried on the resource group level, or with other resources.

It is not listed as a known issue/limitation here: https://learn.microsoft.com/en-us/azure/azure-resource-manager/bicep/deployment-stacks?tabs=azure-powershell#known-limitations

There is no existing issue for this in this GitHub repository.

Full error message:

{
  "code": "DeploymentStackDeleteResourcesFailed",
  "message": "One or more resources could not be deleted. Correlation id: 'd665b0cb-be6c-48a4-bd1b-2225899c9e89'.",
  "details": [
    {
      "code": "DeploymentStackDeleteResourcesFailed",
      "message": "An error occurred while deleting resources. These resources are still present in the stack but can be deleted manually. Please see the FailedResources property for specific error information. Deletion failures that are known limitations are documented here: https://aka.ms/DeploymentStacksKnownLimitations"
    }
  ]
}
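The error message points to the stack's FailedResources property. A minimal sketch for inspecting it with the Azure CLI follows. It assumes a subscription-scoped stack and assumes the CLI output exposes the property as `failedResources`; both helper names are hypothetical and not part of the original report.

```bash
# Hypothetical helper: print the failed resources recorded on a
# subscription-scoped deployment stack, given the stack name.
# Assumption: `az stack sub show` returns the property as `failedResources`.
show_failed_resources() {
  az stack sub show \
    --name "$1" \
    --query "failedResources" \
    --output json
}

# Hypothetical helper: read error text on stdin and print the correlation ID
# found in a "Correlation id: '...'" fragment, if any.
extract_correlation_id() {
  sed -n "s/.*Correlation id: '\([^']*\)'.*/\1/p"
}
```

For example, `show_failed_resources roletest` would query the stack created in this report, and piping the error JSON through `extract_correlation_id` yields the ID to share with support.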

To Reproduce

Steps to reproduce the behavior:

  1. Create a Bicep file with the following role assignment:

    targetScope = 'subscription'

    resource role1 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
      name: guid(subscription().id, 'acdd72a7-3385-48ef-bd42-f606fba81ae7', 'aa31fd6e-1e40-49b3-874e-8c648a3a25b4')
      properties: {
        principalId: 'aa31fd6e-1e40-49b3-874e-8c648a3a25b4'
        roleDefinitionId: subscriptionResourceId('Microsoft.Authorization/roleDefinitions', 'acdd72a7-3385-48ef-bd42-f606fba81ae7')
      }
    }

  2. Deploy the role assignment using a deployment stack:

```bash
az stack sub create --name roletest --action-on-unmanage deleteAll --template-file test.bicep --location westeurope --deny-settings-mode none
```

  3. Delete the role assignment using the Azure Portal (or PowerShell, or az cli, ...)
  4. Remove the role assignment from the Bicep file and deploy the deployment stack with the same command
  5. The Deployment Stack deployment is stuck in the state deletingResources for ~3 hours
  6. The deployment fails with the error DeploymentStackDeleteResourcesFailed
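The out-of-band deletion and the state check can also be done from the Azure CLI instead of the portal. The sketch below is an assumption about how one might script it, not the reporter's exact procedure: the function names and the `SUBSCRIPTION_ID` variable are hypothetical, while the principal ID and role definition ID are the ones from the Bicep file in this report.

```bash
# Hypothetical helper: delete the role assignment outside the stack.
# Assumption: SUBSCRIPTION_ID is set by the caller to the target subscription.
delete_role_assignment_out_of_band() {
  az role assignment delete \
    --assignee "aa31fd6e-1e40-49b3-874e-8c648a3a25b4" \
    --role "acdd72a7-3385-48ef-bd42-f606fba81ae7" \
    --scope "/subscriptions/${SUBSCRIPTION_ID}"
}

# Hypothetical helper: print the provisioning state of a
# subscription-scoped stack, given the stack name.
stack_state() {
  az stack sub show --name "$1" --query "provisioningState" --output tsv
}
```

After redeploying the stack with the role assignment removed from the template, calling `stack_state roletest` periodically should show whether the stack is still in deletingResources or has moved to a failed state.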

Expected behavior

The role assignment is removed from the deployment stack, because the resource has already been deleted outside the stack.


Repro Environment

Host OS: Windows 11 22631.4317
PowerShell Version:

{
  "azure-cli": "2.63.0",
  "azure-cli-core": "2.63.0",
  "azure-cli-telemetry": "1.1.0",
  "extensions": {}
}

Server Debugging Information

Correlation ID: d665b0cb-be6c-48a4-bd1b-2225899c9e89
Tenant ID: e435f79b-e4f4-4c0e-a193-deaf2abb2838
Timestamp of issue (please include time zone): 20241105T2048273248Z
Data Center (eg, West Central US, West Europe): West Europe


azcloudfarmer commented 2 weeks ago

Hi @rsegers - we are looking into this issue. Thank you for sharing the correlationID and details.

snarkywolverine commented 1 week ago

Hi @rsegers - Just wanted to let you know that we have identified the issue and are working on a fix. For a little background, the issue only occurs with some resource types, and if all resources have already been deleted.

As of now, we can't promise a release date for the fix as the holidays approach, but we will continue to keep this issue updated as the change rolls out.

Thanks again for your help reporting this!