DeploymentStackDeleteResourcesFailed for CosmosDB RBAC role assignments

mike-vosskuhler commented 1 day ago

Describe the bug

We removed some CosmosDB RBAC role assignments from our template, but the deployment stack then failed with the "DeploymentStackDeleteResourcesFailed" error. When I manually inspected the role assignments for this particular CosmosDB instance, the role assignment seemed to no longer be present. Most likely it was already cleaned up in a previous deployment, since my team and I do not touch these configurations manually. We did have one deployment failure prior to this (due to a templating error on my part), so it could be that the role assignment was deleted then but never removed from the deployment stack's managed resources list.

Please note that CosmosDB has implemented their own RBAC, meaning that this is not a standard Entra ID RBAC role assignment. This is the ARM type of the role assignment: Microsoft.DocumentDB/databaseAccounts/sqlRoleAssignments

To Reproduce

Steps to reproduce the behavior:

1. Create a deployment stack that deploys a CosmosDB database and a CosmosDB role assignment
2. Delete the role assignment manually (to simulate the already-deleted resource that I see in my stack)
3. Delete the role assignment from the template and redeploy the deployment stack
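
For anyone trying to reproduce this, the three steps above could be scripted roughly as follows. This is only a sketch: the resource group, stack name, account name, role assignment ID, and template file names are hypothetical placeholders, and the `--action-on-unmanage deleteResources` and `--deny-settings-mode none` settings are assumptions, since the issue does not state which settings the stack was created with.

```python
def repro_commands(resource_group, stack_name, account_name,
                   role_assignment_id, template_with, template_without):
    """Build the Azure CLI invocations for the three repro steps.

    All argument values are placeholders supplied by the caller. The
    unmanage and deny settings below are assumptions, not taken from
    the original report.
    """
    def deploy(template_file):
        # Create or update the deployment stack from the given template.
        return [
            "az", "stack", "group", "create",
            "--name", stack_name,
            "--resource-group", resource_group,
            "--template-file", template_file,
            "--action-on-unmanage", "deleteResources",  # assumed setting
            "--deny-settings-mode", "none",             # assumed setting
            "--yes",
        ]

    # Step 2: delete the CosmosDB role assignment outside of the stack.
    manual_delete = [
        "az", "cosmosdb", "sql", "role", "assignment", "delete",
        "--account-name", account_name,
        "--resource-group", resource_group,
        "--role-assignment-id", role_assignment_id,
        "--yes",
    ]

    return [
        deploy(template_with),     # step 1: stack with the role assignment
        manual_delete,             # step 2: manual deletion
        deploy(template_without),  # step 3: redeploy without it
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` in order; the third command is the one expected to hit the error.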

Expected behavior

Ideally these role assignments would be deleted, or, if they no longer exist, the deployment stack would consider them already deleted and remove them from its managed resources list.
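
To make the expected behavior concrete, here is a minimal sketch of the idempotent cleanup I have in mind. It is not how deployment stacks are actually implemented; `StackState`, `ResourceNotFound`, `reconcile`, and the `delete_resource` callback are all hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


class ResourceNotFound(Exception):
    """Hypothetical stand-in for a 'not found' response from a resource provider."""


@dataclass
class StackState:
    # IDs of the resources the stack currently tracks as managed.
    managed: set[str] = field(default_factory=set)


def reconcile(state, template_ids, delete_resource):
    """Delete managed resources that are no longer in the template.

    A resource that is already gone counts as successfully deleted and
    is dropped from the managed list. Returns the IDs that failed to
    delete for any other reason; those stay in the managed list.
    """
    failed = []
    for resource_id in sorted(state.managed - set(template_ids)):
        try:
            delete_resource(resource_id)
        except ResourceNotFound:
            pass  # already deleted elsewhere: treat as success
        except Exception:
            failed.append(resource_id)
            continue
        state.managed.discard(resource_id)
    return failed
```

With this behavior, a role assignment that was removed during an earlier failed deployment would simply drop out of the managed resources list on the next deployment instead of failing it.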

I am not sure how we ended up with deleted role assignments while the stack was still tracking them. I suspect that it has something to do with failed deployments prior to the deployment where we see this behavior.

Screenshots NA

Repro Environment

Host OS: Ubuntu 22.04.5 LTS
Azure CLI Version: 2.66.0

Server Debugging Information

Correlation ID: 9e885c4b-d9a5-4c77-bb6b-23c82da9a6b1
Tenant ID:
Timestamp of issue (please include time zone): Dec 2, 2024, 8:41 PM CET
Data Center (eg, West Central US, West Europe): North Europe

Additional context NA

snarkywolverine commented 18 hours ago

This might be the same as issue #194, which has a fix ready but won't be released until after the holidays.

mike-vosskuhler commented 17 hours ago

@snarkywolverine Yeah, this indeed looks similar. Please do note that we are using the CosmosDB-specific role assignment resource type, Microsoft.DocumentDB/databaseAccounts/sqlRoleAssignments, which is documented here: https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/security/how-to-grant-data-plane-role-based-access.

Basically, the CosmosDB team has implemented their own RBAC. It still uses Entra ID for authentication, but the built-in roles and role assignments are managed through a separate API endpoint and separate Azure CLI commands. I can test whether the fix works for this resource type as well once it's released.

Azure / deployment-stacks

DeploymentStackDeleteResourcesFailed for CosmosDB RBAC role assignments #197