Azure / Enterprise-Scale

The Azure Landing Zones (Enterprise-Scale) architecture provides prescriptive guidance coupled with Azure best practices, and it follows design principles across the critical design areas for organizations to define their Azure architecture
https://aka.ms/alz
MIT License

[Policy]: Do not allow deletion of resource types #1514

Closed vegazbabz closed 2 weeks ago

vegazbabz commented 8 months ago

Policy Definition or Initiative

Definition

Built-in/Custom

Built-in

Built-in policy definition or initiative ID

78460a36-508a-49a4-b2b2-2f5ec564f4bb

Custom policy definition or initiative description

A policy that serves as an alternative to resource locks for critical infrastructure components. The Cloud Adoption Framework currently recommends: "Use resource locks to prevent accidental deletion of critical shared services." (https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/ready/landing-zone/design-area/management-platform#inventory-and-visibility-recommendations)

There are many known (and unknown) limitations to the use of resource locks; some are mentioned here: https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/lock-resources?tabs=json#considerations-before-applying-your-locks

For example, resource locks break Azure Backup of VMs with managed disks, and they break Sentinel, among other services.

Many of these resources are critical, e.g., Backups or Sentinel, and you do not want them to be deleted, whether on purpose or accidentally. Therefore, there should be a recommendation to use the DenyAction effect instead of resource locks. The policy should be assigned at different scopes: for example, at the Management MG to cover Sentinel and Backups, and at Landing Zones so that RGs do not inherit resource locks. It can also be combined with tags for more granularity.
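To illustrate the idea of combining DenyAction with tags, here is a minimal sketch of what such a policy rule could look like, built as a Python dictionary and printed as JSON. It is not the built-in definition referenced in this issue. The helper name `deny_delete_rule`, the tag name `critical`, and the two resource types are assumptions chosen for illustration; the `actionNames` and `cascadeBehaviors` details follow the documented shape of the DenyAction effect as I understand it.

```python
import json


def deny_delete_rule(resource_types, tag_name=None, tag_value=None):
    """Build a policyRule dict that blocks delete calls on the given resource types.

    Hypothetical helper: an optional tag condition narrows the rule to
    resources that carry a specific tag value.
    """
    conditions = [{"field": "type", "in": list(resource_types)}]
    if tag_name is not None:
        # Only protect resources carrying the (assumed) tag.
        conditions.append({"field": f"tags['{tag_name}']", "equals": tag_value})
    return {
        "if": {"allOf": conditions},
        "then": {
            "effect": "denyAction",
            "details": {
                # DenyAction blocks the listed request actions; "delete" is the one discussed here.
                "actionNames": ["delete"],
                # Also block deleting a resource group that contains a protected resource.
                "cascadeBehaviors": {"resourceGroup": "deny"},
            },
        },
    }


rule = deny_delete_rule(
    ["Microsoft.OperationalInsights/workspaces", "Microsoft.RecoveryServices/vaults"],
    tag_name="critical",   # assumed tag name
    tag_value="true",      # assumed tag value
)
print(json.dumps(rule, indent=2))
```

Assigning a rule like this at a management group would cover every matching resource below it, which is why the tag condition can help keep the blast radius small.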

It would be great if Microsoft could list more use cases where resource locks (CanNotDelete or ReadOnly) cannot be used. In those cases, DenyAction is an alternative.

Built-in policy: https://www.azadvertizer.net/azpolicyadvertizer/78460a36-508a-49a4-b2b2-2f5ec564f4bb.html

Scope

Multiple / Other

Default Assignment

Comments/thoughts

No response

Springstone commented 7 months ago

@vegazbabz many thanks for submitting this suggestion using our shiny new Policy Suggestion form! ;)

This is a great suggestion, and we're looking into how to reasonably implement this for ALZ. Most likely we will look at assigning this to the Platform management group, only for the resources we deploy as part of ALZ, as you have a valid point around protecting critical infrastructure. Of course, it's still not perfect: the policy team can create an exemption, after which the resource can still be deleted (just like you can work around resource locks). The intention of resource locks is to prevent accidental deletion using the "CanNotDelete" lock, but to your point "ReadOnly" can be troublesome.

We do provide a couple of policies in ALZ to "demo" the concept of DenyAction, specifically for ActivityLogs and DiagnosticLogs, but we don't assign those by default.

This isn't necessarily the right solution for all Azure services though. For example, for Backup/Recovery Vaults a much better solution is to use Multi-User Authorization (MUA) or Immutability, so we would use those mechanisms to protect those resources. Likewise, a number of critical infra resources (Key Vault, APIM, etc.) have their own protection mechanisms that at least allow easy recovery from accidental deletion.

That said, we're looking at applying this to our core infra as part of ALZ.

Many thanks again for your submission!

vegazbabz commented 6 months ago

Another use case, which I have also raised with the DfS PG, is to include a DenyAction policy for the malware protection capabilities, which are deployed as a Logic App or Event Grid.

Springstone commented 2 weeks ago

Hi @vegazbabz. We've circled back to review this issue and are using this policy to protect the user-assigned identity created for AMA. We currently won't be using the DenyAction policy for other resources due to the many assignment scopes/resources and the associated complexity, as you highlight in your original post, which also provides great guidance on where using this policy is desirable. Your request for more information on scenarios where resource locks interfere with the proper functioning of resources is a good one. For this, may I ask that you submit feedback at https://feedback.azure.com for better guidance on this topic?
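As a rough illustration of what protecting a single resource type with the built-in definition from this issue might look like, the sketch below builds the body of a policy assignment in Python. This is not necessarily how ALZ implements it. The parameter names `listOfResourceTypesDisallowedForDeletion` and `effect` are assumptions about the built-in definition's parameters and should be checked against the definition itself; the helper name and display name are hypothetical.

```python
import json

# Built-in definition ID quoted in this issue.
BUILT_IN_ID = "78460a36-508a-49a4-b2b2-2f5ec564f4bb"


def assignment_body(display_name, resource_types):
    """Build a policy assignment body (hypothetical helper).

    The assignment scope is not part of the body; it is chosen when the
    assignment is created (e.g. a management group or resource group).
    """
    return {
        "properties": {
            "displayName": display_name,
            "policyDefinitionId": (
                f"/providers/Microsoft.Authorization/policyDefinitions/{BUILT_IN_ID}"
            ),
            "parameters": {
                # Assumed parameter names; verify against the built-in definition.
                "listOfResourceTypesDisallowedForDeletion": {"value": list(resource_types)},
                "effect": {"value": "DenyAction"},
            },
        }
    }


body = assignment_body(
    "Do not allow deletion of the AMA user-assigned identity",  # hypothetical name
    ["Microsoft.ManagedIdentity/userAssignedIdentities"],
)
print(json.dumps(body, indent=2))
```

Keeping the resource type list this short is one way to limit the assignment complexity mentioned above.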

Springstone commented 2 weeks ago

Closing, as no further action is required from the ALZ team.