Azure / ALZ-PowerShell-Module

The Azure Landing Zones Accelerators PowerShell module
https://www.powershellgallery.com/packages/ALZ/
MIT License

Unable to remove custom roles at tenant root. #143

Closed autocloudarc closed 2 weeks ago

autocloudarc commented 2 weeks ago

When I attempt to remove the custom roles staged at the tenant root after my destroy operation, I am unable to do so and receive no detailed information about the problem.

Expected Behavior

I should be able to remove custom roles deployed by the solution; otherwise, I will receive an error when I attempt to redeploy if the custom roles already exist.

Current Behavior

I receive a message in the notification area which states: "Failed to remove role definition(s) Failed to remove role with name Azure Landing Zones Subscription Owner (alz-mgmt) : An error occurred. Please try again later."

Possible Solution

Steps to Reproduce

  1. Navigate to the tenant root management group.
  2. Select IAM
  3. Select Roles tab
  4. Filter to view Custom roles
  5. From the ellipsis on the right of any displayed custom role, select Delete.

Alternatively, run this PowerShell script in the cloud shell:

$customRoles = @("Azure Landing Zones Management Group Contributor (alz-mgmt)","Azure Landing Zones Subscription Owner (alz-mgmt)","Azure Landing Zones Subscription Reader (alz-mgmt)")
$customRoles | ForEach-Object -Process {Get-AzRoleDefinition $_ | Remove-AzRoleDefinition -Force -Verbose }

Input file:


# Basic inputs
# The Infrastructure as Code (IaC) tool to use for the deployment. (e.g. 'terraform'). NOTE: Only 'terraform' is relevant here.
iac: "terraform"
# The bootstrap module to use for version control system to use for the deployment. (e.g. 'alz_github')
bootstrap: "alz_github"
# The starter module to use for the deployment. (e.g. 'complete')
starter: "complete"

# Bootstrap inputs
# The personal access token for GitHub:
github_personal_system_access_token: <redacted>
github_organization_name: "<redacted>arc"

# Controls whether to use a separate repository to store pipeline templates. This is an extra layer of security to ensure that the azure
# credentials can only be leveraged for the specified workload
use_separate_repository_for_templates: "true"
# Azure Subscription ID for the bootstrap resources (e.g. storage account, identities, etc). Leave empty to use the az login subscription
# (A valid subscription id GUID e.g. '12345678-1234-1234-1234-123456789012')
bootstrap_subscription_id: "<redacted>23c"
# Used to build up the default resource names (e.g. rg-<service_name>-mgmt-uksouth-001) (A valid Azure name with no hyphens and limited
# length e.g. 'abcd')
service_name: "alz"
# Used to build up the default resource names (e.g. rg-alz-<environment_name>-uksouth-001) (A valid Azure name with no hyphens and limited
# length e.g. 'abcd')
environment_name: "mgmt"
# Used to build up the default resource names (e.g. rg-alz-mgmt-uksouth-<postfix_number>) (A number e.g. '1234')
postfix_number: "1"
# Controls whether to use self-hosted agents for the pipelines
use_self_hosted_agents: "true"
# Personal access token for GitHub Runners to register themselves: alz-tfm-pat-02
github_runners_personal_access_token: <redacted>
# Controls whether to use private networking for the agent to storage account communication
use_private_networking: "true"
# Allow access to the storage account from the current IP address. We recommend this is kept off for security
allow_storage_access_from_my_ip: "true"
# Apply stage approvers to the action / pipeline, must be a list of SPNs separate by a comma (e.g. abcdef@microsoft.com,ghijklm@microsoft.com) using team "alz-mgmt-approvers"
apply_approvers: "<redacted>@outlook.com"
# Create branch policies for the main branch
create_branch_policies: "true"

# Shared interface inputs
# Azure Deployment location for the bootstrap resources (e.g. storage account, identities, etc)
# (An Azure deployment location e.g. 'uksouth')
bootstrap_location: "eastus2"
starter_location: "eastus2"
# The root parent management group display name. This will default to 'Tenant Root Group' if not supplied
root_parent_management_group_display_name: "Tenant Root Group"
# This is the id of the management group that the ALZ hierarchy will be nested under, will default to the Tenant Root Group
# (A valid Azure name e.g. 'my-azure-name')
root_parent_management_group_id: "<redacted>8f9"
# The identifier of the Identity Subscription. (e.g '00000000-0000-0000-0000-000000000000')
# (A valid subscription id GUID e.g. '12345678-1234-1234-1234-123456789012')
subscription_id_identity: "<redacted>310"
# The identifier of the Management Subscription. (e.g 00000000-0000-0000-0000-000000000000)
# (A valid subscription id GUID e.g. '12345678-1234-1234-1234-123456789012')
subscription_id_management: "<redacted>c5f"
# The identifier of the Connectivity Subscription. (e.g '00000000-0000-0000-0000-000000000000')
# (A valid subscription id GUID e.g. '12345678-1234-1234-1234-123456789012')
subscription_id_connectivity: "<redacted>8dc"

# Starter Module Specific Variables
# The location for Azure resources. (e.g 'uksouth')
# (An Azure deployment location e.g. 'uksouth')
default_location: "eastus2"
# The default postfix for Azure resources. (e.g 'landing-zone') #
# (A valid Azure name e.g. 'my-azure-name')
default_postfix: "landing-zone"
# The path of the configuration file
# (A valid yaml or json configuration file path e.g. './my-folder/my-config-file.yaml' or `c:\\my-folder\\my-config-file.yaml`)
configuration_file_path: ""

Context (Environment)

I am practicing the deployments and rollbacks to demonstrate the functionality for customers and during the Azure Landing Zone Deployment VBD engagements.

PS /home/system> $PSVersionTable

Name                           Value
----                           -----
PSVersion                      7.4.4
PSEdition                      Core
GitCommitId                    7.4.4
OS                             CBL-Mariner/Linux
Platform                       Unix
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0

I can't delete the custom roles deployed by this solution to clean up the tenant, which blocks subsequent deployments, customer demonstrations, and engagements.

Detailed Description

After running the "02 Azure Landing Zones Continuous Delivery" GitHub Actions workflow with the 'destroy' parameter to clean up my previous deployments, I am unable to remove the custom roles from the tenant root. The deployment region is eastus2, although I don't believe roles are region-specific resources. See screenshot below:

[Screenshot: unable-to-delete-custom-roles-at-tenant-root]

richardf5 commented 2 weeks ago

I came across this one too.

You can't remove custom roles until you've removed all role assignments that reference them. Unfortunately, some are against individual subscriptions. The easiest way at the moment (until the guys look at this) is to remove all assignments, then delete the custom roles from the root management group.
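To find those assignments, something like the following might help. It is only a rough sketch: it assumes the Az PowerShell module in Cloud Shell, reuses the role names from the script earlier in this issue, and only lists assignments without removing anything.

```powershell
# Role names copied from the script earlier in this issue.
$customRoles = @(
    "Azure Landing Zones Management Group Contributor (alz-mgmt)",
    "Azure Landing Zones Subscription Owner (alz-mgmt)",
    "Azure Landing Zones Subscription Reader (alz-mgmt)"
)

# Get-AzRoleAssignment works against the currently selected subscription by default,
# so switch context to each subscription the signed-in account can see.
$assignments = foreach ($sub in Get-AzSubscription) {
    Set-AzContext -Subscription $sub.Id | Out-Null
    foreach ($roleName in $customRoles) {
        Get-AzRoleAssignment -RoleDefinitionName $roleName
    }
}

# The same assignment can show up under more than one subscription
# (for example when it is inherited from a management group), so de-duplicate by ID.
$assignments |
    Sort-Object -Property RoleAssignmentId -Unique |
    Format-Table RoleDefinitionName, Scope, DisplayName, ObjectType -AutoSize
```

The Scope column shows where each assignment lives, which is the place it has to be removed from.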

[Screenshot]

autocloudarc commented 2 weeks ago

Thanks for your response, @richardf5. My results are inconsistent. After removing these role assignments, I did get a couple deleted, but others for which I removed role assignments are unfortunately still undeletable.

autocloudarc commented 2 weeks ago

Update: Upon further investigation, it turns out that only the outlined custom roles in the image below are undeletable. :-0 These also don't have any role assignments. The only workaround I have right now is to rename these roles with simple numeric or alphabetic names like 2, 3 and b, d, unless some hero can save us before we (really, is it only me?) hit the tenant limit of 5,000 custom roles. Ouch!

[Screenshot]

jtracey93 commented 2 weeks ago

Hey @autocloudarc,

Have you ensured these roles are not assigned to anything at any scope?

By default, each role is assigned at least once on the subscription (bootstrap_subscription_id) you specified to host the runners for CI/CD. You should see assignments for each of the plan and apply user-assigned managed identities on that subscription. Please delete these role assignments.

You will also have them on the root parent management group (root_parent_management_group_id) you specified; if you didn't specify one, it will be the tenant root management group in your tenant. Please remove these as well.

Once you have removed the assignments you should then be able to delete the role definitions.
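As a minimal sketch of those steps (assuming the Az PowerShell module, the role names from the script at the top of this issue, and placeholder values that you would replace with the bootstrap_subscription_id and root_parent_management_group_id from your own input file):

```powershell
# Placeholders (assumptions): substitute the values from your own input file.
$bootstrapSubscriptionId = "<bootstrap_subscription_id>"
$rootManagementGroupId   = "<root_parent_management_group_id>"

$scopes = @(
    "/subscriptions/$bootstrapSubscriptionId",
    "/providers/Microsoft.Management/managementGroups/$rootManagementGroupId"
)

$customRoles = @(
    "Azure Landing Zones Management Group Contributor (alz-mgmt)",
    "Azure Landing Zones Subscription Owner (alz-mgmt)",
    "Azure Landing Zones Subscription Reader (alz-mgmt)"
)

foreach ($roleName in $customRoles) {
    foreach ($scope in $scopes) {
        # Only remove assignments made directly at this scope, not inherited ones.
        Get-AzRoleAssignment -Scope $scope -RoleDefinitionName $roleName |
            Where-Object { $_.Scope -eq $scope } |
            Remove-AzRoleAssignment -Verbose
    }

    # With the assignments gone, try to delete the definition itself.
    Get-AzRoleDefinition -Name $roleName | Remove-AzRoleDefinition -Force -Verbose
}
```

If a role is still assigned at some other scope, the final delete can still fail, so it is worth checking the remaining subscriptions and management groups too.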

If you still have no luck, please try deleting the role definitions with either the -Debug or -Verbose switch parameter and share the output (redacting any sensitive information) with us here.

Thanks

Jack

autocloudarc commented 2 weeks ago

Hi @jtracey93 ,

Great suggestion! Yes, after digging into the hierarchy recursively, finding and removing other role assignments, I was able to delete all of them. Thank you so much! You truly are a hero after all! :-)

It really would be convenient, though, if there were a way to clean up these role assignments after ALZ deployments for those of us who are practicing, learning, or working in PoC or hackathon situations, where we expect to deploy the ALZ repeatedly and iteratively to the same tenant, but under slightly different conditions and with various solutions (Azure DevOps, GitHub Public, GitHub Enterprise, with self-hosted runners, with GitHub runners, etc.). Maybe someone can come up with a click-button Azure Function for these post-deployment cleanup operations? Anyway, thanks again @jtracey93 and team for all your hard work and community support.

jaredfholgate commented 1 week ago

@autocloudarc Running Deploy-Accelerator with the -destroy parameter will remove these role assignments and definitions. Run this after you have run the destroy CD pipeline to clean everything up.

You'll need to target the same output folder, as it requires the Terraform state file generated by the initial deployment.
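For illustration only, the call might look like the sketch below. The -destroy switch is the one mentioned above; the -inputs and -output parameter names and both paths are assumptions here, so use whatever you passed to Deploy-Accelerator for the initial deployment.

```powershell
# Hypothetical paths (assumptions): use the same ones as the initial deployment.
$inputsFile   = "~/accelerator/config/inputs.yaml"
$outputFolder = "~/accelerator/output"

# The output folder has to be the one from the first run, because it holds the Terraform state.
if (-not (Test-Path -Path $outputFolder)) {
    throw "Output folder '$outputFolder' not found. Point this at the folder used for the initial deployment."
}

# Run this after the destroy CD pipeline has finished.
Deploy-Accelerator -inputs $inputsFile -output $outputFolder -destroy
```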