Open knuterik-ballestad opened 5 months ago
Before upgrading to the latest azurerm+terraform runtime, terraform failed "gracefully", allowing us to re-run the github action with the terraform apply - and then the role assignments were re-created with the correct, updated constraints.
@knuterik-ballestad Presumably, the loop in your case covers all the original owners, including the principal that is running terraform. Also, assuming your workspace tracks the state of all these principals, when your change introduces a "replace", terraform will remove the role assignments prior to creating the new ones. That's why you saw the error.
If the above assumption holds, it looks like a "shoot yourself in the foot" case. My suggestion is to at least keep the principal that runs terraform out of sub_owners.
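One way to act on that suggestion is sketched below. It is only an illustration under assumptions: it treats sub_owners as a set of principal object IDs, and the variable names, the local name, and the resource label are hypothetical rather than taken from the reporter's configuration. It filters the identity Terraform is running as out of the set before creating the assignments.

```hcl
# Hypothetical input: object IDs of the principals that should be Owner on the subscription.
variable "sub_owners" {
  type = set(string)
}

# Hypothetical input: the bare subscription GUID.
variable "subscription_id" {
  type = string
}

# The identity Terraform is currently authenticated as.
data "azurerm_client_config" "current" {}

locals {
  # Remove the running principal so that a "replace" never deletes its own access.
  managed_owners = setsubtract(var.sub_owners, [data.azurerm_client_config.current.object_id])
}

resource "azurerm_role_assignment" "owner" {
  for_each = local.managed_owners

  scope                = "/subscriptions/${var.subscription_id}"
  role_definition_name = "Owner"
  principal_id         = each.value
}
```

With this shape, the running principal's own access has to come from somewhere else (for example an assignment at a higher scope), so it is not touched when the directly assigned owners are replaced.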
Well, the principal that runs terraform, and certain admin users are set as Owners in the management structure, and not directly on the subscription - though the subscription inherits these Owners of course.
Our script only assigns one Owner directly to the subscription - the requester of a subscription to be created. That is why terraform has trouble when updating the Owner's constraint - because instead of just adding a constraint the whole role assignment is:
So, if terraform could also check inherited Owners, and not only directly assigned ones, this would be solved.
We also have this issue. The admin users are inherited so it's not the last user in the group.
Hey @simone-bennett, can you elaborate about your setup?
@knuterik-ballestad If the role assignment of your current principal is made at the management group, how would the new role assignment (on the sub) cause that role to be removed?
Sure thing. I've set this up in our dev tenant so I can add some screenshots.
We use terraform Azure Landing Zones and Subscription Vending. When we vend a new subscription, owner rights are inherited from the root management group down to the new subscription.
In addition, we create a User Assigned Managed Identity for OIDC and grant it Owner on the subscription.
When we try to destroy the subscription, I get the authorization.RoleAssignmentsClient#Delete: Failure responding to request: StatusCode=412 -- Original Error: autorest/azure: Service returned an error. Status=412 Code="CannotDeleteLastRbacAdminAssignment" Message="Cannot delete the last RBAC admin assignment error.
It's trying to delete the User Assigned Managed identity.
This user assigned managed identity is the only directly assigned owner.
Also, it seems like the management group association is removed before the user assigned managed identity is deleted. Not sure if that is contributing. For example, this is a subscription that failed to destroy. It has been moved back to the tenant root, but the User Assigned Managed Identity could not be deleted.
If I go to the vended subscription and add an owner directly there, we are able to destroy that subscription using the pipeline with no issues.
We obviously don't want to assign users directly, and we should be able to use Entra groups at the management group level to grant access to subscriptions when they are created using IaC.
As @simone-bennett stated, the culprit lies in the azurerm_management_group_subscription_association:
"Also, it seems like the management group association is removed before the user assigned managed identity is deleted."
The create order is as follows:
    management group ---+
                        +--> management group subscription association
    subscription -------+
                        |
                        +----------------> role assignments
When it comes to delete, since there is no dependency between "role assignments" and "management group subscription association", they can happen concurrently. This makes the issue happen.
Ideally, there should be a dependency between the "management group subscription association" and the "role assignments", such that the role assignments are created after the association. With that, on deletion the "role assignments" will be deleted before the subscription is unassociated from the management group, i.e. the inherited roles are still in effect on the subscription, which then allows the deletion of the directly assigned roles in the sub.
I'm not an expert on ALZ, so I'm not sure whether the change above would break anything else, though...
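The ordering described above could be expressed with an explicit depends_on. The fragment below is a minimal sketch, not the ALZ or subscription vending module's actual code: the variable names and resource labels are hypothetical. Because Terraform destroys resources in reverse dependency order, making the role assignment depend on the association means the role assignment is removed first.

```hcl
# Hypothetical inputs, named for illustration only.
variable "management_group_id" {
  type = string
}

variable "subscription_id" {
  type = string # bare subscription GUID
}

variable "requester_object_id" {
  type = string # object ID of the directly assigned Owner
}

resource "azurerm_management_group_subscription_association" "this" {
  management_group_id = var.management_group_id
  subscription_id     = "/subscriptions/${var.subscription_id}"
}

resource "azurerm_role_assignment" "requester_owner" {
  scope                = "/subscriptions/${var.subscription_id}"
  role_definition_name = "Owner"
  principal_id         = var.requester_object_id

  # Create after the association; on destroy, this is deleted before the
  # subscription is unassociated, while inherited Owners still apply.
  depends_on = [azurerm_management_group_subscription_association.this]
}
```

Whether adding this dependency is safe inside an existing module is exactly the open question raised above; the sketch only shows the mechanism.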
I should add, this is new. We have been deploying subscription vending for 12 months and I haven't come across this before now.
@simone-bennett This looks like a race between (all of) the azurerm_role_assignment and the azurerm_management_group_subscription_association, which only occurs when the azurerm_management_group_subscription_association is deleted prior to the azurerm_role_assignment.
Another factor is that the azurerm_role_assignment used to have a caching issue: after it is deleted, it can sometimes still be queried.
The setup is as follows:
Then, when we modify the role filter list in our terraform vending machine (adding or removing a role that this directly assigned Owner is allowed to assign to others), terraform tries to completely remove+re-apply the ownership assignment. The "Remove owner" step then fails with an error message stating that the last owner of the subscription cannot be removed.
Also, this worked just fine up until recently, but after upgrading terraform, CAF version, terraform providers, ++ this error was introduced. We did not discover the bug during our testing of any of these upgrades since our test cases did not include modifying this Ownership "filter" list.
@knuterik-ballestad The condition has been a ForceNew attribute since at least 2021, so the "remove+re-apply the ownership assignment" behavior has been in place for a long time.
Per my test, as long as you have an owner on this subscription (whether directly assigned or inherited), you can remove the other owners.
E.g. I have the following role assignments in my sub:
I'm then able to delete the "magodotf" app role assignments:
So in your case given you have those 2 central admins, how do you get that error message?
Besides, I noticed that the condition does not actually need to be marked as ForceNew, as I find the Portal simply makes a PUT to do the update.
It seems more that it won't accept the inherited admins from the root/highest-level management group. You have to assign an admin directly on the subscription before deleting it, even if there are admins that it inherits.
@simone-bennett Per my test (as shown above), the inherited admin can remove the last directly assigned admin from a subscription. Would you try this in your case and share the failure?
Is there an existing issue for this?
Community Note
Terraform Version
AzureRM Provider Version
~> 3.104.2
Affected Resource(s)/Data Source(s)
azurerm
Terraform Configuration Files
Debug Output/Panic Output
Expected Behaviour
We expect terraform to either be able to modify the role assignment or fail gracefully.
Actual Behaviour
Terraform removes Owner from all ALZ-subscriptions, and THEN fails in a state that doesn't even let us re-run TF to apply the role assignments again.
Steps to Reproduce
This image shows in the Portal what we are trying to do (adding or removing "Constrain roles"), meaning which roles the Owner is allowed to assign to others.
Important Factoids
No response
References
No response