hashicorp / terraform-provider-azuread

Terraform provider for Azure Active Directory
https://registry.terraform.io/providers/hashicorp/azuread/latest/docs
Mozilla Public License 2.0

azuread_group_role_management_policy race condition results in broken state #1466

Open Shop-kins opened 1 month ago

Shop-kins commented 1 month ago

Terraform (and AzureAD Provider) Version

Terraform Version: 1.9.5 Azuread Version: 2.53.1

Affected Resource(s)

Terraform Configuration Files

resource "azuread_group" "pim_group" {
  display_name     = "PIMGROUP_TEST"
  security_enabled = true
}

resource "azuread_privileged_access_group_assignment_schedule" "group" {
  for_each             = var.list_of_groups
  group_id             = azuread_group.pim_group.id
  principal_id         = "A GROUP PRINCIPAL"
  assignment_type      = "member"
  duration             = "P30D"
  justification        = "as requested"
  permanent_assignment = false
}

resource "azuread_privileged_access_group_assignment_schedule" "user" {
  for_each             = var.list_of_users
  group_id             = azuread_group.pim_group.id
  principal_id         = each.value # each.id is not a valid reference; use each.key or each.value
  assignment_type      = "member"
  duration             = "P30D"
  justification        = "as requested"
  permanent_assignment = false
}

resource "azuread_group_role_management_policy" "example" {
  group_id = azuread_group.pim_group.id
  role_id  = "member"

  eligible_assignment_rules {
    expiration_required = true
    expire_after        = "P365D"
  }

  activation_rules {
    maximum_duration      = "PT18H" # ISO 8601 durations need "T" before time components
    require_approval      = false
    require_justification = true

    # Emit the approval_stage block only when approvers are supplied
    dynamic "approval_stage" {
      for_each = length(var.map_of_approvers) > 0 ? [1] : []
      content {
        dynamic "primary_approver" {
          for_each = var.map_of_approvers
          content {
            object_id = primary_approver.key
            type      = primary_approver.value
          }
        }
      }
    }
  }
}
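
The configuration above references three input variables whose declarations are not shown. A minimal sketch of what they might look like follows. The variable names come from the configuration, but the types, defaults, and descriptions are assumptions: for_each needs a set or a map, and map_of_approvers is assumed to map an approver's object ID to its approver type.

```hcl
# Assumed declarations: names are taken from the configuration above,
# types and defaults are guesses and may differ from the real module.

variable "list_of_groups" {
  description = "Keys for the group assignment schedules (assumed to be a set of strings)"
  type        = set(string)
  default     = []
}

variable "list_of_users" {
  description = "Principal object IDs of the users to assign (assumed to be a set of strings)"
  type        = set(string)
  default     = []
}

variable "map_of_approvers" {
  description = "Approver object ID => approver type (assumed shape)"
  type        = map(string)
  default     = {}
}
```

With set(string) inputs, each.key and each.value are identical inside a for_each, so either can be used for principal_id.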

Debug Output

I will try to get one, but both times it has happened I was running without debug output enabled.

Expected Behavior

azuread_group_role_management_policy is either created successfully, or it errors without storing the ID of a non-existent remote object.

Actual Behavior

During its create process (HERE), azuread_group_role_management_policy fetches the existing role and stores its ID.

However, it then fails when attempting to retrieve the role by that ID. This is because the role ID changes when the role is modified (as noted in the comment on line 925), and the modification is made by the first instance to be applied from either of the two sets of azuread_privileged_access_group_assignment_schedule.

If the modification occurs between azuread_group_role_management_policy retrieving the member role ID and issuing the direct GET request, the resource saves the broken ID, and manual intervention is then required to correct the state file.

Error on initial failed apply

{"@level":"error","@message":"Error: Could not retrieve existing policy, RoleDefinitionsClient.BaseClient.Get(): unexpected status 404 with OData error: RoleSettingNotFound: The role setting is not found.","@module":"terraform.ui","@timestamp":"2024-09-03T15:43:27.3035522","diagnostic":{"severity":"error","summary":"Could not retrieve existing policy, RoleDefinitionsClient.BaseClient.Get(): unexpected status 404 with OData error: RoleSettingNotFound: The role setting is not found.","detail":"Could not retrieve existing policy, RoleDefinitionsClient.BaseClient.Get(): unexpected status 404 with OData error: RoleSettingNotFound: The role setting is not found.","address":"module.project.module.pim-group[0].azuread_group_role_management_policy.pim_group_policy_member","range":{"filename":".terraform/modules/project.pim-group/group.tf","start":{"line":27,"column":75,"byte":1000},"end":{"line":27,"column":76,"byte":1001}},"snippet":{"context":"resource \"azuread_group_role_management_policy\" \"pim_group_policy_member\"","code":"resource \"azuread_group_role_management_policy\" \"pim_group_policy_member\" {","start_line":27,"highlight_start_offset":74,"highlight_end_offset":75,"values":[]}},"type":"diagnostic"}

Error on subsequent plans of all types

Error: retrieving Role Management Policy Assignment ID: Group_4dca9131-92ca-49a9-94a7-53bc435fbd00_3f1cc92a-37f8-4839-9ca2-b9592ed159d0: RoleDefinitionsClient.BaseClient.Get(): unexpected status 404 with OData error: RoleSettingNotFound: The role setting is not found.

  with module.project.module.pim-group[0].azuread_group_role_management_policy.pim_group_policy_member,
  on .terraform/modules/project.pim-group/group.tf line 27, in resource "azuread_group_role_management_policy" "pim_group_policy_member":
  resource "azuread_group_role_management_policy" "pim_group_policy_member" {

retrieving Role Management Policy Assignment ID: Group_4dca9131-92ca-49a9-94a7-53bc435fbd00_3f1cc92a-37f8-4839-9ca2-b9592ed159d0: RoleDefinitionsClient.BaseClient.Get(): unexpected status 404 with OData error: RoleSettingNotFound: The role setting is not found.

Steps to Reproduce

Use a Terraform setup similar to the above and run terraform apply and terraform destroy repeatedly until the error occurs.

Important Factoids

Forcing one resource to always go first (be it azuread_group_role_management_policy or azuread_privileged_access_group_assignment_schedule) prevents the issue from arising. The issue also never negatively affects azuread_privileged_access_group_assignment_schedule, as that resource does not store the member_role_id directly.

References

Actual documentation on the Azure behaviour is difficult to find. My assumption is that the default role ID for member is immutable, and that an edit attempt creates a mutable copy.

A simple solution would be to not save the ID at that stage in the process, and instead to error out with an explanation of how this can occur, as well as updating the relevant documentation. Alternatively, a retry or a more sophisticated internal dependency system could be created. As a last resort, the provider could simply not store the ID and look it up each time.

Now that I am aware of how the problem arises, I already have a simple workaround (using Terraform's built-in depends_on). However, I'm more than happy to assist in resolving this within the provider, as I'd rather other people not have to go through debugging this issue!

kenchan0130 commented 1 month ago

The azuread_privileged_access_group_assignment_schedule cannot be created without azuread_group_role_management_policy, so you have to set a dependency.

resource "azuread_privileged_access_group_assignment_schedule" "group" {
  depends_on = [
    azuread_group_role_management_policy.example
  ]
  ...
}

resource "azuread_privileged_access_group_assignment_schedule" "user" {
  depends_on = [
    azuread_group_role_management_policy.example
  ]
  ...
}

Shop-kins commented 1 month ago

Interesting! While that certainly makes sense logically, it is not actually the case. I can create an azuread_privileged_access_group_assignment_schedule without an azuread_group_role_management_policy and the process completes successfully!

I've also done a test where I made the azuread_group_role_management_policy dependent on the azuread_privileged_access_group_assignment_schedule, which also works completely fine!
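
For reference, the reverse ordering described in that test can be sketched roughly as below. This assumes the resource names from the configuration in the original report (azuread_group.pim_group and the two assignment schedule resources); it only illustrates serialising the resources with depends_on and is not a fix for the underlying provider behaviour.

```hcl
# Sketch: create the policy only after both sets of assignment schedules exist,
# so the schedules cannot modify the role while the policy is being created.
resource "azuread_group_role_management_policy" "example" {
  group_id = azuread_group.pim_group.id
  role_id  = "member"

  # Resource names below are taken from the original configuration.
  depends_on = [
    azuread_privileged_access_group_assignment_schedule.group,
    azuread_privileged_access_group_assignment_schedule.user,
  ]

  eligible_assignment_rules {
    expiration_required = true
    expire_after        = "P365D"
  }
}
```

Either direction works for avoiding the race, as long as only one direction is declared: adding depends_on to both sides would create a dependency cycle that Terraform rejects.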