hashicorp / terraform-provider-azurerm

Terraform provider for Azure Resource Manager
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs
Mozilla Public License 2.0

azurerm_role_assignment - role_definition_id read wrong format for custom role definition defined at tenant level #13993

Open jhauray opened 2 years ago

jhauray commented 2 years ago

Community Note

Terraform (and AzureRM Provider) Version

Affected Resource(s)

Terraform Configuration Files

data "azurerm_role_definition" "custom_role_definition_mg_level" {
  name = "my_custom_role"
}

data "azuread_group" "user_group" {
  display_name     = "my_admin_group"
  security_enabled = true
}

resource "azurerm_resource_group" "my_rg" {
  name     = "rg-test-role_assignment"
  location = "francecentral"
}

resource "azurerm_role_assignment" "my_users_rbac" {
  scope              = azurerm_resource_group.my_rg.id
  role_definition_id = data.azurerm_role_definition.custom_role_definition_mg_level.id
  principal_id       = data.azuread_group.user_group.object_id
}

Debug Output

Panic Output

Expected Behaviour

When the configuration is applied, the resources must be created without error. After the apply, running a plan must not detect any changes.

During plan, the role_definition_id property of azurerm_role_assignment must be read in the right format, as described in the Microsoft documentation (https://docs.microsoft.com/en-us/rest/api/authorization/role-definitions/get-by-id), especially for tenant-level role definitions:

The fully qualified role definition ID. Use the format, /subscriptions/{guid}/providers/Microsoft.Authorization/roleDefinitions/{roleDefinitionId} for subscription level role definitions, or /providers/Microsoft.Authorization/roleDefinitions/{roleDefinitionId} for tenant level role definitions.

Actual Behaviour

When applying the configuration, the resources are created without errors, including the role assignment. But immediately after that, when I run a plan, Terraform detects a change on the azurerm_role_assignment resource and wants to replace it:

Terraform will perform the following actions:

  # azurerm_role_assignment.my_users_rbac must be replaced
-/+ resource "azurerm_role_assignment" "my_users_rbac" {
      ~ id                               = "/subscriptions/xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/rg-test-role_assignment/providers/Microsoft.Authorization/roleAssignments/f830d05e-c90a-7a50-74d3-04c14d1da67a" -> (known after apply)
      ~ name                             = "f830d05e-c90a-7a50-74d3-04c14d1da67a" -> (known after apply)
        principal_id                     = "1ffb152e-4587-4492-a269-32273f8905a4"
      ~ principal_type                   = "Group" -> (known after apply)
      ~ role_definition_id               = "/subscriptions/xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx/providers/Microsoft.Authorization/roleDefinitions/5ad69189-8f04-4325-89cb-0d38250c7611" -> "/providers/Microsoft.Authorization/roleDefinitions/5ad69189-8f04-4325-89cb-0d38250c7611" # forces replacement
      ~ role_definition_name             = "my_custom_role" -> (known after apply)
        scope                            = "/subscriptions/xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/rg-test-role_assignment"
      + skip_service_principal_aad_check = (known after apply)
    }

Plan: 1 to add, 0 to change, 1 to destroy.

We can see that role_definition_id is not read in the right format: it is always saved in the subscription-level format (/subscriptions/{guid}/providers/Microsoft.Authorization/roleDefinitions/{roleDefinitionId}), but in my case I use a tenant-level role definition (/providers/Microsoft.Authorization/roleDefinitions/{roleDefinitionId}).

As a result, Terraform recreates azurerm_role_assignment resources that use a tenant-level custom role definition on every run.

Steps to Reproduce

  1. terraform apply
  2. terraform plan
  3. See the plan, containing a "replace" in the output.

Important Factoids

References

aristosvo commented 2 years ago

Hi @jhauray! It would be amazing to fix this.

I would love to have a reproduction scenario with az cli or PowerShell to analyse where our implementation goes wrong. Is that possible? I want to verify that this is possible at all with the REST API we're using; it seems to be an upstream issue.

jhauray commented 2 years ago

Hi @aristosvo,

I ran some tests with az cli and discovered some strange things.

The response is not the same when I query the role definition with and without a scope:

user@Azure:~$ az role definition list --name 'my_custom_role'
[
  {
    "assignableScopes": [
      "/providers/Microsoft.Management/managementgroups/my_mg",
      "/subscriptions/xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx"
    ],
    "description": "My Custom Role",
    "id": "/subscriptions/xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx/providers/Microsoft.Authorization/roleDefinitions/5ad69189-8f04-4325-89cb-0d38250c7611",
    "name": "5ad69189-8f04-4325-89cb-0d38250c7611",
    "permissions": [
      {
        "actions": [
          "Microsoft.Web/hostingEnvironments/*/read",
          "Microsoft.Web/hostingEnvironments/Join/Action",
          "Microsoft.Web/hostingEnvironments/Write"
        ],
        "dataActions": [],
        "notActions": [],
        "notDataActions": []
      }
    ],
    "roleName": "my_custom_role",
    "roleType": "CustomRole",
    "type": "Microsoft.Authorization/roleDefinitions"
  }
]

user@Azure:~$ az role definition list --name 'my_custom_role' --scope '/providers/Microsoft.Management/managementgroups/my_mg'
[
  {
    "assignableScopes": [
      "/providers/Microsoft.Management/managementgroups/my_mg",
      "/subscriptions/xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx"
    ],
    "description": "Required for scaling ASP by example tenant level",
    "id": "/providers/Microsoft.Authorization/roleDefinitions/5ad69189-8f04-4325-89cb-0d38250c7611",
    "name": "5ad69189-8f04-4325-89cb-0d38250c7611",
    "permissions": [
      {
        "actions": [
          "Microsoft.Web/hostingEnvironments/*/read",
          "Microsoft.Web/hostingEnvironments/Join/Action",
          "Microsoft.Web/hostingEnvironments/Write"
        ],
        "dataActions": [],
        "notActions": [],
        "notDataActions": []
      }
    ],
    "roleName": "my_custom_role",
    "roleType": "CustomRole",
    "type": "Microsoft.Authorization/roleDefinitions"
  }
]

The returned IDs are different: /subscriptions/xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx/providers/Microsoft.Authorization/roleDefinitions/5ad69189-8f04-4325-89cb-0d38250c7611 vs /providers/Microsoft.Authorization/roleDefinitions/5ad69189-8f04-4325-89cb-0d38250c7611.

After that, I tried to create assignments and checked the IDs they return.

With only the roleDefinitionId GUID :

user@Azure:~$ az role assignment create --assignee '1ffb152e-4587-4492-a269-32273f8905a4' --role '5ad69189-8f04-4325-89cb-0d38250c7611' --scope '/subscriptions/xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/rg-test-role_assignment'
{
  "canDelegate": null,
  "condition": null,
  "conditionVersion": null,
  "description": null,
  "id": "/subscriptions/xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/rg-test-role_assignment/providers/Microsoft.Authorization/roleAssignments/b29efac3-5ce7-4d4c-b705-5872b928da6e",
  "name": "b29efac3-5ce7-4d4c-b705-5872b928da6e",
  "principalId": "1ffb152e-4587-4492-a269-32273f8905a4",
  "principalType": "Group",
  "resourceGroup": "rg-test-role_assignment",
  "roleDefinitionId": "/subscriptions/xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx/providers/Microsoft.Authorization/roleDefinitions/5ad69189-8f04-4325-89cb-0d38250c7611",
  "scope": "/subscriptions/xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/rg-test-role_assignment",
  "type": "Microsoft.Authorization/roleAssignments"
}

👍 it works.

With the fully qualified roleDefinitionId, in tenant level format :

user@Azure:~$ az role assignment create --assignee '1ffb152e-4587-4492-a269-32273f8905a4' --role '/providers/Microsoft.Authorization/roleDefinitions/5ad69189-8f04-4325-89cb-0d38250c7611' --scope '/subscriptions/xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/rg-test-role_assignment'
Role '/providers/Microsoft.Authorization/roleDefinitions/5ad69189-8f04-4325-89cb-0d38250c7611' doesn't exist.

👎 it failed.

With the fully qualified roleDefinitionId, in subscription-level format:

user@Azure:~$ az role assignment create --assignee '1ffb152e-4587-4492-a269-32273f8905a4' --role '/subscriptions/xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx/providers/Microsoft.Authorization/roleDefinitions/5ad69189-8f04-4325-89cb-0d38250c7611' --scope '/subscriptions/xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/rg-test-role_assignment'
{
  "canDelegate": null,
  "condition": null,
  "conditionVersion": null,
  "description": null,
  "id": "/subscriptions/xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/rg-test-role_assignment/providers/Microsoft.Authorization/roleAssignments/60f642fa-e39c-425b-a3e4-7b13b952159a",
  "name": "60f642fa-e39c-425b-a3e4-7b13b952159a",
  "principalId": "1ffb152e-4587-4492-a269-32273f8905a4",
  "principalType": "Group",
  "resourceGroup": "rg-test-role_assignment",
  "roleDefinitionId": "/subscriptions/xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx/providers/Microsoft.Authorization/roleDefinitions/5ad69189-8f04-4325-89cb-0d38250c7611",
  "scope": "/subscriptions/xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/rg-test-role_assignment",
  "type": "Microsoft.Authorization/roleAssignments"
}

👍 it works.

It seems that the roleDefinitionId must be scoped to the subscription in use.
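
The observation above can be sketched in Go: a tenant-level role definition ID only matches what the API returns for an assignment once it is prefixed with the subscription. This is a minimal sketch under the two ID formats quoted from the Microsoft documentation. The helper name qualifyRoleDefinitionID is hypothetical and is not part of the provider or of az cli, and the prefix check is case-sensitive for simplicity.

```go
package main

import (
	"fmt"
	"strings"
)

// tenantLevelPrefix is the start of a tenant-level role definition ID,
// in the format quoted from the Microsoft documentation in this issue.
const tenantLevelPrefix = "/providers/Microsoft.Authorization/roleDefinitions/"

// qualifyRoleDefinitionID is a hypothetical helper (not provider code).
// It prepends "/subscriptions/<subscriptionID>" to a tenant-level role
// definition ID and returns any other ID unchanged.
func qualifyRoleDefinitionID(roleDefinitionID, subscriptionID string) string {
	if strings.HasPrefix(roleDefinitionID, tenantLevelPrefix) {
		return "/subscriptions/" + subscriptionID + roleDefinitionID
	}
	return roleDefinitionID
}

func main() {
	// IDs taken from the az cli output in this comment.
	fmt.Println(qualifyRoleDefinitionID(
		"/providers/Microsoft.Authorization/roleDefinitions/5ad69189-8f04-4325-89cb-0d38250c7611",
		"xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx",
	))
}
```

The result has the same shape as the roleDefinitionId field returned by the successful az role assignment create calls.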

aristosvo commented 2 years ago

@jhauray Thanks for your effort, couldn't have done it better.

With this in mind, what would you suggest to take this forward? In my opinion there is only one real option: to simply ignore the diff in role_definition_id under certain conditions:

# semi-code
# "/subscriptions/xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx/providers/Microsoft.Authorization/roleDefinitions/5ad69189-8f04-4325-89cb-0d38250c7611" == "/subscriptions/xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx" + "/providers/Microsoft.Authorization/roleDefinitions/5ad69189-8f04-4325-89cb-0d38250c7611"
if roleDefinitionIdReturned == "/subscriptions/<subscriptionId>" + roleDefinitionId {
  # no diff!
}
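
The semi-code above could be spelled out roughly as follows in Go. This is only a sketch of the comparison, not the provider's implementation: roleDefinitionIDsEquivalent is a hypothetical name, and how such a check would be wired into the resource (for example through a diff-suppression hook on role_definition_id) is an assumption left out here. It uses only the standard library and compares case-insensitively.

```go
package main

import (
	"fmt"
	"strings"
)

// roleDefinitionIDsEquivalent is a hypothetical helper (not provider code).
// It reports whether the ID returned by the API equals the configured ID,
// either directly or after dropping a leading "/subscriptions/<id>" segment
// from the returned value.
func roleDefinitionIDsEquivalent(returned, configured string) bool {
	if strings.EqualFold(returned, configured) {
		return true
	}
	const subPrefix = "/subscriptions/"
	if len(returned) < len(subPrefix) ||
		!strings.EqualFold(returned[:len(subPrefix)], subPrefix) {
		return false
	}
	// rest starts with the subscription ID; cut at the next slash.
	rest := returned[len(subPrefix):]
	slash := strings.Index(rest, "/")
	if slash < 0 {
		return false
	}
	return strings.EqualFold(rest[slash:], configured)
}

func main() {
	// Values taken from the plan output in this issue.
	returned := "/subscriptions/xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx/providers/Microsoft.Authorization/roleDefinitions/5ad69189-8f04-4325-89cb-0d38250c7611"
	configured := "/providers/Microsoft.Authorization/roleDefinitions/5ad69189-8f04-4325-89cb-0d38250c7611"
	fmt.Println(roleDefinitionIDsEquivalent(returned, configured))
}
```

Two subscription-level IDs that belong to different subscriptions are still treated as different, so only the tenant-level versus subscription-level mismatch would be ignored.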

jhauray commented 2 years ago

Hi @aristosvo,

I understand how it works now: the role definition must be read at the scope where it will be used, even if it was created at a higher level.

Applying this, my sample code has to be updated like this:

resource "azurerm_resource_group" "my_rg" {
  name     = "rg-test-role_assignment"
  location = "francecentral"
}

data "azurerm_role_definition" "custom_role_definition_mg_level" {
  name  = "my_custom_role"
  scope = azurerm_resource_group.my_rg.id # definition read at the scope used by azurerm_role_assignment
}

data "azuread_group" "user_group" {
  display_name     = "my_admin_group"
  security_enabled = true
}

resource "azurerm_role_assignment" "my_users_rbac" {
  scope              = azurerm_resource_group.my_rg.id
  role_definition_id = data.azurerm_role_definition.custom_role_definition_mg_level.id
  principal_id       = data.azuread_group.user_group.object_id
}

And no "replace" is detected on subsequent terraform plan runs.

💡 I think the only thing needed is for the documentation to state that, when using a custom role, the scope at which the definition is read matters.

fotto commented 6 months ago

The solution suggested by @jhauray only works if the role definition is taken from a data source.

The issue that Terraform wants to recreate the role assignment on each run still persists if the role definition is part of the same project.

Hence I would still consider this a bug. (A possible workaround is shown at the end of this comment.)

To illustrate the problem I used the following terraform code:

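data "azurerm_client_config" "this" {
}

# Current subscription, referenced below as data.azurerm_subscription.this
data "azurerm_subscription" "this" {
}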

data "azurerm_management_group" "parent_mg" {
  name = "<mg group where this subscription belongs to>"
}

resource "azurerm_role_definition" "ra_bug" {
  name        = "test_role1"
  scope       = data.azurerm_subscription.this.id
  description = "definition at subscription level"

  permissions {
    actions     = [
      "Microsoft.Network/loadBalancers/read"
    ]
  }
  assignable_scopes = [
    data.azurerm_subscription.this.id
  ]
}

resource "azurerm_role_definition" "ra_bug2" {
  name        = "test_role2"
  scope       = data.azurerm_management_group.parent_mg.id
  description = "definition at management group level"

  permissions {
    actions     = [
      "Microsoft.Network/loadBalancers/read"
    ]
  }
  assignable_scopes = [
    data.azurerm_management_group.parent_mg.id, data.azurerm_subscription.this.id
  ]
}

resource "azurerm_resource_group" "ra_bug" {
  name = "rg_bug_test_rg"
  location = "westeurope"
}

resource "azurerm_role_assignment" "ra_bug" {
  scope                = azurerm_resource_group.ra_bug.id
  role_definition_id   = azurerm_role_definition.ra_bug.role_definition_resource_id
  principal_id         = data.azurerm_client_config.this.object_id
  description          = "test assignment of role defined within subscription"
}

resource "azurerm_role_assignment" "ra_bug2" {
  scope                = azurerm_resource_group.ra_bug.id
  role_definition_id   = azurerm_role_definition.ra_bug2.role_definition_resource_id
  principal_id         = data.azurerm_client_config.this.object_id
  description          = "test assignment of role defined at management group level"
}

The reason for this is:

Possible Workaround: modify the second role assignment (using the role definition from mg level) to make the role_definition_id match the expected value:

resource "azurerm_role_assignment" "ra_bug2" {
  scope                = azurerm_resource_group.ra_bug.id
  #role_definition_id   = azurerm_role_definition.ra_bug2.role_definition_resource_id
  # use constructed value here to prevent recreation:
  role_definition_id = "${data.azurerm_subscription.this.id}${azurerm_role_definition.ra_bug2.role_definition_resource_id}"
  principal_id         = data.azurerm_client_config.this.object_id
  description          = "test assignment of role defined at management group level"
}