Azure / terraform-azurerm-avm-ptn-alz

Terraform module to deploy Azure Landing Zones
https://registry.terraform.io/modules/Azure/avm-ptn-alz/azurerm
MIT License
65 stars 15 forks source link

[AVM Module Issue]: Invalid for_each argument when using depends_on in module block #127

Open c-baumgartner opened 1 week ago

c-baumgartner commented 1 week ago

Check for previous/existing GitHub issues

Issue Type?

Bug

(Optional) Module Version

0.9.0-beta2

(Optional) Correlation Id

No response

Description

Calling ALZ module without depends_on works without an issue. But when setting a dependency to another module it will fail with the following error:

│ Error: Invalid for_each argument
│ 
│   on .terraform/modules/alz_architecture/main.management_groups.tf line 2, in resource "azapi_resource" "management_groups_level_0":
│    2:   for_each = local.management_groups_level_0
│     ├────────────────
│     │ local.management_groups_level_0 will be known only after apply
│ 
│ The "for_each" map includes keys derived from resource attributes that cannot be determined until apply, and so OpenTofu cannot determine the full set of keys that will identify the instances of this resource.
│ 
│ When working with unknown values in for_each, it's better to define the map keys statically in your configuration and place apply-time results only in the map values.
│ 
│ Alternatively, you could use the -target planning option to first apply only the resources that the for_each value depends on, and then apply a second time to fully converge.

We need to have one resource (LogAnalytics Workspace) and one resource group created upfront.

Module call that *is working:

module "alz_architecture" {
  source  = "Azure/avm-ptn-alz/azurerm"
  version = "0.9.0-beta2"

  architecture_name  = local.architecture_name
  parent_resource_id = data.azapi_client_config.current.tenant_id
  location           = local.location

  enable_telemetry             = local.enable_telemetry
  subscription_placement       = local.subscription_placement
  policy_assignments_to_modify = local.policy_assignments_to_modify
}

The following call will fail with the above error in the plan phase:

module "alz_architecture" {
  source  = "Azure/avm-ptn-alz/azurerm"
  version = "0.9.0-beta2"

  architecture_name  = local.architecture_name
  parent_resource_id = data.azapi_client_config.current.tenant_id
  location           = local.location

  enable_telemetry             = local.enable_telemetry
  subscription_placement       = local.subscription_placement
  policy_assignments_to_modify = local.policy_assignments_to_modify

  depends_on = [ 
    module.sentinel_resource_group_core,
  ]
}

This is the corresponding architecture definition:

  name: rt-infra
  management_groups:
    - archetypes:
        - root
      display_name: "RT"
      exists: false
      id: "rt"
      parent_id: null
    - archetypes:
        - gk-rt-0-0-1
      display_name: "Unstaged"
      exists: false
      id: "unstaged"
      parent_id: rt  
    - archetypes:
        - gk-rt-0-0-1
      display_name: "RT Core Services"
      exists: false
      id: "core-services"
      parent_id: rt
    - archetypes:
        - gk-rt-0-0-1
      display_name: "RT AppZones"
      exists: false
      id: "appzones"
      parent_id: rt

Addition

Commenting out the ALZ module for the first apply, and then commenting it in (including the depends_on) works.

The strange thing is again, there are no dynamic references to the outputs of the other modules. Only one hard coded resource id referencing to the log analytics workspace in the policy_assignments_to_modify

  policy_assignments_to_modify = {
    rt = {
      policy_assignments = {
        Deploy-MDFC-SqlAtp = {
          enforcement_mode = "DoNotEnforce"
        }
        Deploy-MDFC-OssDb = {
          enforcement_mode = "DoNotEnforce"
        }
        Audit-ResourceRGLocation = {
          enforcement_mode = "DoNotEnforce"
        }
        # Deploy-MDFC-Config-H224 = {
        #   parameters = {
        #     parameterName = jsonencode({ value = local.sentinal_log_analytics_workspace_module_config_by_name["SentinelCore"].name })
        #   }
        # }
        Deploy-Diag-Logs = {
          parameters = {
            logAnalytics = jsonencode({ value = "/subscriptions/${data.azapi_client_config.current.subscription_id}/resourcegroups/${local.sentinel_core_resource_group_name}/providers/Microsoft.OperationalInsights/workspaces/${local.sentinel_automation_account_name}" })
          }
        }
        Deploy-AzActivity-Log = {
          parameters = {
            logAnalytics = jsonencode({ value = "/subscriptions/${data.azapi_client_config.current.subscription_id}/resourcegroups/${local.sentinel_core_resource_group_name}/providers/Microsoft.OperationalInsights/workspaces/${local.sentinel_automation_account_name}" })
          }
        }
      }
   }
....

The two locals are just strings. I have also tried to make the azapi_client_config a pre-req to the ALZ by adding it in the depends_on list, too

matt-FFFFFF commented 6 days ago

Hi @c-baumgartner

This is a known issue with the provider at the moment. We are eagerly awaiting developments on deferred actions.

Until that time you can't use the module in the way you have done.

The module does not need to be dependent on the RG and sentinel deployment if you know the names of the resources you can feed them in as string literals.

The ordering is typically not strictly important as the policy assignment will take some time to be effective anyway.

It's not ideal, I realise, but this is where we are at this time.

RR

c-baumgartner commented 5 days ago

Hi @matt-FFFFFF,

thank you for your feedback. Good to know that there is something on the roadmap to fix this.

In regard to the ordering, this was also our plan to not have a dependency on the alz module. But unfortunately, when I construct the resource id and try to apply the code without creating the log analytics workspace first, I get these errors:

│ Error: Failed to create/update resource
│ 
│   with module.alz_architecture.azapi_resource.policy_role_assignments["e8a47a4b-c2a6-5cf9-a06d-be0a82864931"],
│   on .terraform/modules/alz_architecture/main.policy_role_assignments.tf line 1, in resource "azapi_resource" "policy_role_assignments":
│    1: resource "azapi_resource" "policy_role_assignments" {
│ 
│ creating/updating Resource: (ResourceId "/subscriptions/44a36e73-14b6-423f-9fb3-672ae9c6376d/resourceGroups/rg-GABcbag-SentinelCore/providers/Microsoft.OperationalInsights/workspaces/log-GABcbag-SentinelCore/providers/Microsoft.Authorization/roleAssignments/e8a47a4b-c2a6-5cf9-a06d-be0a82864931" / Api Version "2022-04-01"): PUT
│ https://management.azure.com/subscriptions/44a36e73-14b6-423f-9fb3-672ae9c6376d/resourceGroups/rg-GABcbag-SentinelCore/providers/Microsoft.OperationalInsights/workspaces/log-GABcbag-SentinelCore/providers/Microsoft.Authorization/roleAssignments/e8a47a4b-c2a6-5cf9-a06d-be0a82864931
│ --------------------------------------------------------------------------------
│ RESPONSE 404: 404 Not Found
│ ERROR CODE: ResourceNotFound
│ --------------------------------------------------------------------------------
│ {
│   "error": {
│     "code": "ResourceNotFound",
│     "message": "The Resource 'Microsoft.OperationalInsights/workspaces/log-GABcbag-SentinelCore' under resource group 'rg-GABcbag-SentinelCore' was not found. For more details please go to https://aka.ms/ARMResourceNotFoundFix"
│   }
│ }
│ --------------------------------------------------------------------------------
│ 
╵

This was the reason we tried to set the LAW up first. Another workaround would be to have some (sub)layering in our solution to deploy the LAW first and then the governance/alz module.

Do you have any other idea how to work around managed identity rbac assignment in this case?

matt-FFFFFF commented 4 days ago

SO when I deploy both the LAW gets created before the policy role assignments - this is not enforced but is typically the case because the MG and policy work takes longer.

I have done some experimentation with allow-defferal and this solves the issue, allowing breaking down into multiple plan/apply cycles.

matt-FFFFFF commented 4 days ago

RR

c-baumgartner commented 4 days ago

Hi Matt, very interesting observation that you don't run into this. I can repeatedly reproduce the issue. What I have tried was to set retries and ramp up the values to their maximum - without luck to get over the 404 ( big thanks to my colleague for having this idea with the retries ). The -allow-deferral sounds like a plan for the future but we have to wait until Tofu will ship this feature. So to make the code robust and reliable we will make some changes to the design by using a separate layer to deploy the LAW upfront (or use -target as a substitution for the -allow-deferral parameter in the meantime)

Just fyi the retries settings I have used:

  retries = {
    policy_role_assignments = {
      error_message_regex = [
        "^The Resource 'Microsoft.OperationalInsights/workspaces",
      ]
      interval_seconds     = 120
      max_interval_seconds = 300

    }
}
module.alz_architecture.azapi_resource.policy_role_assignments["d7a4caa9-2f5c-5990-a3be-3ea8ba390fe8"]: Still creating... [4m20s elapsed]
module.alz_architecture.azapi_resource.policy_role_assignments["d7a4caa9-2f5c-5990-a3be-3ea8ba390fe8"]: Still creating... [4m30s elapsed]
module.alz_architecture.azapi_resource.policy_role_assignments["d7a4caa9-2f5c-5990-a3be-3ea8ba390fe8"]: Still creating... [4m40s elapsed]
module.alz_architecture.azapi_resource.policy_role_assignments["d7a4caa9-2f5c-5990-a3be-3ea8ba390fe8"]: Still creating... [4m50s elapsed]
╷
│ Error: Failed to create/update resource
│ 
│   with module.alz_architecture.azapi_resource.policy_role_assignments["d7a4caa9-2f5c-5990-a3be-3ea8ba390fe8"],
│   on .terraform/modules/alz_architecture/main.policy_role_assignments.tf line 1, in resource "azapi_resource" "policy_role_assignments":
│    1: resource "azapi_resource" "policy_role_assignments" {
│ 
│ creating/updating Resource: (ResourceId
│ "/subscriptions/44a36e73-14b6-423f-9fb3-672ae9c6376d/resourceGroups/rg-GABcbag-SentinelCore/providers/Microsoft.OperationalInsights/workspaces/log-GABcbag-SentinelCore/providers/Microsoft.Authorization/roleAssignments/d7a4caa9-2f5c-5990-a3be-3ea8ba390fe8"
│ / Api Version "2022-04-01"): context deadline exceede
matt-FFFFFF commented 3 days ago

https://github.com/Azure/terraform-azurerm-avm-ptn-alz/pull/130/commits/2c62cfa2bc6da24766c842b426f8e20bf61c36f4

This is the example I am using and it deploys just fine. #RR

matt-FFFFFF commented 3 days ago

I think you might need to await the release of v2.0.1 of azapi as it contains some improvements to the retry