Azure / terraform-azurerm-lz-vending

Terraform module to deploy landing zone subscriptions (and much more) in Azure
https://registry.terraform.io/modules/Azure/lz-vending/azurerm
MIT License
161 stars 70 forks source link

bug: Onboarding subscriptions created outside the module is destructive #386

Open csmith66 opened 2 months ago

csmith66 commented 2 months ago

Community Note

Versions

Please paste the output of terraform version command from within the initialized directory:

Terraform v1.8.4
on windows_amd64
+ provider registry.terraform.io/azure/azapi v1.13.1
+ provider registry.terraform.io/hashicorp/azurerm v3.106.1
+ provider registry.terraform.io/hashicorp/time v0.11.2

Please enter the module version that you are using:

4.1.2

Description

When trying to on-board existing subscriptions into the module for resource management (tags, display name, etc.), imports result in destructive behavior. Short of manually creating a Terraform state file by hand, this effectively prevents the ability to manage subscriptions that may have been created outside the module.

Steps to Reproduce

  1. Create a root module that includes both an import block and the terraform-azurerm-lz-vending module
  2. Execute terraform plan
import {
  id = "/providers/Microsoft.Subscription/aliases/my-sub-123"
  to = module.lz_vending_subscription.module.subscription[0].azurerm_subscription.this[0]
}

module "lz_vending_subscription" {
  source                     = "../Modules/terraform-azurerm-lz-vending/4.1.2"

  subscription_billing_scope = "<appropriate_billing_scope>"
  subscription_id            = "" # Managing subscription, not Management Groups
  subscription_alias_enabled = true
  subscription_alias         = "my-sub-123"
  subscription_display_name  = "My Subscription 123"
  subscription_workload      = "Production"
  location                   = var.sub_location

  subscription_management_group_association_enabled = false
  virtual_network_enabled                           = false

  role_assignment_enabled = true
  role_assignments        = var.sub_role_assignments

  subscription_tags = var.sub_tags
}

Expected Output

  # module.lz_vending_subscription.module.subscription[0].azurerm_subscription.this[0] will be imported
    resource "azurerm_subscription" "this" {
        alias             = "my-sub-123"
        billing_scope_id  = "***"
        id                = "/providers/Microsoft.Subscription/aliases/my-sub-123"
        subscription_id   = "***"
        subscription_name = "My Subscription 123"
        tags              = {
            "Key1"    = "Value1"
            "Key2"    = "Value2"
        }
        tenant_id         = "***"
        workload          = "Production"

Actual Output

  # module.lz_vending_subscription.module.subscription[0].azurerm_subscription.this[0] must be replaced
  # (imported from "/providers/Microsoft.Subscription/aliases/my-sub-123")
  # Warning: this will destroy the imported resource
-/+ resource "azurerm_subscription" "this" {
        alias             = "my-sub-123"
      + billing_scope_id  = "***"
      ~ id                = "/providers/Microsoft.Subscription/aliases/my-sub-123" -> (known after apply)
      ~ subscription_id   = "***" -> (known after apply)
        subscription_name = "My Subscription 123"
      ~ tags              = {
          + "Key1"    = "Value1"
          + "Key2"    = "Value2"
        }
      ~ tenant_id         = "***" -> (known after apply)
      + workload          = "Production" # forces replacement

Additional context

matt-FFFFFF commented 2 months ago

Hi,

I do not believe that this is a bug in this module as it is behaving as designed. Import of existing subscriptions is not a use case that we have considered or tested against. As you mention, we support existing subscription IDs for MG membership and deploying resources, e.g. vnets.

Have you tried using the subscription_use_AzAPI toggle? This will support rename and tagging.

If that doesn't work - this module uses azurerm_subscription from the AzureRM provider so I think this issue should be raised there as there is nothing we can do in this module.

If the API is giving inconsistent results then this might make it challenging to fix and could be one to take back to support for the attention of the resource provider team.

Let me know how you get on

pho-enix commented 2 months ago

@csmith66 I managed to successfully import subscriptions the same way you are trying to do. The issue is, that the Subscription Alias resource has a workload field, which can even be written via the API. Nevertheless it will always be return null and thus any imported subscription would be destroyed.

Workaround on my side was to ignore that particular field: https://github.com/Azure/terraform-azurerm-lz-vending/blob/097ae5a0227c958d929ae545c6b17a6336e111b8/modules/subscription/main.tf#L2

+  lifecycle {
+   ignore_changes = [
+      workload,
+    ]
+  }
matt-FFFFFF commented 2 months ago

Hi!

If you use AzAPI then don't import the alias resource as it's largely pointless after creation. You can even let it create one for an existing subscription and it'll have no real effect.

The module has features that allow it to manage the subscription tags and name using azapi requests. I hope that these can fit your use case.

csmith66 commented 2 months ago

As @pho-enix mentioned, modifying this module to ignore changes to the workload argument of the azurerm_subscription resource would be a low-effort bug fix.

+  lifecycle {
+   ignore_changes = [
+      workload,
+    ]
+  }

I tested this change on a local copy of the module as well and it eliminated the forced replace, allowing successful import of the existing subscription.

@matt-FFFFFF, thanks for some prompt feedback.

I do not believe that this is a bug in this module as it is behaving as designed. Import of existing subscriptions is not a use case that we have considered or tested against. As you mention, we support existing subscription IDs for MG membership and deploying resources, e.g. vnets.

This is incredibly surprising that import of existing subscriptions has not been a considered use case. I would suggest it is a highly important use case specifically applicable to this module.

Have you tried using the subscription_use_AzAPI toggle? This will support rename and tagging.

I had not tried it previously, but here are the results of trying a few tests.

Suggested Change

Enable the AzAPI toggle.

module "lz_vending_subscription" {
  ...
  subscription_use_AzAPI = true
}

Suggested Change Result

# module.azure_landing_zone_subscription.module.subscription[0].azapi_resource.subscription[0] will be created
  + resource "azapi_resource" "subscription" {
      + body                      = jsonencode(
            {
              + properties = {
                  + additionalProperties = {
                      + managementGroupId = null
                      + tags              = {
                          + Key1 = "Value1"
                          + Key2 = "Value2"
                        }
                    }
                  + billingScope         = "***"
                  + displayName          = "My Subscription 123"
                  + workload             = "Production"
                }
            }
        )
      + id                        = (known after apply)
      + ignore_casing             = false
      + ignore_missing_property   = true
      + name                      = "my-sub-123"
      + output                    = (known after apply)
      + parent_id                 = "/"
      + removing_special_chars    = false
      + response_export_values    = [
          + "properties.subscriptionId",
        ]
      + schema_validation_enabled = true
      + type                      = "Microsoft.Subscription/aliases@2021-10-01"
    }

  # module.azure_landing_zone_subscription.module.subscription[0].azapi_resource_action.subscription_cancel[0] will be created
  + resource "azapi_resource_action" "subscription_cancel" {
      + action      = "providers/Microsoft.Subscription/cancel"
      + id          = (known after apply)
      + method      = "POST"
      + output      = (known after apply)
      + resource_id = (known after apply)
      + type        = "Microsoft.Resources/subscriptions@2021-10-01"
      + when        = "destroy"
    }

  # module.azure_landing_zone_subscription.module.subscription[0].azapi_resource_action.subscription_rename[0] will be created
  + resource "azapi_resource_action" "subscription_rename" {
      + action      = "providers/Microsoft.Subscription/rename"
      + body        = jsonencode(
            {
              + subscriptionName = "My Subscription 123"
            }
        )
      + id          = (known after apply)
      + method      = "POST"
      + output      = (known after apply)
      + resource_id = (known after apply)
      + type        = "Microsoft.Resources/subscriptions@2021-10-01"
      + when        = "apply"
    }

This plan result is very concerning because it creates a new azapi_resource_action.subscription_cancel action. The plan does not show order of operations, but I would have to assume it is going to try to cancel the existing subscription and recreate it. This is basically the same issue as the azurerm provider resource being used in the module. This re-enforces my view this is a bug with this module, although there are upstream bugs that probably also need to be worked.

If you use AzAPI then don't import the alias resource as it's largely pointless after creation. You can even let it create one for an existing subscription and it'll have no real effect.

I think I agree on this point about the import, as the create alias operation should be idempotent if I remember correctly. Whether it creates a "new" subscription or imports an existing subscription it will still result in an azapi_resource instance in the state file.

Incremental Change

Explicitly pass the subscription Id of an existing resource.

module "lz_vending_subscription" {
  ...
  subscription_use_AzAPI = true
  subscription_id = "***"
}

Incremental Testing Result

No plan actions to create, rename, or cancel the subscription. This seems in line with the expected result. However, the below documentation gives pause to whether this is in line with intended use or it will cause other issues.

https://github.com/Azure/terraform-azurerm-lz-vending/blob/097ae5a0227c958d929ae545c6b17a6336e111b8/variables.subscription.tf#L114-L128

I am sure there are probably others who do both, but as previously stated, the explicit intent here is to have this module vend/manage the subscriptions and not the Management Group associations.

matt-FFFFFF commented 2 months ago

Hi @csmith66

This is incredibly surprising that import of existing subscriptions has not been a considered use case. I would suggest it is a highly important use case specifically applicable to this module.

The primary use case for this module is to simplify the creation of subscriptions at scale - hence we have not encountered this before and it seems to be an edge case.

Adding lifecycle block

We will consider the effect of this:

lifecycle {
  ignore_changes = [
    workload,
  ]
}

The alias resource

The problem is that the alias resource is awfully strange. It is a proxy to create a subscription, OR it can be created for an existing subscription (though the benefits of doing so are no clear). Once created, changes to properties.aditionalProperties are not passed onto the subscription that has been created and they are effectively read-only. You can also have multiple aliases pointing at one subscription (again, no idea why you'd want to do this).

Deleting the alias does NOT cancel the subscription. However azurerm tries to be helpful and do this for you. Unless you are managing all resources within that subscription in the same plan, it is highly likely that this cancel operation will fail.

Using AzAPI

We introduced an alternative way of managing the subscription alias resource for customers that had no access to the default management group - azurerm creates the sub in default mg and then moves it, this is problematic in some organizations.

The use of this...

module "lz_vending_subscription" {
  # ...
  subscription_use_AzAPI = true
}

Produces an expected result. Do not be concerned, the subscription cancel operation has when = "destroy" set, which means it will only fire on destroy and not on create. We have tried to mimic all of the functionality of the azurerm_subscription resource using azapi.

In the case that you had the subscription_use_AzAPI = true enabled, you would be able to import the existing alias (if there was one):

import {
  to = module.azure_landing_zone_subscription.module.subscription[0].azapi_resource.subscription[0]
  id = "/providers/Microsoft.Subscription/aliases/foo"
}

If there is no alias for the existing subscription then things are more tricky. If all you want to do is manage tags and display name then I suggest copying the azapi_update_resource (tags) and azapi_resource_action (rename) from this module and implementing yourself.

Again, managing naming and tags for existing subscriptions is not something we have expressly designed for. We only manage MG membership and deploy resources through the other sub-modules.

Thanks for following this slightly lengthy response.

kevinmccurdybrd commented 2 months ago

@matt-FFFFFF I can give you an example of where it isn't an edge case: We're an organization where we used to be on a legacy EA but moved our Azure spend to a CSP, which means we no longer can create our own subscriptions. In this case the CSP partner creates our subscription and then we assume control of it with the LZVM to manage other aspects, as well as our EA created subscriptions. We may move back to an MSA at some point in the future (I don't love the CSP thing TBH). That said, I understand that the primary intent of this module is for scale automation, but there are cases where managing outside previously created subs is relevant.

Edit: I gave this a read again with more coffee, and I think I missed a key detail above that affects relevance of my comment, however I'll leave this comment as an example of edge use-cases.

csmith66 commented 2 months ago

@kevinmccurdybrd, I think your comment is relevant, as this whole migrating between billing account types is partially what prompted the changes that ultimately lead to this request.

@matt-FFFFFF,

The primary use case for this module is to simplify the creation of subscriptions at scale - hence we have not encountered this before and it seems to be an edge case.

I guess if the main use case is only the creation step at scale and not subscription lifecycle management at scale, that makes sense. For this implementation, it is an end-to-end "vending" experience at scale. The identity that creates the subscription manages the entire subscription resource lifecycle, including cancellation (and removal of any lingering resources that might prevent cancelling).

We introduced an alternative way of managing the subscription alias resource for customers that had no access to the default management group - azurerm creates the sub in default mg and then moves it, this is problematic in some organizations.

That is understandable. In this case, the default MG is the desired location and it grants all the appropriate access required.

Do not be concerned, the subscription cancel operation has when = "destroy" set, which means it will only fire on destroy and not on create. We have tried to mimic all of the functionality of the azurerm_subscription resource using azapi.

I guess I missed that qualifier in the output. I probably would want to do some additional validation on a sandbox subscription.

For the immediate import issue, I am just going to tweak a local copy of the module with the suggested update by @pho-enix. Later I will probably do an more thorough impact assessment and conversion over to the AzAPI functionality. One change at a time though.

I will leave it up to you on how you want to proceed, but I think it would be good to have import parity whether we are using the azurerm or azapi configuration.

matt-FFFFFF commented 2 months ago

@csmith66

Would it be feasible that we allowed a null value to workload to support import?

I think this could be a good compromise?

matt-FFFFFF commented 2 months ago

@kevinmccurdybrd thank you. This context really helps.

matt-FFFFFF commented 2 months ago

@csmith66 regarding the cancel operation, please be reassured by the provider documentation:

https://registry.terraform.io/providers/Azure/azapi/latest/docs/resources/azapi_resource_action#when

An inspection of the code also reveals this to be the case:

https://github.com/Azure/terraform-provider-azapi/blob/a5398c1df0c6b8a91dd4416ce1c1b9e4ba6baa4b/internal/services/azapi_resource_action_resource.go#L206-L208

kevinmccurdybrd commented 2 months ago

@matt-FFFFFF I can start a separate issue for this if it's needed, but the other key functionality I'd like to have with this is where I can use a single set of YAML files to define the subs, and the ones with a subscription_id target that premade subscription, else create sub. I'm not sure this is possible right now currently, though I am admittedly "intermediate" in this regard. We used to use YAML files to do all our EA subs, but if I'm not mistaken as soon as I add subscription_id to the YAML and update the module config to expect "subscription_id" that it's going to screw up all the existing module created subscriptions?

Again, happy to move this to another issue, just sorta related to this "existing sub" management functionality.

matt-FFFFFF commented 2 months ago

@kevinmccurdybrd

You can do this now, for the existing subscription id, just set that to null in the instances where you have vended a subscription - either that or use lookup and set the default to null in that function.

Specifying null is the same as not specifying the value at all.

kevinmccurdybrd commented 2 months ago

Thanks Matt, I'll bug out now, all the best, and thanks for all the work on these modules. So helpful!

csmith66 commented 2 months ago

Would it be feasible that we allowed a null value to workload to support import?

I think this could be a good compromise?

If you can get it to work, I am open to alternative implementations. I think this might require a functionality change to the azurerm_subscription resource though.

matt-FFFFFF commented 2 months ago

@csmith66

It seems workload is optional:

https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/subscription#workload

Hence null would ignore.

luke-taylor commented 2 months ago

@csmith66 For this specific use case it seems like setting subscription_update_existing = true would be a happy compromise here. This in conjunction with setting the subscription_id will allow you to update the subscription_tags and subscription_display_name arguments of the existing subscription.