Azure / terraform-azurerm-avm-ptn-alz

Terraform module to deploy Azure Landing Zones
https://registry.terraform.io/modules/Azure/avm-ptn-alz/azurerm
MIT License

Path to v1.0.0 #26

Closed matt-FFFFFF closed 2 months ago

matt-FFFFFF commented 7 months ago

We are excited to announce our pathway to v1.0.0 for the module and its associated provider. We have three features that we want to deliver, which are described below.

Note: as usual with early versions, these will be breaking changes, typically to the module/provider variable input schema.

The key themes of our journey to v1.0.0 are further modularity, flexibility and convenience.


Decouple policies from module and provider

Complete ✅

We plan to separate the built-in ALZ library from its current home in Azure/alzlib, to a dedicated location.

The provider will refer to this location, with a specific tag for the version of the policies that we want to use.

E.g.

provider "alz" {
  # ...
  alz_lib_ref = "platform/alz/2024.03.00"
}

This will download the policies from the remote location and cache them locally. At the moment the cache is the .alzlib directory, which should be excluded from source control just like the .terraform directory.

We will use go-getter to do this, which is the same library that the terraform binary uses to download modules!

This will allow us to decouple the data from the logic, thus easing maintenance and providing more flexibility for you as the consumer.

Use without access to remote libraries

We recognise that some customers do not have complete access to the public internet. To support this we will continue to support local library locations:

provider "alz" {
  # ...
  use_alz_lib = false
  lib_urls    = [
    "${path.root}/lib/alz"
  ]
}

Default Policy Assignment Values

In ALZ we have the concept of re-using the same default value of a policy in multiple assignments. We will streamline this to make consumption easier.

We will introduce the concept of a library file that will store the mapping of these default values to specific policy assignment parameters. The provider will read these values, then combine them with user-specified values to create the actual assignment.

{
  "defaults": [
    {
      "defaultName": "primaryLocation",
      "policyAssignments": [
        {
          "policyAssignmentName": "Deploy-Log-Analytics",
          "parameterNames": [
            "automationRegion",
            "workspaceRegion"
          ]
        },
        {
          "policyAssignmentName": "Deploy-MDFC-Config",
          "parameterNames": [
            "ascExportResourceGroupLocation"
          ]
        }
      ]
    }
  ]
}

Again, this will allow us to decouple the data from the logic, with all the same benefits as above.
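To make the fan-out concrete, here is a minimal sketch in plain Terraform locals of how one default value could be expanded into several assignment parameters using the mapping above. This is not the provider's implementation. The names default_values, expanded and assignment_parameters, and the value "uksouth", are assumptions for illustration only.

```hcl
locals {
  # Mirrors the "defaults" library file shown above.
  defaults = [
    {
      defaultName = "primaryLocation"
      policyAssignments = [
        {
          policyAssignmentName = "Deploy-Log-Analytics"
          parameterNames       = ["automationRegion", "workspaceRegion"]
        },
        {
          policyAssignmentName = "Deploy-MDFC-Config"
          parameterNames       = ["ascExportResourceGroupLocation"]
        },
      ]
    },
  ]

  # Hypothetical user-supplied values, keyed by default name.
  default_values = {
    primaryLocation = "uksouth"
  }

  # One flat record per (assignment, parameter) pair.
  expanded = flatten([
    for d in local.defaults : [
      for pa in d.policyAssignments : [
        for p in pa.parameterNames : {
          assignment = pa.policyAssignmentName
          parameter  = p
          value      = local.default_values[d.defaultName]
        }
      ]
    ]
  ])

  # Regroup as: assignment name => { parameter name => value }.
  assignment_parameters = {
    for name in distinct(local.expanded[*].assignment) : name => {
      for e in local.expanded : e.parameter => e.value if e.assignment == name
    }
  }
}

output "assignment_parameters" {
  value = local.assignment_parameters
}
```

With these inputs, both parameters of Deploy-Log-Analytics and the single parameter of Deploy-MDFC-Config resolve to the one primaryLocation value, so the consumer supplies the region once.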


Archetype composition

Complete āœ…

There will be one place to configure one thing:

Archetype definitions in library

We will use the library/libraries to define archetypes. It will not be possible to add/remove policy definitions, policy sets, policy assignments inside the HCL.

We will introduce the concept of a delta archetype, which will allow us to specify differences to create a new deployable archetype, e.g. alz-root + x, y, z policy definitions/assignments.

Policy assignment configuration in HCL

The module/provider inputs will be constrained to only modifying policy assignments. These are usually values required in policy parameters that can only come from terraform at run-time, therefore it makes sense to use HCL for this.


for_each & unknown values

Finally, the new unknown_instances language experiment that debuted in the terraform 1.8-alpha releases will hugely simplify this module. We excitedly await the conclusion of this experiment.


Feedback!

We would love your feedback in the issue comments below. If you want to supply private feedback, my LinkedIn DMs are open: https://www.linkedin.com/in/matt-ffffff/

phx-tim-butters commented 7 months ago

Watching closely! Am working on preferring this AVM pattern over "Azure/caf-enterprise-scale/azurerm", which is what we're using for the governance/structure of LZ deployments.

Can see some glaring differences at the minute.

I'm getting an unexpected attribute when trying to address lib_urls in the alz provider at the root.

matt-FFFFFF commented 7 months ago

Watching closely! Am working on preferring this AVM pattern over "Azure/caf-enterprise-scale/azurerm", which is what we're using for the governance/structure of LZ deployments.

Can see some glaring differences at the minute.

  • I haven't seen any references to the policy structure being used. Am assuming it's https://github.com/Azure/Enterprise-Scale/blob/main/docs/wiki/ALZ-Policies.md ?

  • At the moment, I don't see a way of defining a local lib folder for the purpose of customized definitions and assignments. There's no option to specify in the root module - terraform-azurerm-avm-ptn-alz - nor can I define the alz provider at the root and pass down the provider.

I'm getting an unexpected attribute when trying to address lib_urls in the alz provider at the root.

Hi!

Thanks for reaching out 😊

Yes, the policies are the same as in the Enterprise-Scale repo. We haven't yet introduced the sync process so they aren't quite as up to date as upstream.

As for the lib_urls: the provider has been updated to support this. You should declare any provider config in your root module, which should also constrain the provider version to ~> 0.8.

After that you should be able to include your own custom stuff, or override any of the built-ins using allow_lib_overwrite.

At the moment there is no way to provide a delta to an archetype, so you must copy the definition and make changes. This will change in the future.

Does that make sense?
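A minimal sketch of the root-module configuration described in this reply. The attribute names lib_urls and allow_lib_overwrite are taken from this thread (a later comment refers to lib_overwrite_enabled, so check the schema of the provider version you pin). The ./lib path and the boolean value are assumptions.

```hcl
terraform {
  required_providers {
    alz = {
      source  = "Azure/alz"
      version = "~> 0.8" # constraint suggested in the reply above
    }
  }
}

provider "alz" {
  # Illustrative local library folder holding custom definitions and assignments.
  lib_urls = ["${path.root}/lib"]

  # Name as written in this thread; assumed to be a boolean flag that lets
  # local library files replace built-ins of the same name.
  allow_lib_overwrite = true
}
```

Because the provider is configured in the root module, the pattern module inherits this configuration without needing its own provider block.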

phx-tim-butters commented 7 months ago

At the moment there is no way to provide a delta to an archetype, so you must copy the definition and make changes. This will change in the future.

Makes perfect sense. I see what you're trying to achieve with versioning the ALZ policies also in the prospective 1.0.0 - very useful.

As for the lib_urls: the provider has been updated to support this. You should declare any provider config in your root module, which should also constrain the provider version to ~> 0.8.

After that you should be able to include your own custom stuff, or override any of the built-ins using allow_lib_overwrite.

I'm getting an error when using v0.8.0 of the alz provider and specifying lib_urls:

  1. Intellisense is still referencing v0.6.3 for some reason, so I'm getting a visual error in VS Code:
    • Unexpected Attribute for lib_urls.

     Tried with both the stable and pre-release HashiCorp Terraform extension.

  2. Error when running a plan or apply:

    alz = {
      source  = "Azure/alz"
      version = "0.8.0"
    }

    provider "alz" {
      lib_urls = ["${path.root}/lib"]
    }

    ╷
    │ Error: Failed to download libraries
    │
    │   with provider["registry.terraform.io/azure/alz"],
    │   on terraform.tf line 28, in provider "alz":
    │   28: provider "alz" {
    │
    │ relative paths require a module with a pwd
    ╵

Currently running TF 1.7.4 and language server 0.32.7.

Dropping back to 0.6.3 works without issue, but obviously I then have to go back to the older lib_dirs setting instead of lib_urls.

Sorry - I know this is a post! Would you prefer a separate issue raised?

matt-FFFFFF commented 7 months ago

Yes please raise an issue @phx-tim-butters

kewalaka commented 6 months ago

Hi @matt-FFFFFF - is the intention to be able to specify adjustments to Policy via the AVM pattern module? At the moment (if I'm following along) this is available if you make your own LZ pattern and pass lib_overwrite_enabled to the provider config.

Also - where do you see some of the existing components in the Enterprise CAF sitting - e.g. the configuration of the private DNS zones (AVM module?), and Defender (the latter I guess most would do via Policy).

At the moment I'm using both the new ALZ module & the existing one with various feature flags disabled to get coverage for an LZ, just checking there isn't another module lurking elsewhere I have missed!

I know it's still early development but it would be great if there were a couple examples illustrating common patterns - e.g. overriding a particular policy or assigning an initiative such as the Azure Sec. Benchmark.

matt-FFFFFF commented 6 months ago

Hi @kewalaka

I'd recommend modifying existing policy assignments using the new policy_assignments_to_modify input. You only need to specify the sections you want to change and it will merge in the changes.

Alternatively, you can duplicate the policyassignment*.json file and use the lib_overwrite_enabled flag.

As for other components, we have AVM modules for VWAN, as you know. The hub & spoke one is being refactored. We also have modules for vnet gateways and ALZ management resources.

For private DNS zones we have a task on our backlog to produce this module.

We do not currently have anything to support defender configuration, but this is a good idea and we will consider it. We have observed race conditions where the policy does not apply if a subscription is moved into an applicable MG too quickly.

matt-FFFFFF commented 6 months ago

As for documentation, we have one more feature to implement (above) and we will then work on creating documentation and examples.

kewalaka commented 6 months ago

Hi @matt-FFFFFF that's not quite what I was meaning. I understand I can fork the pattern module and adjust the provider config to pass in the lib_overwrite_enabled, but if I am consuming the ptn module there is no way to pass this in (in terms of implementation, can you even put vars in provider blocks?).

Interesting to know about the defender race issues. Yes, I'm using the hub "AVM" and have noted alignment issues!

kewalaka commented 6 months ago

Oh hang on, I can just define the provider block in the root module and override it there.... D'oh

matt-FFFFFF commented 6 months ago

Oh hang on, I can just define the provider block in the root module and override it there.... D'oh

Yep, that's the way we'd recommend 👍

matt-FFFFFF commented 6 months ago

See new example here: https://registry.terraform.io/modules/Azure/avm-ptn-alz/azurerm/latest/examples/policy-assignment-modification-with-custom-lib

SteveBurkettNZ commented 6 months ago

As for documentation, we have one more feature to implement (above) and we will then work on creating documentation and examples.

A walk-through on how we might use this to apply Azure Policies to an existing (brownfields) but ALZ-aligned management group topology would be fantastic @matt-FFFFFF.

Think a bunch of corps out there have the standard Root, Platform (Connectivity/Management/Identity) and Landing Zones (Corp/Online) structure that they've deployed themselves but haven't yet got many (any?) Azure Policies in play.

Is there a way to suppress the management group deployment and just point the module to the existing management groups for policy assignments?

kewalaka commented 6 months ago

Hi @SteveBurkettNZ, I'm assuming the intent is to bring the management groups within the control of terraform, in which case you could use the import block to accomplish this 'as code' without the painful terraform import CLI requirement.
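For reference, a minimal sketch of an import block (available from Terraform 1.5) that adopts an existing management group into state. The resource label, display name and group ID are hypothetical, and the example targets a standalone azurerm_management_group resource. The resource type and address used inside the AVM module may differ, so the `to` address would need adjusting when importing into the module.

```hcl
# Adopt an existing management group instead of creating a new one.
import {
  # Hypothetical resource address in the root module.
  to = azurerm_management_group.landing_zones

  # Management group resource ID; "landing-zones" is a placeholder group name.
  id = "/providers/Microsoft.Management/managementGroups/landing-zones"
}

resource "azurerm_management_group" "landing_zones" {
  name         = "landing-zones" # must match the existing group's ID
  display_name = "Landing Zones" # placeholder display name
}
```

Running terraform plan with this in place should report the group as an import rather than a create, which gives a chance to review any drift before applying.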

The other thing to consider is if you want to use alz to manage policy. It's a valid approach. Another is to use EPAC, where you would deploy the empty archetype.

https://azure.github.io/enterprise-azure-policy-as-code/#who-should-use-epac

I will mention that EPAC is fairly complex and assumes a reasonable level of comfort with the DevOps processes needed to support it.

Let me know if expanding with an example is still useful.

SteveBurkettNZ commented 6 months ago

@kewalaka: Thanks for the reply. Usually they've already deployed the management groups with Terraform, but using their own code rather than ALZ. Maybe splitting the management groups and policy bits out into two AVM PTN ALZ modules would allow easier use of the ALZ policy bits?

Yes, agree that EPAC is the alternative, need to revisit that one.

kewalaka commented 6 months ago

It feels like an antipattern to me to have the management groups done in one piece of code and the related policies done in another (if you're wanting to manage it as part of your LZ deployment vs the EPAC alternative), keen to hear @matt-FFFFFF's opinion.

Using import would allow you to bring the management group hierarchy into the AVM module without requiring a re-deploy.

It seems to me that the people who are responsible for policy would also want to be involved in the assignment of that policy to the appropriate level in the MG hierarchy, and thus logically own that, and decide when additional archetypes are required.

My other thought is how to test this safely, assuming you're going to be introducing this into a hierarchy with all your workload subscriptions attached. An injudicious "remediate" could cause some pain, which is why these sorts of tasks are sometimes done by setting up a new root management group and migrating workloads across.

Perhaps it's also an opportunity to encourage teams to leverage the great foundation Microsoft provide in this space too, rather than rolling their own.

matt-FFFFFF commented 6 months ago

Since MG and policy are so tightly coupled we would recommend that the same team manages both.

However we do come across the scenario where the top level ALZ MG is created on behalf of a team.

Given this, we need to support an existing MG in the module. Since the module is scoped to an MG, this would also enable the scenario above.