hashicorp / terraform-provider-azurerm

Terraform provider for Azure Resource Manager
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs
Mozilla Public License 2.0

Example of using the Subnet Association resources with Azure Policy #9022

Open tombuildsstuff opened 3 years ago

tombuildsstuff commented 3 years ago

Description

Larger organizations use Azure Policies to ensure that Subnets contain a Network Security Group/Route Table ID at creation time. This is incompatible with the azurerm_subnet_network_security_group_association and azurerm_subnet_route_table_association resources, which are required to work around issues in the Azure API during terraform destroy.

Unfortunately these issues are unavoidable due to the design of the Azure API. Whilst we could seek to reintroduce the network_security_group_id and route_table_id fields on the Subnet, ultimately this would reintroduce the issue and so isn't a viable solution.

With these "association" resources, an empty Subnet is created first and then subsequently patched to add the network_security_group_id and route_table_id fields. Crucially, having these split out allows operations to be ordered such that the failures during destroy can't occur.
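For reference, the split-out pattern described above looks roughly like the following. This is a minimal sketch: every resource name (example, frontend) and address range is an illustrative assumption, not taken from this issue.

```hcl
provider "azurerm" {
  features {}
}

resource "azurerm_resource_group" "example" {
  name     = "example-resources"
  location = "West Europe"
}

resource "azurerm_virtual_network" "example" {
  name                = "example-network"
  address_space       = ["10.0.0.0/16"]
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
}

resource "azurerm_network_security_group" "example" {
  name                = "example-nsg"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
}

resource "azurerm_route_table" "example" {
  name                = "example-rt"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
}

# Step 1: the Subnet is created "empty", with no NSG or Route Table attached.
resource "azurerm_subnet" "frontend" {
  name                 = "frontend"
  resource_group_name  = azurerm_resource_group.example.name
  virtual_network_name = azurerm_virtual_network.example.name
  address_prefixes     = ["10.0.2.0/24"]
}

# Step 2: separate resources patch the Subnet to attach the NSG and Route Table.
resource "azurerm_subnet_network_security_group_association" "frontend" {
  subnet_id                 = azurerm_subnet.frontend.id
  network_security_group_id = azurerm_network_security_group.example.id
}

resource "azurerm_subnet_route_table_association" "frontend" {
  subnet_id      = azurerm_subnet.frontend.id
  route_table_id = azurerm_route_table.example.id
}
```

A Deny policy that requires an NSG or Route Table at creation time rejects step 1, which is the conflict this issue describes.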

One option to work around this is to use the azurerm_virtual_network resource to define the subnets with these fields inline - however this only works when these subnets are defined centrally and so doesn't work for all users/scenarios.

As such this issue covers adding an example Azure Policy to achieve this scenario, namely:

This Policy should allow organizations to comply with Policy requirements (requiring that all Subnets have a Network Security Group/Route Table ID) whilst working around the issues in the Azure API.

XavierGeerinck commented 3 years ago

I am also having this issue.... it's quite blocking as well here.

I now create a VNET through this code:

resource "azurerm_virtual_network" "vnet_hub" {
  name                = "vnet-hub"
  location            = azurerm_resource_group.rg_connectivity_hub.location
  resource_group_name = azurerm_resource_group.rg_connectivity_hub.name
  address_space       = ["10.0.0.0/16"]

  tags = {
    environment = "global"
  }

  subnet {
    name           = "GatewaySubnet" # Required name, do not change
    address_prefix = "10.0.255.224/27"
    security_group = azurerm_network_security_group.nsg_hub_gateway.id
  }

  subnet {
    name           = "s-hub-mgmt"
    address_prefix = "10.0.0.64/27"
    security_group = azurerm_network_security_group.nsg_hub_mgmt.id
  }

  subnet {
    name           = "s-hub-dmz"
    address_prefix = "10.0.0.32/27"
    security_group = azurerm_network_security_group.nsg_hub_dmz.id
  }
}

However the main issue is that while I expect this to create the subnets, I am not able to fetch the IDs of the subnet (even though they should get exported as stated by https://github.com/terraform-providers/terraform-provider-azurerm/pull/1913). Checking into the code https://github.com/terraform-providers/terraform-provider-azurerm/blob/master/azurerm/internal/services/network/virtual_network_resource.go#L423 provides that I should be able to set subnet_id = values(azurerm_virtual_network.vnet_hub.subnet)[0].id to then initialize the gateway. This however is not working either and returns that the Set is empty.

XavierGeerinck commented 3 years ago

Update: Fixed. I can now deploy with the policy in place; below is the config I used:

# Network Security Groups (NSGs)
resource "azurerm_network_security_group" "nsg_hub_mgmt" {
  name                = "nsg-hub-mgmt"
  location            = azurerm_resource_group.rg_connectivity_hub.location
  resource_group_name = azurerm_resource_group.rg_connectivity_hub.name
}

resource "azurerm_network_security_group" "nsg_hub_dmz" {
  name                = "nsg-hub-dmz"
  location            = azurerm_resource_group.rg_connectivity_hub.location
  resource_group_name = azurerm_resource_group.rg_connectivity_hub.name
}

# Virtual Network (vNET)
resource "azurerm_virtual_network" "vnet_hub" {
  name                = "vnet-hub"
  location            = azurerm_resource_group.rg_connectivity_hub.location
  resource_group_name = azurerm_resource_group.rg_connectivity_hub.name
  address_space       = ["10.0.0.0/16"]

  tags = {
    environment = "global"
  }

  # Our Gateway Subnet, it cannot have a NSG!
  subnet {
    name           = "GatewaySubnet" # Required name, do not change
    address_prefix = "10.0.255.224/27"
  }

  subnet {
    name           = "s-hub-mgmt"
    address_prefix = "10.0.0.64/27"
    security_group = azurerm_network_security_group.nsg_hub_mgmt.id
  }

  subnet {
    name           = "s-hub-dmz"
    address_prefix = "10.0.0.32/27"
    security_group = azurerm_network_security_group.nsg_hub_dmz.id
  }
}

Asos-RiverPhillips commented 3 years ago

"this only works when these subnets are defined centrally and so doesn't work for all users/scenarios"

Do you have an example of how this can be achieved if the subnets aren't defined centrally? This is a real blocker for us.

dhensby commented 3 years ago

You have to change the azure policy if you aren't defining the vnets and subnets all at once.

courageDeveloper commented 3 years ago

@tombuildsstuff when you talk about "When attempting to create a Network Interface [IP Configuration] within the Subnet, require that a Network Security Group ID/Route Table ID exists" are you referring to the NSG created at the Network Interface or at the Subnet?

jtracey93 commented 3 years ago

Hi @courageDeveloper, @Asos-RiverPhillips & @XavierGeerinck,

You can reference the subnet attributes as part of the VNET resource block like the below:

output "subnet-id" {
  value = azurerm_virtual_network.vnet.subnet.*.id[0]
}

output "subnet-name" {
  value = azurerm_virtual_network.vnet.subnet.*.name[0]
}

Where "0" is the first subnet block defined, so GatewaySubnet in the above example.

Hope that helps
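Since the inline subnet blocks appear to be exposed as a set (the "Set is empty" error reported earlier points that way), index 0 may not correspond to the first block as written. A hedged alternative is to look a subnet up by name. The sketch below assumes a virtual network named vnet, as in the outputs above, and that each element exposes name and id; the local name subnet_ids_by_name is made up for illustration.

```hcl
locals {
  # Map of subnet name => subnet ID, built from the inline subnet blocks.
  subnet_ids_by_name = {
    for s in azurerm_virtual_network.vnet.subnet : s.name => s.id
  }
}

output "gateway_subnet_id" {
  # Look the subnet up by its name instead of by position.
  value = local.subnet_ids_by_name["GatewaySubnet"]
}
```

Keying by name keeps the reference stable if subnet blocks are added, removed, or reordered.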

hobti01 commented 2 years ago

I see mentions of service_endpoints, but not PrivateLink Endpoints.

Inline subnet does not support attributes enforce_private_link_endpoint_network_policies or enforce_private_link_service_network_policies and with Azure Policies in place, inline is the only method to create subnets - therefore subnets cannot be created that support Private Link Services or Endpoints.

We're in an environment where we do not have the luxury of disabling or changing Azure Policies, so that approach is not an option open to us.

nela commented 2 years ago

I am facing the same problem as @hobti01. More detailed description here: https://discuss.hashicorp.com/t/circumvent-azure-policy-deny-subnet-without-nsg/32422

dhensby commented 2 years ago

As far as I'm aware there are two options:

  1. Create the subnet/nsg as an all-in-one resource in terraform
  2. Disable the Azure Policy

If both are unavailable to you then you're at a dead end, because I do not believe there is any other feasible way to do this.

weisdd commented 2 years ago

@tombuildsstuff I've tried to play with inline NSG association by reintroducing the security_group to azurerm_subnet in https://github.com/weisdd/terraform-provider-azurerm/commit/8ae1cdb66a43e341ebeee1d798c3168a9503e206 , and haven't seen any errors during multiple apply-destroy manual tests. Could you, please, share an example of code from the past that would more or less consistently reproduce the issue you're referring to in the first message? (https://github.com/hashicorp/terraform-provider-azurerm/issues/9022#issue-729540778) Since it's been over 1.5 years, maybe something has improved on the Azure side. I'd love to test it all. Thanks!

(Upd): Here's the example I'm using:

provider "azurerm" {
  features {}
}

resource "azurerm_resource_group" "example" {
  name     = "example-resources"
  location = "West Europe"
}

resource "azurerm_virtual_network" "example" {
  name                = "example-network"
  address_space       = ["10.0.0.0/16"]
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
}

resource "azurerm_subnet" "example" {
  name                 = "frontend"
  resource_group_name  = azurerm_resource_group.example.name
  virtual_network_name = azurerm_virtual_network.example.name
  address_prefixes     = ["10.0.2.0/24"]
  security_group       = azurerm_network_security_group.example.id
}

resource "azurerm_network_security_group" "example" {
  name                = "example-nsg"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name

  security_rule {
    name                       = "test123"
    priority                   = 100
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "*"
    source_address_prefix      = "*"
    destination_address_prefix = "*"
  }
}

$ terraform apply
[...]
azurerm_resource_group.example: Creating...
azurerm_resource_group.example: Creation complete after 1s [id=/subscriptions/64842ced-4781-416f-81ff-482b7f562581/resourceGroups/example-resources]
azurerm_virtual_network.example: Creating...
azurerm_network_security_group.example: Creating...
azurerm_virtual_network.example: Creation complete after 4s [id=/subscriptions/64842ced-4781-416f-81ff-482b7f562581/resourceGroups/example-resources/providers/Microsoft.Network/virtualNetworks/example-network]
azurerm_network_security_group.example: Creation complete after 4s [id=/subscriptions/64842ced-4781-416f-81ff-482b7f562581/resourceGroups/example-resources/providers/Microsoft.Network/networkSecurityGroups/example-nsg]
azurerm_subnet.example: Creating...
azurerm_subnet.example: Creation complete after 4s [id=/subscriptions/64842ced-4781-416f-81ff-482b7f562581/resourceGroups/example-resources/providers/Microsoft.Network/virtualNetworks/example-network/subnets/frontend]

Apply complete! Resources: 4 added, 0 changed, 0 destroyed.

$ terraform destroy
[...]

azurerm_subnet.example: Destroying... [id=/subscriptions/64842ced-4781-416f-81ff-482b7f562581/resourceGroups/example-resources/providers/Microsoft.Network/virtualNetworks/example-network/subnets/frontend]
azurerm_subnet.example: Still destroying... [id=/subscriptions/64842ced-4781-416f-81ff-...works/example-network/subnets/frontend, 10s elapsed]
azurerm_subnet.example: Destruction complete after 11s
azurerm_virtual_network.example: Destroying... [id=/subscriptions/64842ced-4781-416f-81ff-482b7f562581/resourceGroups/example-resources/providers/Microsoft.Network/virtualNetworks/example-network]
azurerm_network_security_group.example: Destroying... [id=/subscriptions/64842ced-4781-416f-81ff-482b7f562581/resourceGroups/example-resources/providers/Microsoft.Network/networkSecurityGroups/example-nsg]
azurerm_virtual_network.example: Still destroying... [id=/subscriptions/64842ced-4781-416f-81ff-...etwork/virtualNetworks/example-network, 10s elapsed]
azurerm_network_security_group.example: Still destroying... [id=/subscriptions/64842ced-4781-416f-81ff-...work/networkSecurityGroups/example-nsg, 10s elapsed]
azurerm_virtual_network.example: Destruction complete after 11s
azurerm_network_security_group.example: Destruction complete after 11s
azurerm_resource_group.example: Destroying... [id=/subscriptions/64842ced-4781-416f-81ff-482b7f562581/resourceGroups/example-resources]
azurerm_resource_group.example: Still destroying... [id=/subscriptions/64842ced-4781-416f-81ff-...62581/resourceGroups/example-resources, 10s elapsed]
azurerm_resource_group.example: Destruction complete after 15s

Destroy complete! Resources: 4 destroyed.

Upd2: I saw your comment in https://github.com/hashicorp/terraform-provider-azurerm/issues/3653#issuecomment-501995196, which said:

"There's a bunch of reasons for this (mostly user experience related), but the main one is to work around a circular reference in Azure, where it's possible to create resources in any order; but on deletion you encounter an error where the Subnet deletion will fail unless the Network Security Group is detached first."

When I was doing my tests, there was a policy in place that would prohibit me from creating a subnet without an NSG attached, or from detaching the NSG from the subnet after creation. Thus, it seems it's now possible to remove a subnet without detaching the NSG first.

weisdd commented 1 year ago

@tombuildsstuff to make sure the things I observed (shared in the previous comment) are not temporary, I opened a ticket with Microsoft from our corporate account:

(Question): "[...] I've tried to reintroduce NSG reference back to the azurerm_subnet resource (https://github.com/weisdd/terraform-provider-azurerm/commit/8ae1cdb66a43e341ebeee1d798c3168a9503e206) and found out that now it's actually possible to delete a subnet without detaching NSG, so Hashicorp's complaint seems to be no longer relevant. Could you track, when did that change happen in the Azure API backend? Will that behaviour persist over time? [...]"

(Answer): "I confirm that the logic that allow subnet deletion without detaching NSG or UDR is permanent."

Unfortunately, I don't yet know the exact date when this change was introduced; I'm still trying to get that info from Microsoft.

Given this information, would Hashicorp consider reintroducing the binding logic back to the subnet resource? :) I understand that you've been flooded with similar requests, though I haven't seen anyone saying that the Azure backend logic has got fixed at some point, thus everything is likely to work flawlessly now. I can prepare a PR with tests if needed.

Since we have contracts with both Hashicorp and Microsoft, we can try to organise an internal email thread with all parties involved if there is any concern.

Thanks!

tombuildsstuff commented 1 year ago

@weisdd unfortunately reintroducing these fields to the azurerm_subnet resource would end up reintroducing the (hard to debug, but common in complex configurations) issues which caused this - so we wouldn't accept a PR to reintroduce those fields.

If the Subnet's Delete API has changed its behaviour then we should look to update the Delete method in the azurerm_subnet resource to account for this (checking and raising an error if there's an association found) - but since that's a separate topic to this issue, would you mind opening a new feature request for that?

Thanks!

sioakim commented 1 year ago

Our scenario is the same as what Tom mentions. A default UDR is applied via policy - so when I tried to apply the association I would get that a resource already exists and I had to import it.

After being on a call yesterday with @grayzu I was pointed to the azapi_update_resource resource (from the azapi provider) to test out a possible solution. For my needs I call this a success! I create my subnet and NSG association with the azurerm provider and the UDR association with azapi_update_resource:

resource "azapi_update_resource" "subnet-asp-udr" {
  type        = "Microsoft.Network/virtualNetworks/subnets@2020-11-01"
  resource_id = azurerm_subnet.asp.id
  depends_on = [
    azurerm_subnet.asp
  ]
  body = jsonencode({
    properties = {
      routeTable = {
        id = local.udrdefaultid
      }
    }
  })
}

I even tested it for configuration drift (I changed the UDR from the portal manually) and it was picked up by terraform plan.

Please add this to a list of workarounds.
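As an alternative to the azapi route, the import step mentioned at the start of this comment could be expressed declaratively on Terraform 1.5 or later with an import block. This is only a sketch: the resource address azurerm_subnet_route_table_association.asp and the subscription, resource group, and network names in the ID are placeholders, it assumes the association is imported using the Subnet's resource ID, and azurerm_subnet.asp and local.udrdefaultid are the ones referenced in the snippet above.

```hcl
# Adopt the association that the policy already created, instead of
# failing with "resource already exists" on apply.
import {
  to = azurerm_subnet_route_table_association.asp
  # Placeholder ID: substitute the real Subnet resource ID.
  id = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/example-rg/providers/Microsoft.Network/virtualNetworks/example-vnet/subnets/asp"
}

resource "azurerm_subnet_route_table_association" "asp" {
  subnet_id      = azurerm_subnet.asp.id
  route_table_id = local.udrdefaultid
}
```

This only helps when the policy modifies the subnet rather than denying its creation, which matches the scenario described in this comment.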

weisdd commented 1 year ago

@sioakim I might misunderstand something, though how is that different from using azurerm_subnet_route_table_association directly? I think the call to modify the subnet properties is made after subnet creation, so if there's a blocking (Deny) policy in place, terraform will never execute azapi_update_resource, because it'll fail earlier, at the subnet creation step.

sioakim commented 1 year ago

@weisdd As per initial post: Larger organizations utilize Azure Policies to ensure that a Subnets contain a Network Security Group/Route Table ID at creation time

For me this translates to having a policy that will add the subnet association and/or the NSG association on creation. Note you can not specify the associations from Terraform on creation. You need to create them as a second step. The problem is that if the policy has added an association already your 2nd step in terraform (that is specifying the association) will fail. Our policy doesn’t block Update. It just wants to make sure that you don’t have a blank value for the UDR.

sagitiminsky commented 1 year ago

resource "azapi_resource" "cosmos_vnet" {
  # issue placeholder

  type      = "Microsoft.Network/virtualNetworks@2022-05-01"
  name      = <>
  parent_id = <resource_group_id>
  location  = <>

  depends_on = []

  body = jsonencode({
    properties = {
      addressSpace = {
        addressPrefixes = <>
      }
      subnets = [
        {
          name = <>
          properties = {
            addressPrefix = <>
            networkSecurityGroup = {
              id = <azurerm_network_security_group.example.id>
            }
            routeTable = {
              id = <azurerm_route_table.example.id>
            }
          }
          type = "Microsoft.Network/virtualNetworks/subnets"
        },
        {
          name = <>
          properties = {
            addressPrefix = <>
            networkSecurityGroup = {
              id = <azurerm_network_security_group.example.id>
            }
          }
          type = "Microsoft.Network/virtualNetworks/subnets"
        }
      ]
    }
  })
}

This solution seems to be working for us when policies are in place. But the use of azapi_resource introduces a dependency between the azurerm and azapi providers. Has anybody found a solution that uses solely azurerm?

adeturner commented 1 year ago

UPDATE: Apologies, I made an error. The terraform issue still exists; however, for the other two methods I was getting the policy violation because I was not using a specific route table (not mentioned in the error). When I switched to it, the azapi update and az cli both worked.

I also have this with azurerm_subnet + azurerm_subnet_route_table_association. I don't own the vnet, and tried az cli and azapi below. It appears there is no option but to manually create the subnet in the portal (which works).

az network vnet subnet create ... --route-table... fails.

as does

resource "azapi_update_resource" "add_bridge_subnet" {
  type        = "Microsoft.Network/virtualNetworks@2022-05-01"
  resource_id = data.azurerm_virtual_network.protected_vnet.id

  body = jsonencode({
    properties = {
      subnets = [
        {
          name = "${var.resource_prefix}-sn"
          properties = {
            addressPrefix = var.bridge_subnet_cidr
            routeTable = {
              id = data.azurerm_route_table.spoke_rt.id
            }
          }
          type = "Microsoft.Network/virtualNetworks/subnets"
        }
      ]
    }
  })

  depends_on = [
    data.azurerm_route_table.spoke_rt,
    data.azurerm_virtual_network.protected_vnet,
  ]
}

│   "error": {
│     "code": "RequestDisallowedByPolicy",
│     "message": "Resource N was disallowed by policy. Policy identifiers.........Enforce a RouteTable on every subnet....

weisdd commented 1 year ago

A bit of a side-note, but still related to the same policy that's being discussed here: AKS clusters with kubenet CNI (=managed vnet) can now be deleted (with no extra actions) despite the policy in place, it works for both Azure Portal and azurerm provider:

The current plan is to still detach the NSG before deleting the subnet (it handles some useful edge cases for us, so we don't want to remove it), but to make the deletion best-effort - if it does fail due to policy or some other reason, we'll proceed with cleaning everything up and deleting the node resource group. We'll log the error on our side for support but not return a failure state to you unless we're actually unable to delete everything.

https://github.com/Azure/AKS/issues/3111#issuecomment-1460254365

robertbrandso commented 1 year ago

If you're hit with this limitation and need to create subnets with configuration like delegation, service endpoint and private endpoint support you can't use the solution within azurerm_virtual_network resource.

I posted an example in the following issue on how to solve it using the azapi_resource: https://github.com/hashicorp/terraform-provider-azurerm/issues/3917#issuecomment-1550246929

Full documentation can be found here: Microsoft.Network virtualNetworks/subnets
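For readers who can't follow the link, a subnet created in a single call through azapi_resource might look like the sketch below. It is not the example from the linked comment. The resource names, address prefix, delegation, and service endpoint are illustrative assumptions; the property names follow the Microsoft.Network virtualNetworks/subnets schema; and body uses jsonencode as in the snippets earlier in this thread (newer azapi releases may expect a plain HCL object instead). It also assumes a virtual network, NSG, and route table named example are defined elsewhere.

```hcl
resource "azapi_resource" "app_subnet" {
  type      = "Microsoft.Network/virtualNetworks/subnets@2022-05-01"
  name      = "snet-app" # hypothetical subnet name
  parent_id = azurerm_virtual_network.example.id

  body = jsonencode({
    properties = {
      addressPrefix = "10.0.3.0/24"

      # NSG and Route Table are part of the create request, so a policy
      # that checks for them at creation time can be satisfied.
      networkSecurityGroup = {
        id = azurerm_network_security_group.example.id
      }
      routeTable = {
        id = azurerm_route_table.example.id
      }

      # Settings the inline subnet block reportedly doesn't cover.
      serviceEndpoints = [
        { service = "Microsoft.Storage" }
      ]
      delegations = [
        {
          name = "app-service"
          properties = {
            serviceName = "Microsoft.Web/serverFarms"
          }
        }
      ]
      privateEndpointNetworkPolicies = "Disabled"
    }
  })
}
```

Other resources can then reference the subnet through azapi_resource.app_subnet.id.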

thesse1 commented 1 year ago

Update: Fixed. I can now deploy with the policy in place; below is the config I used:

# Network Security Groups (NSGs)
resource "azurerm_network_security_group" "nsg_hub_mgmt" {
  name                = "nsg-hub-mgmt"
  location            = azurerm_resource_group.rg_connectivity_hub.location
  resource_group_name = azurerm_resource_group.rg_connectivity_hub.name
}

resource "azurerm_network_security_group" "nsg_hub_dmz" {
  name                = "nsg-hub-dmz"
  location            = azurerm_resource_group.rg_connectivity_hub.location
  resource_group_name = azurerm_resource_group.rg_connectivity_hub.name
}

# Virtual Network (vNET)
resource "azurerm_virtual_network" "vnet_hub" {
  name                = "vnet-hub"
  location            = azurerm_resource_group.rg_connectivity_hub.location
  resource_group_name = azurerm_resource_group.rg_connectivity_hub.name
  address_space       = ["10.0.0.0/16"]

  tags = {
    environment = "global"
  }

  # Our Gateway Subnet, it cannot have a NSG!
  subnet {
    name           = "GatewaySubnet" # Required name, do not change
    address_prefix = "10.0.255.224/27"
  }

  subnet {
    name           = "s-hub-mgmt"
    address_prefix = "10.0.0.64/27"
    security_group = azurerm_network_security_group.nsg_hub_mgmt.id
  }

  subnet {
    name           = "s-hub-dmz"
    address_prefix = "10.0.0.32/27"
    security_group = azurerm_network_security_group.nsg_hub_dmz.id
  }
}

This does not seem to work anymore (Terraform v1.5.1 on windows_386 + provider registry.terraform.io/hashicorp/azurerm v3.63.0). When I try to create a vnet with embedded subnet, I am getting the following error message:

╷
│ Error: Incorrect attribute value type
│
│   on dev-ops-agents.tf line 22, in resource "azurerm_virtual_network" "default":
│   22:   subnet = [
│   23:     {
│   24:       address_prefix = "10.0.0.0/24"
│   25:       name           = "default"
│   26:       security_group = azurerm_network_security_group.default.id
│   27:     }
│   28:   ]
│
│ Inappropriate value for attribute "subnet": element 0: attribute "id" is required.
╵

Here is the full code:

provider "azurerm" {
  features {}

  subscription_id = "${var.subscriptionId}"
}

resource "azurerm_resource_group" "default" {
  name     = "vmssagents"
  location = "West Europe"

  tags = {
    environment = "azuredevops"
  }
}

resource "azurerm_virtual_network" "default" {
  name                = "vmssagents-vnet"
  address_space       = ["10.0.0.0/16"]
  location            = azurerm_resource_group.default.location
  resource_group_name = azurerm_resource_group.default.name
  subnet                  = [
      {
          address_prefix = "10.0.0.0/24"
          name           = "default"
          security_group = azurerm_network_security_group.default.id
      }
  ]

  tags = {
    environment = "azuredevops"
  }
}

resource "azurerm_network_security_group" "default" {
  name                = "vmssagents-vnet-nsg"
  location            = azurerm_resource_group.default.location
  resource_group_name = azurerm_resource_group.default.name

  tags = {
    environment = "azuredevops"
  }
}

Why does it expect me to deliver the ID of a resource that I am asking it to create?

Please help. Thanks!

weisdd commented 1 year ago

@thesse1 you need to specify each subnet in a separate block whereas you're using an array:

  subnet                  = [
      {
          address_prefix = "10.0.0.0/24"
          name           = "default"
          security_group = azurerm_network_security_group.default.id
      }
  ]

=>

  subnet {
      address_prefix = "10.0.0.0/24"
      name           = "default"
      security_group = azurerm_network_security_group.default.id
  }

thesse1 commented 1 year ago

@weisdd Thanks a lot for the hint. Makes sense. Working fine!

haflidif commented 8 months ago

Here is another workaround to mitigate the policy issue: I created a terraform module that uses the Azure/azapi provider to create subnets that support delegation and service endpoints. It can use an existing NSG and Route Table, or create the NSG and Route Table in one go.

It should help until an official resolution within azurerm is in place.

Terraform Registry: haflidif/alz-subnet/azurerm
Github Repo: terraform-azurerm-alz-subnet

APEX-DUSAN-ZAKIC commented 1 month ago

Has the policy example been created for this issue? I use "Azure/caf-enterprise-scale/azurerm" for deploying LZ for Azure. Constantly disabling or adding exemptions on policy whenever an issue occurs like this is not really a solution...

papagalu commented 2 weeks ago

Encountered the same issue. Unless the subnet interface is fully replicated under the vnet resource (e.g. delegations), this will remain an open issue.