hashicorp / terraform-provider-azurerm

Terraform provider for Azure Resource Manager
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs
Mozilla Public License 2.0

azurerm_ip_group_cidr is added to terraform state even though API call failed with IpGroupsUpdateFailed #27279

Open akselleirv opened 1 week ago

akselleirv commented 1 week ago

Terraform Version

1.8.4

AzureRM Provider Version

4.0.1

Affected Resource(s)/Data Source(s)

azurerm_ip_group_cidr

Terraform Configuration Files

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "4.0.1"
    }
  }
}

provider "azurerm" {
  features {}
}

data "azurerm_ip_group" "main" {
  name                = var.values.ip_group_name
  resource_group_name = var.values.ip_group_resource_group_name
}

resource "azurerm_ip_group_cidr" "main" {
  for_each    = var.values.cidrs
  ip_group_id = data.azurerm_ip_group.main.id
  cidr        = each.key
}

variable "values" {
  type = object({
    ip_group_name                = string
    ip_group_resource_group_name = string
    cidrs                        = optional(set(string))
  })
}

Debug Output/Panic Output

Apply error: rpc error: code = Internal desc = exit status 1

Error: creating Ip Group Cidr: (Cidr Name "10.33.94.5" / Ip Group Name "<IPGroupName>" / Resource Group "<rg-redacted>"): polling after CreateOrUpdate: polling failed: the Azure API returned the following error:

Status: "IpGroupsUpdateFailed"
Code: ""
Message: "Put on IP Groups <IPGroupName> Failed with 1 faulted referenced firewalls"
Activity Id: ""

---

API Response:

----[start]----
{
  "error": {
    "code": "IpGroupsUpdateFailed",
    "message": "Put on IP Groups <IPGroupName> Failed with 1 faulted referenced firewalls"
  },
  "status": "Failed"
}
-----[end]-----

  with azurerm_ip_group_cidr.main["10.33.94.5"],
  on main.tf line 19, in resource "azurerm_ip_group_cidr" "main":
  19: resource "azurerm_ip_group_cidr" "main" {

# On the next run it fails with the following error:

error running Apply: rpc error: code = Internal desc = exit status 1

Error: A resource with the ID "/subscriptions/<sub-redacted>/resourceGroups/<rg-redacted>/providers/Microsoft.Network/ipGroups/<IPGroupName>/cidrs/10.33.94.5" already exists - to be managed via Terraform this resource needs to be imported into the State. Please see the resource documentation for "azurerm_ip_group_cidr" for more information.

  with azurerm_ip_group_cidr.main["10.33.94.5"],
  on main.tf line 19, in resource "azurerm_ip_group_cidr" "main":
  19: resource "azurerm_ip_group_cidr" "main" {

Expected Behaviour

Terraform should be able to recover from the error on the next run.

Actual Behaviour

Even though the API reported that the PUT failed, the IP address was still added to the IP group. On the next run, the provider tries to add an IP address that already exists and fails.
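The "already exists" error asks for the resource to be imported into the state. A minimal sketch of that recovery follows. It assumes Terraform 1.5 or later (the version that introduced `import` blocks) and assumes the resource ID printed in the error message is the correct import ID. The redacted placeholders have to be replaced with the real values.

```hcl
# Hypothetical recovery step, not part of the original configuration:
# adopt the CIDR that Azure created even though the PUT was reported as failed.
import {
  to = azurerm_ip_group_cidr.main["10.33.94.5"]

  # ID copied from the "already exists" error; placeholders are left redacted.
  id = "/subscriptions/<sub-redacted>/resourceGroups/<rg-redacted>/providers/Microsoft.Network/ipGroups/<IPGroupName>/cidrs/10.33.94.5"
}
```

Once an apply has succeeded, the block can be removed. This only works around the symptom; it does not prevent the failed PUT from happening again.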

I assume it fails due to a known limitation in the firewall:

When you update two or more IP Groups attached to the same firewall, one of the resources goes into a failed state.

This is a known issue/limitation.

When you update an IP Group, it triggers an update on all firewalls that the IPGroup is attached to. If an update to a second IP Group is started while the firewall is still in the Updating state, then the IPGroup update fails.

To avoid the failure, IP Groups attached to the same firewall must be updated one at a time. Allow enough time between updates to allow the firewall to get out of the Updating state.

However, I'm not able to synchronize updates across pipelines and subscriptions.

Steps to Reproduce

The IP group is used by an Azure Firewall that is managed in another pipeline, which makes the bug difficult to reproduce.

Important Factoids

No response

References

No response

neil-yechenwei commented 1 week ago

Thanks for raising this issue. Terraform provisions the azurerm_ip_group_cidr resources in parallel. When one service request fails, another one should still succeed, so the CIDR ends up added on both the service side and the Terraform side. This is by design in Terraform. In your case, I assume you have to remove or change the duplicate CIDR after the failure.