hashicorp / terraform-provider-azurerm

Terraform provider for Azure Resource Manager
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs
Mozilla Public License 2.0

Terraform doesn't dissociate NIC from VM before deletion and deletion fails #8105

Closed brandonh-msft closed 4 years ago

brandonh-msft commented 4 years ago

Terraform (and AzureRM Provider) Version

tf 0.13.0 azurerm 2.22.0

Affected Resource(s)

Expected Behavior

Similar to #2566, Terraform should dissociate the NIC from the VM before deleting the NIC.

Actual Behavior

It simply tries to delete the NI from the VM and errors out:

Error: Error deleting Network Interface "k6-nic-7" (Resource Group "sam-load-testing-sender"): network.InterfacesClient#Delete: Failure sending request: StatusCode=400 -- Original Error: Code="NicInUse" Message="Network Interface /subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/networkInterfaces/nic-7 is used by existing resource /subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Compute/virtualMachines/vm1. In order to delete the network interface, it must be dissociated from the resource. To learn more, see aka.ms/deletenic." Details=[]

Steps to Reproduce

  1. terraform apply --var number_of_vms=2 --var nics_per_vm=1
  2. terraform apply --var number_of_vms=1 --var nics_per_vm=1

Terraform configuration

variable "vm-size" {
  type        = string
  description = "Preferred VM Size"
  default     = "Standard_E8_v3"
}

variable "number_of_vms" {
  type        = number
  description = "Number of VMs to create"
}

variable "nics_per_vm" {
  type        = number
  description = "Number of NICs to attach to each created VM"
}

resource "azurerm_resource_group" "rg" {
  name     = "myrg"
  location = "westus"
}

resource "azurerm_virtual_network" "vm_network" {
  name                = "my_network"
  address_space       = ["10.0.0.0/16"]
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
}

resource "azurerm_subnet" "vm_subnet" {
  name                 = "internal"
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.vm_network.name
  address_prefixes     = ["10.0.2.0/24"]
}

resource "azurerm_public_ip" "pip" {
  count               = (var.number_of_vms * var.nics_per_vm)
  name                = "pip-${count.index}"
  resource_group_name = azurerm_resource_group.rg.name
  location            = azurerm_resource_group.rg.location
  allocation_method   = "Dynamic"
}

resource "azurerm_network_interface" "sender_ni" {
  count               = (var.number_of_vms * var.nics_per_vm)
  name                = "my-nic-${count.index}"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name

  enable_accelerated_networking = true

  ip_configuration {
    name                          = "internal"
    subnet_id                     = azurerm_subnet.vm_subnet.id
    private_ip_address_allocation = "Dynamic"
    public_ip_address_id          = azurerm_public_ip.pip[count.index].id
  }
}

resource "azurerm_linux_virtual_machine" "vm" {
  count = var.number_of_vms

  name                = "myvm.${count.index}"
  resource_group_name = azurerm_resource_group.rg.name
  location            = azurerm_resource_group.rg.location
  size                = var.vm-size
  admin_username      = "adminuser"

  network_interface_ids = slice(azurerm_network_interface.ni[*].id, var.nics_per_vm * count.index, (var.nics_per_vm * count.index) + var.nics_per_vm)

  admin_ssh_key {
    username   = "systemadmin"
    public_key = data.azurerm_key_vault_secret.ssh_public_key.value
  }

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }

  source_image_reference {
    publisher = "Canonical"
    offer     = "UbuntuServer"
    sku       = "18.04-LTS"
    version   = "latest"
  }
}
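
For reference, the slice expression in network_interface_ids hands each VM a contiguous window of the NIC list. The following is a minimal sketch of that arithmetic only, assuming placeholder strings in place of real NIC IDs; the names nic_ids and nics_by_vm are illustrative and do not appear in the configuration above.

```hcl
variable "number_of_vms" {
  type    = number
  default = 2
}

variable "nics_per_vm" {
  type    = number
  default = 2
}

locals {
  # Placeholder for azurerm_network_interface.<name>[*].id (assumption:
  # the list is ordered by count.index).
  nic_ids = [for i in range(var.number_of_vms * var.nics_per_vm) : "nic-${i}"]

  # Same start/end arithmetic as network_interface_ids, with the loop
  # variable vm standing in for count.index on the VM resource.
  nics_by_vm = [
    for vm in range(var.number_of_vms) :
    slice(local.nic_ids, var.nics_per_vm * vm, (var.nics_per_vm * vm) + var.nics_per_vm)
  ]
}

output "nics_by_vm" {
  value = local.nics_by_vm
}
```

With the defaults shown, VM 0 gets nic-0 and nic-1 and VM 1 gets nic-2 and nic-3, so lowering number_of_vms by one should remove the last VM together with the NICs at the tail of the list.
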

tombuildsstuff commented 4 years ago

@brandonh-msft so that we can confirm this as a bug, can you provide a Terraform Configuration for this?

brandonh-msft commented 4 years ago

done.

magodo commented 4 years ago

Hi @brandonh-msft

I have tried to reproduce the issue with a config based on the one you provided above, using provider v2.22.0. There are no issues on deletion. (I'm using the default values for the input variables.)

Would you mind providing more context (e.g. the variables used) and the debug output to help us investigate this further?

brandonh-msft commented 4 years ago

Did you try setting number_of_vms to 2 and then running apply again with it set to 1 (so it has to delete one of the VMs)? I'll add this to my repro steps.

magodo commented 4 years ago

@brandonh-msft I just tried that and got no error in my case.

tombuildsstuff commented 4 years ago

hi @brandonh-msft

Taking a look at the error being returned here:

Network Interface /subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/networkInterfaces/nic-7 is used by existing resource /subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Compute/virtualMachines/vm1

The issue is that when multiple VMs and NICs are being provisioned, the ordering of the VMs and NICs doesn't match. In this instance NIC 7 is attached to VM 1, so trying to scale this down to 6 will fail (as you've mentioned) with an error that the NIC is in use (since VM 7 will be deleted, but NIC 7 cannot be).

This issue will be coming from the Terraform Configuration, where the resulting slice ultimately isn't ordered correctly:

network_interface_ids = slice(azurerm_network_interface.ni[*].id, var.nics_per_vm * count.index, (var.nics_per_vm * count.index) + var.nics_per_vm)

I believe updating that to:

network_interface_ids = slice(azurerm_network_interface.ni.*.id, var.nics_per_vm * count.index, (var.nics_per_vm * count.index) + var.nics_per_vm)

will make that ordering consistent here. One other possible way to work around this would be to add tags to the NICs and then use a Data Source to look those up based on the tags, e.g.

locals {
  number_of_nics = var.number_of_machines * var.number_of_nics_per_vm
}

resource "azurerm_network_interface" "test" {
  count = local.number_of_nics
  // ...

  tags = {
    // NIC i is tagged with VM floor(i / NICs per VM)
    VMName = "VM${floor(count.index / var.number_of_nics_per_vm)}"
  }
}

# note: this doesn't exist today, but we should add it
data "azurerm_network_interfaces" "test" {
  count = var.number_of_machines
  filter {
    tag {
      VMName = "VM${count.index}"
    }
  }
  depends_on = [azurerm_network_interface.test]
}

resource "azurerm_linux_virtual_machine" "test" {
  count = var.number_of_machines
  name  = "VM${count.index}"
  // attribute shape is assumed, since the data source above is hypothetical
  network_interface_ids = data.azurerm_network_interfaces.test[count.index].network_interfaces.*.id
  // ...
}
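
The tagging idea can be checked without the not-yet-existing data source. This is a minimal sketch, assuming the intent is that NIC i belongs to VM floor(i / NICs per VM); the nic_owner map and the "nic-N" keys are illustrative names only, and nothing here touches Azure.

```hcl
variable "number_of_machines" {
  type    = number
  default = 2
}

variable "number_of_nics_per_vm" {
  type    = number
  default = 2
}

locals {
  number_of_nics = var.number_of_machines * var.number_of_nics_per_vm

  # Map each NIC (by index) to the VM name it would be tagged with.
  # floor() is needed because "/" yields a fractional number in Terraform.
  nic_owner = {
    for i in range(local.number_of_nics) :
    "nic-${i}" => "VM${floor(i / var.number_of_nics_per_vm)}"
  }
}

output "nic_owner" {
  value = local.nic_owner
}
```

With the defaults shown, nic-0 and nic-1 map to VM0 and nic-2 and nic-3 map to VM1, so the ownership of a NIC depends only on its own index rather than on how a later list happens to be ordered.
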

Ultimately this is an issue/question regarding Terraform configuration rather than an issue specific to the Azure Provider. As such I'm going to close this issue for the moment, but would you mind re-opening it on the Terraform Core repository, where someone from the Terraform Core Team should be able to assist further?

Thanks!

brandonh-msft commented 4 years ago

Thanks, I'll give the change from [*] to .*. a shot and see if it helps.

TechArtistG commented 4 years ago

I get a similar error with Public IPs and Gateways
