hashicorp / terraform-provider-azurerm

Terraform provider for Azure Resource Manager
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs
Mozilla Public License 2.0

azurerm_site_recovery_replicated_vm tries to recreate every time due to OS disk data source refresh #10683

Closed qubeio closed 1 year ago

qubeio commented 3 years ago

Terraform (and AzureRM Provider) Version

Terraform v0.14.5, AzureRM provider 2.48

Affected Resource(s)

Terraform Configuration Files

locals {
  client_env_code      = "E"
  display_name         = format("%s%s%s%s", var.regional_unit_code, local.client_env_code, "WIIS", var.instance_num)
  subnet_id            = var.subnet_common_id
  dr_subnet_name       = var.dr_subnet_common_name
  resource_group_name  = var.rg_common_name
  dr_resource_group_id = var.dr_rg_common_id
  private_ip           = var.private_ip
  asg_list             = var.common_asg_id_list
  disk_type            = "StandardSSD_LRS"
}
data "azurerm_managed_disk" "boot_disk" {
  name                = azurerm_windows_virtual_machine.cmp.os_disk[0].name
  resource_group_name = local.resource_group_name
}
resource "azurerm_windows_virtual_machine" "cmp" {
  name                = local.display_name
  resource_group_name = local.resource_group_name
  location            = var.regional_unit_location
  size                = var.instance_size

  # Security
  admin_username = "ansible"
  admin_password = var.ansible_win_pass

  network_interface_ids = [
    azurerm_network_interface.cmp_vnic.id,
  ]

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = local.disk_type
  }

  source_image_reference {
    publisher = "MicrosoftWindowsServer"
    offer     = "WindowsServer"
    sku       = "2019-Datacenter"
    version   = "latest"
  }

  tags = {
    # Provider Tags
    environment = var.operations_env
    ru          = var.regional_unit_location
    ru_code     = var.regional_unit_code
    region      = var.region
    provider    = "azure"

    # OS Tags
    os         = "windows"
    os_distro  = "server"
    os_version = "2019"

    # Terraform Tags
    terraform_managed = "true"
    instance          = local.display_name

    #Ansible Tags
    ansible_managed = "true"
    role            = "iis_engineering"
    client_env      = var.client_env
  }
}

################################################################################
# The Custom Script extension runs the publicly available Ansible script
# that configures a Windows VM to accept secure WinRM over
# port 5986. This is required for Ansible.
#
################################################################################

resource "azurerm_virtual_machine_extension" "secure_winrm" {
  name                       = format("%s%s", local.display_name, "_cse_winrm")
  virtual_machine_id         = azurerm_windows_virtual_machine.cmp.id
  publisher                  = "Microsoft.Compute"
  type                       = "CustomScriptExtension"
  type_handler_version       = "1.9"
  auto_upgrade_minor_version = true

  protected_settings = <<PROTECTED_SETTINGS
    {
      "commandToExecute": "powershell -ExecutionPolicy Unrestricted -File ConfigureRemotingForAnsible.ps1"
    }
  PROTECTED_SETTINGS

  settings = <<SETTINGS
    {
        "fileUris": [
          "https://raw.githubusercontent.com/ansible/ansible/devel/examples/scripts/ConfigureRemotingForAnsible.ps1"
        ]
    }
  SETTINGS
}

################################################################################
# NETWORKING
#
################################################################################

resource "azurerm_network_interface" "cmp_vnic" {
  name                = format("%s%s", local.display_name, "_vnic")
  location            = var.regional_unit_location
  resource_group_name = local.resource_group_name

  ip_configuration {
    name                          = "internal"
    subnet_id                     = local.subnet_id
    private_ip_address_allocation = local.private_ip == null ? "Dynamic" : "Static"
    private_ip_address            = local.private_ip
  }
}

resource "azurerm_network_interface_application_security_group_association" "asg_associations" {
  for_each = local.asg_list

  network_interface_id          = azurerm_network_interface.cmp_vnic.id
  application_security_group_id = each.value
}

################################################################################
# SITE RECOVERY SERVICES
# Replication
################################################################################

resource "azurerm_site_recovery_replicated_vm" "cmp_replication" {
  name                                      = local.display_name
  resource_group_name                       = var.site_recovery_resource_list.recovery_resources_resource_group
  recovery_vault_name                       = var.site_recovery_resource_list.recovery_vault_name
  source_recovery_fabric_name               = var.site_recovery_resource_list.primary_fabric_name
  source_vm_id                              = azurerm_windows_virtual_machine.cmp.id
  recovery_replication_policy_id            = var.site_recovery_resource_list.recovery_policy.id
  source_recovery_protection_container_name = var.site_recovery_resource_list.fabric_container_primary.name

  target_resource_group_id                = local.dr_resource_group_id
  target_recovery_fabric_id               = var.global_site_recovery_resource_list.site_recovery_fabrics_recovery[format("%s%s", lower(var.regional_unit_code), "-com-asr-fabric-recovery")].id
  target_recovery_protection_container_id = var.site_recovery_resource_list.fabric_container_recovery.id

  managed_disk {
    disk_id                    = lower(data.azurerm_managed_disk.boot_disk.id)
    staging_storage_account_id = var.site_recovery_resource_list.cache_storage_primary.id
    target_resource_group_id   = local.dr_resource_group_id
    target_disk_type           = local.disk_type
    target_replica_disk_type   = local.disk_type
  }

  network_interface {
    source_network_interface_id = azurerm_network_interface.cmp_vnic.id
    target_subnet_name          = local.dr_subnet_name
    target_static_ip            = var.private_ip
  }

  managed_disk {
    disk_id                    = lower(azurerm_managed_disk.cmp_disk1.id)
    staging_storage_account_id = var.site_recovery_resource_list.cache_storage_primary.id
    target_resource_group_id   = local.dr_resource_group_id
    # Confirm this is the same as the disk type defined in the other files
    target_disk_type         = "StandardSSD_LRS"
    target_replica_disk_type = "StandardSSD_LRS"
  }

  timeouts {
    create = "360m"
    delete = "360m"
  }
}

Debug Output

https://gist.github.com/AndreasFrangopoulos/aabb3e9fbeb83d81b2c094712be46702

Panic Output

Expected Behaviour

TF should see that the refreshed data has not changed and should not recreate the replicated VM.

Actual Behaviour

From the debug output we can see that the data source is being refreshed (I don't understand why, but that is not important here). The data itself won't change, because the disk hasn't. I suspect that the mere fact that a refresh is happening is what causes the replacement. This could be avoided if we could reference the disk ID directly from the VM resource rather than having to proxy through a data source, but that is another story.

Here is a copy of the plan showing that the new managed disk data is identical to what was received from the Azure API in the debug output: https://gist.github.com/AndreasFrangopoulos/c44577da3ce6c23bd820fd8fcee6e9ad

Steps to Reproduce

  1. terraform apply

Important Factoids

References

#8416

I had to use the solution in this ticket to force lower case disk names as I was having the same problem.

mpjtaylor commented 2 years ago

I have implemented lower() for the OS disk, but I cannot get data disks to work:

Could not update the properties of the virtual machine 'xx' because of following invalid disk Ids

But when I check the state, the ID matches the case shown in the error and matches the ID of the disk in Azure.

mpjtaylor commented 2 years ago

I cannot get this to work with lower(); the plan always wants to replace the DR resource because of disk_id.

myc2h6o commented 2 years ago

This seems to be hitting the dependency behavior of Terraform data sources (https://www.terraform.io/language/data-sources#data-resource-dependencies): if a data source directly depends on another managed resource, it will be recalculated during the apply phase, because Terraform doesn't know whether the upstream resource will change in the same apply. The workaround would be to use a local value to create an independent dependency, e.g.: https://github.com/hashicorp/terraform-provider-azurerm/issues/13152#issuecomment-1050648704

And disk_id is case-insensitive, so I guess lower() is not needed: https://github.com/hashicorp/terraform-provider-azurerm/blob/19096cfe686c968c3784d67657debed8ee54a93c/internal/services/recoveryservices/site_recovery_replicated_vm_resource.go#L145
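
For readers following along, here is a minimal sketch of one way the local-value workaround could look when applied to the configuration in this issue. It is one possible reading of the suggestion, not necessarily what the linked comment shows. The idea is to give the OS disk an explicit name and assemble its ID from values that are known at plan time, so that disk_id no longer passes through a data source that depends on the VM. The names local.os_disk_name, local.os_disk_id and data.azurerm_subscription.current are hypothetical, and the "_osdisk" suffix is an assumed naming convention.

```hcl
# Sketch only. Hypothetical names, not present in the original configuration:
# data.azurerm_subscription.current, local.os_disk_name, local.os_disk_id.
data "azurerm_subscription" "current" {}

locals {
  # Assumed naming convention for the OS disk.
  os_disk_name = format("%s%s", local.display_name, "_osdisk")

  # Disk ID assembled from inputs that do not depend on
  # azurerm_windows_virtual_machine.cmp, so it can be resolved during plan.
  os_disk_id = format(
    "%s/resourceGroups/%s/providers/Microsoft.Compute/disks/%s",
    data.azurerm_subscription.current.id,
    local.resource_group_name,
    local.os_disk_name,
  )
}

resource "azurerm_windows_virtual_machine" "cmp" {
  # ... other arguments as in the original configuration ...

  os_disk {
    name                 = local.os_disk_name
    caching              = "ReadWrite"
    storage_account_type = local.disk_type
  }
}

resource "azurerm_site_recovery_replicated_vm" "cmp_replication" {
  # ... other arguments as in the original configuration ...

  # This reference still orders the replication after the VM.
  source_vm_id = azurerm_windows_virtual_machine.cmp.id

  managed_disk {
    disk_id                    = local.os_disk_id
    staging_storage_account_id = var.site_recovery_resource_list.cache_storage_primary.id
    target_resource_group_id   = local.dr_resource_group_id
    target_disk_type           = local.disk_type
    target_replica_disk_type   = local.disk_type
  }
}
```

Two caveats. Changing os_disk.name on an existing VM forces the VM to be recreated, so on an existing deployment the local would have to be set to the disk's current name. It is also an assumption that the hand-built ID matches what the provider stores, including the casing of the resource group segment, so check the resulting plan before applying.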

mpjtaylor commented 2 years ago

Thanks for your reply, I have tried to do this but to no avail I cannot get it to work!

rcskosir commented 1 year ago

Thanks for opening this issue. This was a problem in the 2.x version of the provider which is no longer actively maintained. If this is still an issue with the 3.x version of the provider please do let us know by opening a new issue, thanks!

github-actions[bot] commented 4 months ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.