hashicorp / terraform-provider-azurerm

Terraform provider for Azure Resource Manager
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs
Mozilla Public License 2.0

windows_virtual_machine_scale_set SKU(Size) change cause whole VMSS resource redeployment #18103

Closed SlavisaBakicOB closed 8 months ago

SlavisaBakicOB commented 2 years ago

Terraform Version

1.2.7

AzureRM Provider Version

3.19.1

Affected Resource(s)/Data Source(s)

azurerm_windows_virtual_machine_scale_set

Terraform Configuration Files

resource "azurerm_windows_virtual_machine_scale_set" "vmss_winsrv" {
  tags                     = merge(local.default_tags)
  resource_group_name      = var.resource_group_name
  name                     = local.vmss_name
  location                 = var.location
  provision_vm_agent       = true
  enable_automatic_updates = true
  admin_password           = var.admin_password
  admin_username           = var.admin_username

  computer_name_prefix = "vmsstest"

  sku = var.vmss_sku

  instances = var.instance_count

  upgrade_mode = "Manual"

  os_disk {
    caching                   = lookup(var.storage_os_disk_config, "caching", "ReadWrite")
    storage_account_type      = lookup(var.storage_os_disk_config, "storage_account_type", "Standard_LRS")
    disk_size_gb              = lookup(var.storage_os_disk_config, "disk_size_gb", 127)
    write_accelerator_enabled = false
  }

  source_image_reference {
    publisher = lookup(var.vm_image, "publisher", "MicrosoftWindowsServer")
    offer     = lookup(var.vm_image, "offer", "WindowsServer")
    sku       = lookup(var.vm_image, "sku", "2019-Datacenter")
    version   = lookup(var.vm_image, "version", "latest")
  }

  network_interface {
    name    = "vm_scale_set_net"
    primary = true

    ip_configuration {
      name                                   = "internal"
      primary                                = true
      subnet_id                              = var.vmss_subnet_id
      load_balancer_backend_address_pool_ids = [azurerm_lb_backend_address_pool.vmss_lb_bep.id]
    }
  }

  lifecycle {
    ignore_changes = [
      tags,
      resource_group_name,
      name,
      provision_vm_agent,
      enable_automatic_updates,
      admin_password,
      admin_username,
      computer_name_prefix,
      instances,
      upgrade_mode,
      os_disk,
      source_image_reference,
      network_interface,
    ]
  }
}

Debug Output/Panic Output

azurerm_windows_virtual_machine_scale_set.vmss_winsrv: Modifications complete after 5m0s [id=/subscriptions/*ommited*/resourceGroups/*ommited*/providers/Microsoft.Compute/virtualMachineScaleSets/*ommited*]

Expected Behaviour

Terraform should change only the size when sku is the only attribute that changed, without redeploying the whole VMSS resource. Manually changing the VMSS size via the Portal completes in less than 10 seconds, and the VMSS is not recreated.

Actual Behaviour

Terraform redeployed the whole VMSS resource, which can take up to 5 minutes, instead of just changing the VMSS sku, which should not take more than 20 seconds.

Steps to Reproduce

No response

Important Factoids

No response

References

No response

myc2h6o commented 2 years ago

Hi @SlavisaBakicOB, thanks for opening the issue! When upgrade_mode is set to Manual, changing the sku in the Azure Portal is quick because it doesn't update the sku of the existing VMs in the scale set. Terraform, however, is designed to run without manual operations outside Terraform, so even with the Manual upgrade mode we still update the VMs inside the scale set. https://github.com/hashicorp/terraform-provider-azurerm/blob/c7e7b2b1982274cb7c509e495934d2a60dab18a0/internal/services/compute/virtual_machine_scale_set_update.go#L59-L60
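
For readers who want behaviour closer to the Portal, the provider's features block has a roll_instances_when_required setting for scale sets. The snippet below is only a sketch: nobody in this thread confirmed that it covers the sku-change case described here, and the default and exact semantics should be checked against the provider documentation for the version in use.

```hcl
# Sketch only: assumes roll_instances_when_required governs the instance
# update described in the comment above. Not confirmed in this thread.
provider "azurerm" {
  features {
    virtual_machine_scale_set {
      # When false, the provider should not roll existing instances after
      # a change such as a new sku; they keep the old model until they are
      # upgraded some other way.
      roll_instances_when_required = false
    }
  }
}
```

With this set, existing instances would be expected to stay on the old size until upgraded manually, which is also what the Portal does in Manual upgrade mode.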

geglne commented 1 year ago

This is unfortunate. As SlavisaBakicOB said above, when changing the sku - not unlike when manually changing it in the portal - it would be nice to preserve the OS disk.

For me, the speed of the deployment isn't an issue. It is our customers who have changed the VM sku (for whatever reason they have), and are surprised when their OS disk has been reformatted.

I realize that a VM and a VMSS are separate resources altogether; however, the virtual_machine resource has a delete_os_disk_on_termination argument, and it would be nice to have a similar flag here.

myc2h6o commented 1 year ago

@geglne I've been adding a feature flag in #22975 that skips the reimage process during the upgrade when upgrade_mode is Manual. Setting it to false should preserve the OS disk after the update.
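
The comment above does not name the flag. As far as I can tell, it is exposed in the provider's features block as reimage_on_manual_upgrade; treat that name as an assumption and verify it against the features block documentation for your provider version. A minimal sketch:

```hcl
# Sketch only: the flag name reimage_on_manual_upgrade is an assumption,
# not stated in this thread. Check the provider docs before relying on it.
provider "azurerm" {
  features {
    virtual_machine_scale_set {
      # false = skip reimaging instances when upgrade_mode is "Manual",
      # so the OS disk should be preserved across a sku change.
      reimage_on_manual_upgrade = false
    }
  }
}
```

The features block is provider-wide, so the setting applies to every scale set managed through that provider configuration rather than to a single resource.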

github-actions[bot] commented 5 months ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.