prefapp / tfm

Reusable Terraform modules

[BUG]: Terraform sometimes doesn't automatically upgrade a VMSS instance when updating a custom extension script #119

Closed juanjosevazquezgil closed 2 weeks ago

juanjosevazquezgil commented 1 month ago

Motivation

We recently found a bug affecting one of our clients. They have the following setup:

The error happened when following these steps:

  1. Create a PR that updates the custom extension script of a VMSS so that it tries to download a non-existent image
  2. Commit that change and apply it with Terraform so it is deployed to Azure. The VMSS should fail to start
  3. Create another PR that fixes the previous error
  4. Commit that change and apply it with Terraform so it is deployed to Azure. The VMSS still fails to start, with the same error as in step 2
  5. To fix this, you must manually upgrade each VM in the VMSS

It seems that Terraform/Azure only upgrades the VMs when their custom extension script has no errors (at least when that script is used to start the VM). We need to confirm that this is the case, investigate why it happens, and find out how to fix it

Acceptance criteria

juanjosevazquezgil commented 1 month ago

Confirmed: the steps described in this issue can be used to reproduce the error

alambike commented 2 weeks ago

Solved by changing the upgrade mode from Manual (the default) to Automatic:

https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/linux_virtual_machine_scale_set#upgrade_mode
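For reference, a minimal sketch of what that change might look like on an `azurerm_linux_virtual_machine_scale_set` resource. The resource name, variables, VM size, image, and the script command are all hypothetical placeholders, not the client's actual configuration; only the `upgrade_mode` line reflects the fix described above.

```hcl
# Hypothetical inputs; replace with values from your own configuration.
variable "resource_group_name" { type = string }
variable "location"            { type = string }
variable "subnet_id"           { type = string }
variable "ssh_public_key"      { type = string }

resource "azurerm_linux_virtual_machine_scale_set" "example" {
  name                = "example-vmss"
  resource_group_name = var.resource_group_name
  location            = var.location
  sku                 = "Standard_B2s"
  instances           = 2
  admin_username      = "adminuser"

  # The fix from this issue: the provider defaults to "Manual", in which case
  # changes to the scale set model (including the extension below) are not
  # applied to existing instances until each one is upgraded by hand.
  upgrade_mode = "Automatic"

  admin_ssh_key {
    username   = "adminuser"
    public_key = var.ssh_public_key
  }

  source_image_reference {
    publisher = "Canonical"
    offer     = "0001-com-ubuntu-server-jammy"
    sku       = "22_04-lts"
    version   = "latest"
  }

  os_disk {
    storage_account_type = "Standard_LRS"
    caching              = "ReadWrite"
  }

  network_interface {
    name    = "example-nic"
    primary = true

    ip_configuration {
      name      = "internal"
      primary   = true
      subnet_id = var.subnet_id
    }
  }

  # Custom script extension; the command is an illustrative assumption.
  extension {
    name                 = "startup-script"
    publisher            = "Microsoft.Azure.Extensions"
    type                 = "CustomScript"
    type_handler_version = "2.1"

    settings = jsonencode({
      commandToExecute = "echo 'pull and start the application image here'"
    })
  }
}
```

Note that with `Automatic`, Azure applies model changes to the instances without guaranteeing any ordering, so several VMs may be updated at the same time. Whether that is acceptable depends on the workload.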
