Terraform DSC does not produce Idempotent results - errors on re-run #682

Closed hashibot closed 4 years ago

hashibot commented 6 years ago

This issue was originally opened by @sada-kubsad as hashicorp/terraform#17080. It was migrated here as a result of the provider split. The original body of the issue is below.

Re-running a Terraform script that includes the Azure DSC extension fails, regardless of what changed in the DSC configuration.

From looking at the DSC log files on the Azure VM, it seems that DSC is not picking up the new config changes.

It works when we deploy everything from scratch but fails when making incremental changes to DSC configs and relaunching from Terraform.

Please see below for details:

Terraform Version


Terraform Configuration Files

resource "azurerm_virtual_machine_extension"  "example_DSC" {
    name = "${var.customerAcronym}-${var.environment}-${var.machineAcronyms["example"]}-dsc"
    location = "${var.location}"
    resource_group_name = "${azurerm_resource_group.rg.name}"
    publisher ="Microsoft.Powershell"
    type ="DSC"
    type_handler_version = "${var.dsc_extension}"
    auto_upgrade_minor_version = true
    depends_on = ["azurerm_virtual_machine.example"]
    settings = <<SETTINGS
    {
        "configuration": {
            "url": "${var.resourceStore["fileShareUrl"]}${var.resourceStore["dscArchiveName"]}${var.azureCredentials["storageKey"]}",
            "function": "example",
            "script": "example.ps1"
        },
        "configurationArguments": {
        }
    }
SETTINGS

    protected_settings = <<PROTECTED_SETTINGS
    {
        "configurationArguments": {
        }
    }
PROTECTED_SETTINGS
}

Debug Output

azurerm_virtual_machine_extension.ArcGIS_Server_DSC: Still creating... (1m0s elapsed)

Error: Error applying plan:

1 error(s) occurred:

* azurerm_virtual_machine_extension.ArcGIS_Server_DSC: 1 error(s) occurred:

* azurerm_virtual_machine_extension.ArcGIS_Server_DSC: compute.VirtualMachineExtensionsClient#CreateOrUpdate: Failure sending request: StatusCode=200 -- Original Error: Long running operation terminated with status 'Failed': Code="VMExtensionProvisioningError" Message="VM has reported a failure when processing extension 'tst-dev-ags-dsc'. Error message: \"The DSC Extension failed to install: An error occurred while executing script or module 'webgis.ps1':  At C:\\Packages\\Plugins\\Microsoft.Powershell.DSC\\\\DSCWork\\DSC.0\\webgis.ps1:77 char:2\r\n+     Import-DscResource -Name ArcGIS_Service_Account\r\n+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\nUnable to load resource 'ArcGIS_Service_Account': Resource not found.\r\n\r\nAt C:\\Packages\\Plugins\\Microsoft.Powershell.DSC\\\\DSCWork\\DSC.0\\webgis.ps1:78 char:5\r\n+     Import-DscResource -Name ArcGIS_License\r\n+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\nUnable to load resource 'ArcGIS_License': Resource not found.\r\n\r\nAt C:\\Packages\\Plugins\\Microsoft.Powershell.DSC\\\\DSCWork\\DSC.0\\webgis.ps1:79 char:5\r\n+     Import-DscResource -Name MSFT_xFirewall\r\n+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\nUnable to load resource 'MSFT_xFirewall': Resource not found.\r\n\r\nAt C:\\Packages\\Plugins\\Microsoft.Powershell.DSC\\\\DSCWork\\DSC.0\\webgis.ps1:80 char:5\r\n+     Import-DscResource -Name MSFT_xDisk\r\n+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\nUnable to load resource 'MSFT_xDisk': Resource not found.\r\n\r\nAt C:\\Packages\\Plugins\\Microsoft.Powershell.DSC\\\\DSCWork\\DSC.0\\webgis.ps1:81 char:2\r\n+ Import-DscResource -Name ArcGIS_Portal\r\n+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\nUnable to load resource 'ArcGIS_Portal': Resource not found.\r\n\r\nAt C:\\Packages\\Plugins\\Microsoft.Powershell.DSC\\\\DSCWork\\DSC.0\\webgis.ps1:82 char:2\r\n+     Import-DscResource -Name ArcGIS_DataStore\r\n+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\nUnable to load resource 'ArcGIS_DataStore': 
Resource not found.\r\n\r\nAt C:\\Packages\\Plugins\\Microsoft.Powershell.DSC\\\\DSCWork\\DSC.0\\webgis.ps1:83 char:2\r\n+     Import-DscResource -Name ArcGIS_GeoEvent\r\n+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\nUnable to load resource 'ArcGIS_GeoEvent': Resource not found.\r\n\r\nAt C:\\Packages\\Plugins\\Microsoft.Powershell.DSC\\\\DSCWork\\DSC.0\\webgis.ps1:84 char:2\r\n+     Import-DscResource -Name ArcGIS_Server\r\n+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\nUnable to load resource 'ArcGIS_Server': Resource not found.\r\n\r\nAt C:\\Packages\\Plugins\\Microsoft.Powershell.DSC\\\\DSCWork\\DSC.0\\webgis.ps1:85 char:2\r\n+     Import-DscResource -Name ArcGIS_Federation\r\n+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\nUnable to load resource 'ArcGIS_Federation': Resource not found.\r\n\r\nAt C:\\Packages\\Plugins\\Microsoft.Powershell.DSC\\\\DSCWork\\DSC.0\\webgis.ps1:86 char:2\r\n+     Import-DscResource -Name ArcGIS_Server_TLS\r\n+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\nUnable to load resource 'ArcGIS_Server_TLS': Resource not found.\r\n\r\nNot all parse errors were reported.  Correct the reported errors and try again..\nMore information about the failure can be found in the logs located under 'C:\\WindowsAzure\\Logs\\Plugins\\Microsoft.Powershell.DSC\\' on the VM.\nTo retry install, please remove the extension from the VM first. \"."

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.

Crash Output

Expected Behavior

Re-running the same Terraform script should produce idempotent results with DSC.

Actual Behavior

Steps to Reproduce

Additional Context


sada-kubsad commented 6 years ago

From further testing, we get the feeling this is related to Terraform and the way that TF works (state file). The issue goes away when launching the same DSC via ARM templates.

If we successfully deploy a template that includes a DSC extension and then add to or modify the DSC config in the .zip, TF has no connection to the DSC config, so 'terraform plan' sees nothing new. The only way to run the template again so that the DSC config is picked up is to 'taint' the DSC extension resource and then run 'terraform plan' again.

This is in contrast to what ARM does, which is simply to push the config again even if the DSC extension is already detected.
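One way to give Terraform a connection to the archive contents is to derive the archive name from a hash of the .zip, so that the `url` in `settings` changes whenever the DSC config changes. The following is only a minimal sketch under stated assumptions: a copy of the archive exists on the machine running Terraform, and it is uploaded to the file share under the derived name. The variable `dsc_archive_path` and both local names are hypothetical and not part of the original config; `filemd5` is a built-in function in Terraform 0.12 and later.

```hcl
# Hypothetical variable: path to a local copy of the DSC archive.
variable "dsc_archive_path" {
  type    = string
  default = "./dsc/example.zip"
}

locals {
  # filemd5() is a built-in Terraform function; the hash changes
  # whenever the contents of the archive change.
  dsc_archive_hash = filemd5(var.dsc_archive_path)

  # Assumption: the archive is uploaded to the file share under this name.
  dsc_archive_name = "example-${local.dsc_archive_hash}.zip"
}

# Exposed only so the derived name can be inspected after a plan or apply.
output "dsc_archive_name" {
  value = local.dsc_archive_name
}
```

If the `url` in `settings` referenced `local.dsc_archive_name` instead of `var.resourceStore["dscArchiveName"]`, a changed archive would show up as a diff in 'terraform plan'. Whether the DSC extension then re-applies cleanly on the VM is not something this thread confirms.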

The issue we are running into could possibly be fixed by having TF recognize that the extension failed and automatically remove it from the VM. As of now, when it fails, the DSC extension remains configured on the VM in a failed state, which is visible in the Azure Portal.
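Until something like that exists, the manual 'taint' step can be approximated in configuration. The sketch below assumes Terraform 1.4 or later, which postdates this issue: `terraform_data` and the `replace_triggered_by` lifecycle argument are built-in there. All three variables and the extension name `example-dsc` are hypothetical placeholders, and the handler version is simply the one used elsewhere in this thread. When the hash of the local archive changes, Terraform plans a destroy and re-create of the extension, which removes it from the VM before reinstalling.

```hcl
provider "azurerm" {
  features {}
}

# Hypothetical inputs; none of these names come from the original config.
variable "dsc_archive_file" {
  type = string # path to a local copy of the DSC .zip
}

variable "dsc_archive_url" {
  type = string # URL the VM downloads the archive from
}

variable "virtual_machine_id" {
  type = string # ID of the VM that hosts the extension
}

# Tracks the archive contents; this resource changes whenever the hash changes.
resource "terraform_data" "dsc_archive" {
  input = filemd5(var.dsc_archive_file)
}

resource "azurerm_virtual_machine_extension" "example_DSC" {
  name                       = "example-dsc"
  virtual_machine_id         = var.virtual_machine_id
  publisher                  = "Microsoft.Powershell"
  type                       = "DSC"
  type_handler_version       = "2.26"
  auto_upgrade_minor_version = true

  settings = jsonencode({
    configuration = {
      url      = var.dsc_archive_url
      function = "example"
      script   = "example.ps1"
    }
  })

  lifecycle {
    # Replace (destroy, then create) the extension whenever the tracked
    # archive hash changes, which is what 'terraform taint' does by hand.
    replace_triggered_by = [terraform_data.dsc_archive]
  }
}
```

This only covers the case where the archive changed. An extension that failed without any change to the archive would still need to be tainted or removed manually.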

Thanks to @PleaseStopAsking for these findings

#683: could potentially be related

neil-yechenwei commented 4 years ago

Thanks for opening this issue. After testing, it seems I can update the VM extension successfully with the latest azurerm provider, so I can no longer reproduce the problem with the tfconfig below. Could you try the tfconfig below and check whether the issue still exists? Thanks.

provider "azurerm" {
  features {}
}

resource "azurerm_resource_group" "test" {
  name     = "acctestRG-vme-dsc-test01"
  location = "eastus2"
}

resource "azurerm_virtual_network" "test" {
  name                = "acctvn-test01"
  address_space       = [""]
  location            = azurerm_resource_group.test.location
  resource_group_name = azurerm_resource_group.test.name
}

resource "azurerm_subnet" "test" {
  name                 = "acctsub-test01"
  resource_group_name  = azurerm_resource_group.test.name
  virtual_network_name = azurerm_virtual_network.test.name
  address_prefix       = ""
}

resource "azurerm_network_interface" "test" {
  name                = "acctnic-test01"
  location            = azurerm_resource_group.test.location
  resource_group_name = azurerm_resource_group.test.name

  ip_configuration {
    name                          = "testconfiguration1"
    subnet_id                     = azurerm_subnet.test.id
    private_ip_address_allocation = "Dynamic"
  }
}

resource "azurerm_virtual_machine" "test" {
  name                  = "acctvmNeil"
  location              = azurerm_resource_group.test.location
  resource_group_name   = azurerm_resource_group.test.name
  network_interface_ids = [azurerm_network_interface.test.id]
  vm_size               = "Standard_F4"

  storage_image_reference {
    publisher = "MicrosoftWindowsServer"
    offer     = "WindowsServer"
    sku       = "2016-Datacenter"
    version   = "latest"
  }

  storage_os_disk {
    name              = "myosdisk1"
    caching           = "ReadWrite"
    create_option     = "FromImage"
    managed_disk_type = "Standard_LRS"
  }

  os_profile {
    computer_name  = "acctvmtest01"
    admin_username = "testadmin"
    admin_password = "Password1234!"
  }

  os_profile_windows_config {
    timezone           = "Pacific Standard Time"
    provision_vm_agent = true
  }
}

resource "azurerm_virtual_machine_extension"  "test" {
    name                       = "vmextension-dsc-test01"
    virtual_machine_id         = azurerm_virtual_machine.test.id
    publisher                  = "Microsoft.Powershell"
    type                       = "DSC"
    type_handler_version       = "2.26"
    auto_upgrade_minor_version = true

    settings = <<SETTINGS
    {
        "configuration": {
            "url": "https://github.com/Owen-Davies/Azure-VM-WMF-Speed-Test/raw/master/ExampleDSC.zip",
            "function": "ExampleDSC",
            "script": "ExampleDSC.ps1"
        }
    }
SETTINGS

    protected_settings = <<PROTECTED_SETTINGS
    {
        "configurationArguments": {
        }
    }
PROTECTED_SETTINGS

    tags = {
      env = "test"
    }

    depends_on = ["azurerm_virtual_machine.test"]
}
