Telmate / terraform-provider-proxmox

Terraform provider plugin for proxmox
MIT License

Provider tries to replace VMs even when no changes exist #655

Closed lenaxia closed 5 months ago

lenaxia commented 1 year ago

I was previously on 2.7.4 and didn't see this issue, but since moving to v2.9.11 I've seen it, and going back to 2.7.4 now shows the issue too.

This is the config for one of my VMs:

terraform {
  required_providers {
    proxmox = {
      source = "telmate/proxmox"
      version = "2.9.11"
    }
  }
}

provider "proxmox" {
  pm_api_url = "https://192.168.3.1:8006/api2/json"
  pm_api_token_id = "terraform@pam!token20220405"
  pm_api_token_secret = "secret"
  pm_tls_insecure = true

  pm_log_enable = true
  pm_log_file = "terraform-plugin-proxmox.log"
  pm_log_levels = {
    _default = "debug"
    _capturelog = ""
  }
}

##########
## Server 00
##########

resource "proxmox_vm_qemu" "k3-server-00" {
  count = var.k3_server_count
  name = "${format("k3-server-%02s", count.index + var.k3_server00_offset)}"
  target_node = var.k3_server00_host
  clone = var.template_name
  agent = 1
  os_type = "cloud-init"
  qemu_os = "Linux"
  cores = 3
  sockets = 1
  cpu = "host"
  memory = 12872
  scsihw = "virtio-scsi-pci"
  bootdisk = "scsi0"
  onboot = "true"
  disk {
    slot = 0
    size = var.k3_server_disksize
    type = "scsi"
    storage = "local-zfs"
    iothread = 1
  }
  network {
    model = "virtio"
    bridge = "vmbr0"
  }

  lifecycle {
    ignore_changes = [
      network
    ]
  }
  ipconfig0 = "${format("ip=192.168.2.%02s/22,gw=192.168.0.1", count.index + var.k3_server00_offset + var.k3_server_base_offset)}"
  sshkeys = <<EOF
  ${var.ssh_key_terraform}
  EOF
}

Right after a fresh creation, whether I just run terraform plan or run a terraform apply --refresh-only first, I immediately get this:

$ terraform plan -target=proxmox_vm_qemu.k3-server-00[0]
proxmox_vm_qemu.k3-server-00[0]: Refreshing state... [id=lafiel/qemu/102]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
-/+ destroy and then create replacement

Terraform will perform the following actions:

  # proxmox_vm_qemu.k3-server-00[0] is tainted, so must be replaced
-/+ resource "proxmox_vm_qemu" "k3-server-00" {
      + default_ipv4_address      = (known after apply)
      - disk_gb                   = 0 -> null
      ~ id                        = "lafiel/qemu/102" -> (known after apply)
        name                      = "k3-server-00"
      + nameserver                = (known after apply)
      ~ qemu_os                   = "other" -> "Linux"
      ~ reboot_required           = false -> (known after apply)
      + searchdomain              = (known after apply)
      + ssh_host                  = (known after apply)
      + ssh_port                  = (known after apply)
      ~ sshkeys                   = <<-EOT
              ssh-rsa SSHKEY ubuntu@terraform
          -
        EOT
      ~ unused_disk               = [] -> (known after apply)
      + vmid                      = (known after apply)
        # (31 unchanged attributes hidden)

      ~ disk {
          ~ file               = "vm-102-disk-0" -> (known after apply)
          ~ format             = "raw" -> (known after apply)
          + media              = (known after apply)
          ~ storage_type       = "zfspool" -> (known after apply)
          ~ volume             = "local-zfs:vm-102-disk-0" -> (known after apply)
            # (23 unchanged attributes hidden)
        }

      ~ network {
          ~ macaddr   = "6E:23:FF:C0:9E:BB" -> (known after apply)
          - mtu       = 0 -> null
          ~ queues    = 0 -> (known after apply)
          ~ rate      = 0 -> (known after apply)
            # (5 unchanged attributes hidden)
        }
    }

Plan: 1 to add, 0 to change, 1 to destroy.

The things that jump out at me are disk_gb and mtu: I don't set these properties in my main.tf, and yet they are forcing a replacement.

I've tried setting disk_gb = null in main.tf, but it doesn't change anything, and setting it to 0 conflicts with disk.size, so that's not possible. And I don't think permanently setting mtu = 0 is the right move.

I've also tried setting the lifecycle block to ignore these changes, but it has no effect either; the plan still wants to replace:

  lifecycle {
    ignore_changes = [
      network, disk_gb, network[0].mtu
    ]
  }

Running terraform apply --refresh-only multiple times reports "No changes. Your infrastructure still matches the configuration.", but running plan immediately tries to replace again.

Looking for help so I can make in-place changes to my VMs. It's not reasonable for me to recreate every time.
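
One detail worth noting from the plan output above: Terraform says the resource "is tainted, so must be replaced", and a tainted resource is replaced regardless of any ignore_changes rules, which would explain why the lifecycle block has no effect. If the taint itself is spurious, clearing it should allow a clean plan. A minimal sketch with the stock Terraform CLI (not specific to this provider):

$ terraform untaint 'proxmox_vm_qemu.k3-server-00[0]'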

sebdanielsson commented 1 year ago

Having the same issue; found this thread after some googling.

terraform {
  required_providers {
    proxmox = {
      source  = "telmate/proxmox"
      version = "2.9.11"
    }
  }
}

provider "proxmox" {
    pm_api_url = "https://mediaserver:8006/api2/json"
    pm_user = "terraform-prov@pve"
    pm_password = "MySuperSecretPassword123!"
}

# Media server VM
resource "proxmox_vm_qemu" "mediaserver-tf" {
    # The name of the VM
    name = "hogsmeade-tf"

    # Node to deploy the VM on
    target_node = "hogsmeade"

    # Template name to clone this VM from
    clone = "fedora-template"
    full_clone = true

    # VM boot policy
    oncreate = true
    onboot = true
}

github-actions[bot] commented 1 year ago

This issue is stale because it has been open for 60 days with no activity. Please update the provider to the latest version and, if the issue persists, provide full configuration and debug logs

sebdanielsson commented 1 year ago

Still relevant.

chrisbenincasa commented 1 year ago

I'm running into this as well. Not sure if it's specific to cloning, but I've been experiencing it when cloning VMs.

Very minimal example:

resource "proxmox_vm_qemu" "docker1" {
  provider    = proxmox.hera
  name        = "docker1"
  target_node = "hera"
  clone       = "debian-11.7-2023-05-13-15-00-28"

  memory = 512
  agent = 1

  cicustom = "user=local:snippets/user_data_docker_vm.yml"

  lifecycle {
    create_before_destroy = true
  }
}

And the diff (when I've changed nothing):

  # proxmox_vm_qemu.docker1 will be updated in-place
  ~ resource "proxmox_vm_qemu" "docker1" {
      - agent                     = 1 -> null
      - desc                      = "Debian 11.7 base template. Generated at 2023-05-13T15:00:28Z" -> null
        id                        = "hera/qemu/101"
        name                      = "docker1"
      - qemu_os                   = "l26" -> null
        # (29 unchanged attributes hidden)

      - network {
          - bridge    = "vmbr0" -> null
          - firewall  = false -> null
          - link_down = false -> null
          - macaddr   = "AE:08:FC:F6:9E:52" -> null
          - model     = "virtio" -> null
          - mtu       = 0 -> null
          - queues    = 0 -> null
          - rate      = 0 -> null
          - tag       = -1 -> null
        }
    }
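
A plausible reading of this diff: the cloned template comes with a NIC that the minimal config above never declares, so Terraform plans to remove it (every network attribute goes to null). Declaring the device explicitly, with values taken from the diff, might stabilize the plan. A sketch under that assumption:

resource "proxmox_vm_qemu" "docker1" {
  # ... same arguments as in the config above ...

  # Mirror the NIC the clone already carries (values from the diff)
  # so the provider no longer plans to strip it.
  network {
    model  = "virtio"
    bridge = "vmbr0"
  }
}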

t0xa commented 1 year ago

I'm also experiencing the same issue.

To reproduce:

  1. Create a VM template with cloud-init functionality enabled
  2. Create a full_clone VM using the proxmox Terraform provider and cicustom
resource "proxmox_vm_qemu" "cloudinit-test" {
  name        = "vm4"
  desc        = "Test machine"
  target_node = "pm6"

  full_clone = true
  clone      = "ubuntu-jammy-template"
  onboot     = true
  agent = 1

  pool  = "vms"

  os_type = "cloud-init"
  cores   = 4
  sockets = 1
  cpu     = "host"
  memory  = 2048
  scsihw  = "virtio-scsi-pci"

  disk {
    type    = "scsi"
    storage = "local-lvm"
    size    = "32G"
  }

  network {
    model  = "virtio"
    bridge = "vmbr0"
    tag    = 100
  }

  cicustom = "user=local:snippets/user_data.yml,network=local:snippets/net100.yml"

}
  3. The VM is created correctly
  4. Run terraform apply again without changing anything and it will show the following diff:
 # proxmox_vm_qemu.cloudinit-test-4 will be updated in-place
  ~ resource "proxmox_vm_qemu" "cloudinit-test" {
      - ciuser                    = "ubuntu" -> null
        id                        = "pm6/qemu/100"
      - ipconfig0                 = "ip=dhcp" -> null
        name                      = "vm4"
      - qemu_os                   = "other" -> null
        # (37 unchanged attributes hidden)

        # (2 unchanged blocks hidden)
    }

@chrisbenincasa Adding ignore_changes helps in this case:

lifecycle {
  ignore_changes = [
    ipconfig0, qemu_os, ciuser
  ]
}
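
This likely works here because the diff is an in-place update (~), not a replacement; ignore_changes can suppress that kind of drift, but it cannot stop Terraform from replacing a resource it has already marked as tainted, which may be why it had no effect in the original report.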

github-actions[bot] commented 1 year ago

This issue is stale because it has been open for 60 days with no activity. Please update the provider to the latest version and, if the issue persists, provide full configuration and debug logs

lenaxia commented 1 year ago

I've completely migrated away from Proxmox because this has rendered it entirely unusable for me.

t0xa commented 1 year ago

Hi @lenaxia

Out of curiosity, what does your setup look like now? ESXi? vCenter?

Thanks

FlexibleToast commented 1 year ago

I had this same issue. I used the plan output to add a bunch of attributes to my resource, and eventually the only thing that wanted to change was qemu_os. Even if I set it to the value it was trying to change it to, it would still report a change, and the apply would fail because of permissions. Adding lifecycle ignore_changes for qemu_os fixed that. Why does qemu_os always try to change, and what permission could it need that Administrator doesn't cover?
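
In config form, the workaround described above would be a lifecycle block like this (a minimal sketch):

lifecycle {
  ignore_changes = [
    qemu_os
  ]
}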

github-actions[bot] commented 1 year ago

This issue is stale because it has been open for 60 days with no activity. Please update the provider to the latest version and, if the issue persists, provide full configuration and debug logs

sebdanielsson commented 1 year ago

Still relevant

github-actions[bot] commented 11 months ago

This issue is stale because it has been open for 60 days with no activity. Please update the provider to the latest version and, if the issue persists, provide full configuration and debug logs

sebdanielsson commented 11 months ago

Still relevant.

lenaxia commented 10 months ago

> Hi @lenaxia
>
> Out of curiosity, what does your setup look like now? ESXi? vCenter?
>
> Thanks

Sorry, I thought I had responded. I'm running bare metal now: I was using Proxmox for my k3s cluster and moved it to bare metal. I have a single Proxmox node that I run for non-essential VMs that I create manually. I no longer use Terraform for any Proxmox operations.

github-actions[bot] commented 8 months ago

This issue is stale because it has been open for 60 days with no activity. Please update the provider to the latest version and, if the issue persists, provide full configuration and debug logs

github-actions[bot] commented 6 months ago

This issue is stale because it has been open for 60 days with no activity. Please update the provider to the latest version and, if the issue persists, provide full configuration and debug logs

github-actions[bot] commented 5 months ago

This issue was closed because it has been inactive for 5 days since being marked as stale.

yelinaung commented 5 months ago

Still relevant