Telmate / terraform-provider-proxmox

Terraform provider plugin for proxmox
MIT License

Disk is created twice after a successful apply. #832

Closed: rinmeister closed this issue 10 months ago

rinmeister commented 1 year ago

I create a VM from a template with cloud-init, using Telmate 2.9.14 as the Terraform provider. In the disk section I give a type, storage and size. After the apply, the VM is created with the correct disk. If I then run terraform plan again, it says that a disk will be created with the specs given. If I apply that, an extra disk with the same specs is created. This happens with every single VM I try to deploy. With 2.9.11 I don't have this problem.
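
A minimal sketch of the kind of configuration described here, assuming the Telmate 2.9.x schema; the template, node and storage names are placeholders, not values from the report:

resource "proxmox_vm_qemu" "example" {
  name        = "example-vm"
  target_node = "pve"                        # placeholder node name
  clone       = "ubuntu-cloudinit-template"  # placeholder template name
  os_type     = "cloud-init"

  # Disk block giving only type, storage and size, as described above
  disk {
    type    = "virtio"
    storage = "local"
    size    = "20G"
  }
}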

kingfisher77 commented 1 year ago

If we run terraform apply again after the first apply, Terraform says the disk will be removed. The next apply says that both sides match. Yet another run again reports that the disk will be removed. This is unpredictable: sometimes one way, sometimes the other. When we confirm the plan, terraform apply runs, but the disk is still there.

This is really strange. We have terraform projects in Azure Cloud and in Hetzner Cloud. We have never experienced anything like this.

$ terraform providers
Providers required by configuration:
.
└── provider[registry.terraform.io/telmate/proxmox] 2.9.14

Providers required by state:

    provider[registry.terraform.io/telmate/proxmox]

$ terraform version
Terraform v1.5.6
on darwin_arm64

This looks broken, doesn't it?

kingfisher77 commented 1 year ago

We clone from a template; the disk is the OS disk.

kingfisher77 commented 1 year ago

I read my post from yesterday again. Not the clearest one ;-) To sum it up:

Terraform or the Terraform Proxmox provider randomly makes mistakes when comparing state, even with no changes in the code.

On one apply, the output is:

Acquiring state lock. This may take a few moments...
proxmox_vm_qemu.vm: Refreshing state... [id=pan/qemu/100]

No changes. Your infrastructure matches the configuration.

Terraform has compared your real infrastructure against your configuration and found no differences, so no changes are needed.
Releasing state lock. This may take a few moments...

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

On the very next apply it is:

proxmox_vm_qemu.vm: Refreshing state... [id=pan/qemu/100]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # proxmox_vm_qemu.vm will be updated in-place
  ~ resource "proxmox_vm_qemu" "vm" {
        id                        = "node/qemu/100"
        name                      = "vm-01"
        tags                      = "docker;ubuntu"
        # (40 unchanged attributes hidden)

      - disk {
          - backup             = true -> null
          - cache              = "none" -> null
          - file               = "100/vm-100-disk-0.qcow2" -> null
          - format             = "qcow2" -> null
          - iops               = 0 -> null
          - iops_max           = 0 -> null
          - iops_max_length    = 0 -> null
          - iops_rd            = 0 -> null
          - iops_rd_max        = 0 -> null
          - iops_rd_max_length = 0 -> null
          - iops_wr            = 0 -> null
          - iops_wr_max        = 0 -> null
          - iops_wr_max_length = 0 -> null
          - iothread           = 0 -> null
          - mbps               = 0 -> null
          - mbps_rd            = 0 -> null
          - mbps_rd_max        = 0 -> null
          - mbps_wr            = 0 -> null
          - mbps_wr_max        = 0 -> null
          - replicate          = 0 -> null
          - size               = "20G" -> null
          - slot               = 0 -> null
          - ssd                = 0 -> null
          - storage            = "local" -> null
          - storage_type       = "dir" -> null
          - type               = "virtio" -> null
          - volume             = "local:100/vm-100-disk-0.qcow2" -> null
        }

        # (4 unchanged blocks hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

Do you want to perform these actions?

pchang388 commented 1 year ago

I am also seeing a similar issue. I have a VM with 2 extra disks, and when I do a plan after creating it, I get this:

$ terraform plan
module.lxc_module.module.dev_server.proxmox_lxc.dev_server: Refreshing state... [id=proxmox-3080m/lxc/100]
module.qemu_module.module.unifi.proxmox_vm_qemu.unifi_controller: Refreshing state... [id=proxmox-3080m/qemu/101]

No changes. Your infrastructure matches the configuration.

After doing a plan later with no changes, I see a new disk change, and if I apply it, a new disk is added and replaces the last one defined in the resource block. The previously used disk then shows up as an unused disk, leaving me unable to use that mountpoint/disk.

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # module.qemu_module.module.unifi.proxmox_vm_qemu.unifi_controller will be updated in-place
  ~ resource "proxmox_vm_qemu" "unifi_controller" {
        id                        = "proxmox-3080m/qemu/101"
        name                      = "unifi-controller"
        # (35 unchanged attributes hidden)

      + disk {
          + backup             = true
          + cache              = "none"
          + discard            = "on"
          + iops               = 0
          + iops_max           = 0
          + iops_max_length    = 0
          + iops_rd            = 0
          + iops_rd_max        = 0
          + iops_rd_max_length = 0
          + iops_wr            = 0
          + iops_wr_max        = 0
          + iops_wr_max_length = 0
          + iothread           = 1
          + mbps               = 0
          + mbps_rd            = 0
          + mbps_rd_max        = 0
          + mbps_wr            = 0
          + mbps_wr_max        = 0
          + replicate          = 0
          + size               = "15G"
          + ssd                = 1
          + storage            = "local-ssd"
          + type               = "scsi"
        }

        # (3 unchanged blocks hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

pchang388 commented 1 year ago

I can usually work around the issue by removing the local .terraform dir and the terraform.lock.hcl file, then running another init; after that it stops trying to add the disk again. But the issue returns later on.

sgabenov commented 1 year ago

It is a workaround that just prevents Terraform from triggering changes each time you run it. The problem is that the tf proxmox provider does not identify disks correctly by "id", so the sequence of disks may change each time you run Terraform. If you do not want to make changes on the disks and just want to ignore the errors, use lifecycle:

btw, there is a PR to fix this, but still opened: https://github.com/Telmate/terraform-provider-proxmox/pull/794

resource "proxmox_vm_qemu" "_110_jenkins_test_slave01" {
  name        = "jenkins-test-slave01"

 ...

 disk {
    type     = "scsi"
    storage  = "local-lvm"
    cache    = "writethrough"
    iothread = 1
    discard  = "on"
    size     = "20G"
  }

  disk {
    type     = "scsi"
    storage  = "local-lvm"
    cache    = "writethrough"
    iothread = 1
    discard  = "on"
    size     = "30G"
  }

  lifecycle {
    ignore_changes = [disk]
  }
}

kingfisher77 commented 1 year ago

Yes, this helps. Cool. In my case, it is the OS disk which comes from the template and will not be changed in terraform.
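
A minimal sketch of that pattern, with placeholder template and node names: the cloned OS disk is declared once and then excluded from comparison via lifecycle:

resource "proxmox_vm_qemu" "vm" {
  name        = "vm-01"
  target_node = "node"             # placeholder node name
  clone       = "ubuntu-template"  # placeholder template name

  # OS disk as it comes from the cloned template
  disk {
    type    = "virtio"
    storage = "local"
    size    = "20G"
  }

  # Ignore provider-reported drift on the disk block
  lifecycle {
    ignore_changes = [disk]
  }
}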

sgabenov commented 1 year ago

Another thing that can be tested is to use a dynamic block, so we can change the sequence to a set that will not be reordered:

resource "proxmox_vm_qemu" "_105_minio03" {   # Main config
  name        = "minio03"
  desc        = "Emb. Node cluster MinIO"
  vmid        = "105"
  target_node = "bf13"

  ...

  dynamic "disk" {
    for_each = toset(sort(["20G", "10G", "100G"]))
    content {
      type     = "scsi"
      storage  = "local-lvm"
      cache    = "writethrough"
      iothread = 1
      discard  = "on"
      size     = disk.value
    }
  }
} 

retoo commented 1 year ago

I think we can confirm this has gotten much worse after upgrading to 2.9.14. I tried understanding the logic, but the main branch has quite some changes compared to 2.9.14.

github-actions[bot] commented 10 months ago

This issue is stale because it has been open for 60 days with no activity. Please update the provider to the latest version and, if the issue persists, provide full configuration and debug logs.

github-actions[bot] commented 10 months ago

This issue was closed because it has been inactive for 5 days since being marked as stale.

retoo commented 10 months ago

This issue breaks the proxmox integration completely. It's unusable and a very frustrating experience.

esomore commented 10 months ago

same here

retoo commented 9 months ago

See also discussions in https://github.com/Telmate/terraform-provider-proxmox/issues/832