bpg / terraform-provider-proxmox

Terraform / OpenTofu Provider for Proxmox VE
https://registry.terraform.io/providers/bpg/proxmox
Mozilla Public License 2.0

Migrating a container results in it being recreated #1665

Open duntonr opened 10 hours ago

duntonr commented 10 hours ago

**Describe the bug**
The provider attempts to recreate an existing container after the container has been migrated to another node.

**To Reproduce**
Requires a cluster with 2+ nodes.

  1. Instantiate the provider using node A's IP, e.g.

         provider "proxmox" {
           username = var.proxmox_username
           password = var.proxmox_password
           endpoint = "https://${local.proxmox_hosts.px01.ip}:8006/"
           insecure = true
         }

  2. Create a `proxmox_virtual_environment_container` resource on node A and ignore changes to `node_name` in the `lifecycle` block:

    resource "proxmox_virtual_environment_container" "atlantis-container" {
      cpu {
        cores = 2
      }

      node_name   = "px01"
      description = local.cts.atlantis.description
      vm_id       = local.cts.atlantis.vm_id

      disk {
        datastore_id = local.infra.ceph_pool_01_name
        size         = 25
      }

      operating_system {
        template_file_id = "px01-dir:vztmpl/debian-12-standard_12.7-1_amd64.tar.zst"
        type             = "debian"
      }

      memory {
        dedicated = 1024
        swap      = 512
      }

      network_interface {
        name    = "eth0"
        bridge  = local.cts.atlantis.bridge
        enabled = "true"
      }

      pool_id = proxmox_virtual_environment_pool.lxc-util-pool.pool_id

      features {
        nesting = true
      }

      initialization {
        dns {
          domain  = local.infra.primary_search_domain
          servers = [local.infra.dns_piehole_ip, local.cts.atlantis.gateway, local.infra.dns_fallback_ip]
        }

        ip_config {
          ipv4 {
            address = "${local.cts.atlantis.ip}/24"
            gateway = local.cts.atlantis.gateway
          }
        }

        hostname = local.cts.atlantis.hostname

        user_account {
          keys     = [var.ssh_public_key]
          password = var.root_password
        }
      }

      lifecycle {
        ignore_changes = [
          node_name,
          operating_system,
          disk["datastore_id"]
        ]
      }
    }


3. Run `tofu apply` to bring the container up.
4. Manually migrate the container from node A to node B.
5. Run `tofu plan` again and note that it wants to recreate the resource.
![image](https://github.com/user-attachments/assets/e31412c6-2395-4ee7-a45a-7906add5398d)

**Expected behavior**
No new container should be created: the container already exists, it just lives on a different node.

**Additional context**
The bug seems to be that the provider reads the old `node_name` from the tfstate and checks only that node for the container. If the container is not found there, the provider tries to recreate it on that node. With `ignore_changes` set on the `node_name` field, the value in the Terraform configuration itself does not matter: the provider takes the node to check from the state and queries only that node. I confirmed this by manually editing the state to contain the new node; a subsequent plan did NOT trigger a create.
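For anyone who needs the same workaround, here is a rough sketch of the state edit described above. It assumes the state has been pulled to JSON and follows the usual layout of `resources[].instances[].attributes`; the function name `patch_node_name`, the target node `px02`, and the `vm_id` value in the usage are made up for illustration. Incrementing `serial` is also an assumption about what the backend expects when the edited state is pushed back, so back up the state first.

```python
import copy


def patch_node_name(state, resource_type, resource_name, new_node):
    """Return (patched_copy, changed_count) with node_name rewritten.

    Hypothetical helper: `state` is the parsed JSON of a pulled state file.
    The input is not modified; a deep copy is edited and returned.
    """
    patched = copy.deepcopy(state)
    changed = 0
    for res in patched.get("resources", []):
        # Only touch managed resources with the requested address.
        if res.get("mode") != "managed":
            continue
        if res.get("type") != resource_type or res.get("name") != resource_name:
            continue
        for inst in res.get("instances", []):
            attrs = inst.get("attributes")
            if not isinstance(attrs, dict):
                continue
            if attrs.get("node_name") != new_node:
                attrs["node_name"] = new_node
                changed += 1
    if changed:
        # Assumption: bump the serial so the edited state counts as newer.
        patched["serial"] = patched.get("serial", 0) + 1
    return patched, changed


# Usage with a minimal, made-up state document.
state = {
    "serial": 7,
    "resources": [
        {
            "mode": "managed",
            "type": "proxmox_virtual_environment_container",
            "name": "atlantis-container",
            "instances": [{"attributes": {"node_name": "px01", "vm_id": 105}}],
        }
    ],
}
patched, changed = patch_node_name(
    state, "proxmox_virtual_environment_container", "atlantis-container", "px02"
)
```

In practice the input would come from `tofu state pull > state.json` and the result would go back with `tofu state push`; I only verified the effect of the edit itself, not this exact script against a real backend.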

With normal migrations, the container could be on any node in the cluster, so it may be more robust to check whether the container's `vm_id` already exists on any node in the cluster when deciding whether a new create is needed.
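To make the suggestion concrete, here is a minimal sketch of a cluster-wide lookup. It is not how the provider is implemented (the provider is written in Go); it only illustrates the idea. It assumes a list shaped like the response of the Proxmox `GET /cluster/resources` endpoint, where each entry carries `type`, `vmid`, and `node`. The function names and the sample data are hypothetical.

```python
def find_container_node(cluster_resources, vm_id):
    """Return the node currently hosting the LXC container, or None.

    `cluster_resources` is assumed to be the parsed `data` list from
    GET /cluster/resources (entries with "type", "vmid", "node").
    """
    for entry in cluster_resources:
        if entry.get("type") == "lxc" and entry.get("vmid") == vm_id:
            return entry.get("node")
    return None


def needs_create(cluster_resources, vm_id):
    """A create is only needed if no node in the cluster has the container."""
    return find_container_node(cluster_resources, vm_id) is None


# Made-up sample: container 105 was migrated from px01 to px02.
sample = [
    {"id": "node/px01", "type": "node", "node": "px01"},
    {"id": "node/px02", "type": "node", "node": "px02"},
    {"id": "qemu/100", "type": "qemu", "vmid": 100, "node": "px01"},
    {"id": "lxc/105", "type": "lxc", "vmid": 105, "node": "px02"},
]

current_node = find_container_node(sample, 105)
```

With a lookup like this, the read step could refresh `node_name` from the cluster instead of treating a miss on the node stored in state as "container is gone".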

- Clustered Proxmox
- Proxmox version: 8.2.7
- Provider version (ideally it should be the latest version): 0.66.3
- Terraform/OpenTofu version: tofu 1.8.6
- OS (where you run Terraform/OpenTofu from): ubuntu
- Debug logs (`TF_LOG=DEBUG terraform apply`):
duntonr commented 10 hours ago

This looks similar to #658, but with the addition of `ignore_changes`.