Telmate / terraform-provider-proxmox

Terraform provider plugin for proxmox

Cloud-init drive deleted when cloning a VM with a cloud-init drive #901

Open cwilliams001 opened 9 months ago

cwilliams001 commented 9 months ago

Environment:

Proxmox Version: 8.1.3
Terraform Provider: thegameprofi/proxmox
Provider Version: 2.10.0

Description:

I am using the @TheGameProfi fork of the Proxmox provider due to compatibility issues with Proxmox version 8.1.3. While this fork solves the plugin crash, I have encountered a specific issue related to cloning VM templates that have a cloud-init drive attached.

Issue:

When creating a new VM by cloning a template with an attached cloud-init drive, the cloud-init drive gets deleted after the VM is created, during the subsequent disk move and resize operations.

Workaround:

Based on the discussion in issue #704, I applied a workaround that involves ensuring the disk type in the Terraform resource matches the disk type in the VM template. Specifically, if the VM template uses a virtio disk, the newly created VM should also use a virtio disk type to avoid boot loop issues. Additionally, I added cloudinit_cdrom_storage = "local-lvm" to the configuration.

Current Working Configuration:

resource "proxmox_vm_qemu" "hashi-server" {
  count       = var.server_count
  name        = "hashi-server-0${count.index + 1}"
  desc        = "hashi-server${count.index + 1}"
  vmid        = var.server_vmid_begin + count.index + 1
  target_node = var.proxmox_host
  onboot      = true
  clone    = var.template_name
  cloudinit_cdrom_storage = "local-lvm"
  full_clone  = true
  agent    = 1
  os_type  = "cloud-init"
  cores    = var.server_cores
  sockets  = 1
  cpu      = "host"
  memory   = var.server_memory
  scsihw   = "virtio-scsi-pci"
  bootdisk = "scsi0"
  disks {
    virtio {
      virtio0 {
        disk {
          size = var.server_disk_size
          storage = var.proxmox_storage
        }
      }
    }
  }
  network {
    bridge = var.proxmox_network_device
    model  = "virtio"
  }

  ipconfig0 = "ip=10.70.7.10${count.index + 1}/16,gw=10.70.0.1"
  # sshkeys   = <<EOF
  #   ${var.ssh_key}
  #   EOF
}

Steps to Reproduce:

  1. Clone a VM template with an attached cloud-init drive using the specified Terraform provider and version.

  2. Observe the deletion of the cloud-init drive after the VM is created and resized.

Expected Behavior:

The cloud-init drive should remain attached to the VM after cloning and resizing processes are completed.

Actual Behavior:

The cloud-init drive is removed after the VM creation process, leading to further issues such as boot loops unless the workaround is applied.

TheGameProfi commented 9 months ago

I can replicate the issue and will try to figure out why it's happening.

TheGameProfi commented 9 months ago

I was able to get a closer look at what's happening.

There is some kind of update of the VM after the resize:

update VM 104: -agent 1 -bios seabios -cores 1 -cpu host -description terraform test -hotplug network,disk,usb -ide2 none,media=cdrom -kvm 1 -memory 2048 -name

Could be related to the changes in this commit: 90e6dba2013de3aeb4db8395daa3fad371b9d621

Tinyblargon commented 9 months ago

@TheGameProfi could you provide the following:

mleone87 commented 9 months ago

@TheGameProfi @Tinyblargon Confirmed, the disk gets removed after the resize; probably ide2 is not parsed correctly.

Screenshot 2024-01-19 at 12 19 09

mleone87 commented 9 months ago

@Tinyblargon

https://github.com/Telmate/proxmox-api-go/blob/cd419d1e45db3a4aa25ca0a20bbe26d9b0428aad/proxmox/config_qemu_disk_ide.go#L106

Not sure what's happening there, since the function is quite complicated, but it's basically ignoring the cloud-init disk returned in the API response after the clone. I also found a place in the code where cloud-init is referred to as ide3, but this is not correct; it should be ide2.

My other guess is also that this function

https://github.com/Telmate/proxmox-api-go/blob/cd419d1e45db3a4aa25ca0a20bbe26d9b0428aad/proxmox/config_qemu_disk.go#L804

does not recognize when a disk is completely new to it, as in our scenario.

Side note: the cloud-init drive in a clone should be merged into the new config as-is, not managed as a modified/updated disk.

I managed to get it to work, but there is also another issue in parsing ipconfig that makes the provider time out, which I'll explore later.

Tinyblargon commented 9 months ago

@mleone87 gonna run some tests, as I'm unsure if the bug is in Terraform > proxmox-go-api or in proxmox-go-api > proxmox.

mleone87 commented 9 months ago

@Tinyblargon I "fixed" it by modifying the API; as far as I can tell, the Proxmox code is behaving correctly in that part.

Tinyblargon commented 9 months ago

@mleone87 do you mean the proxmox-go-api?

mleone87 commented 9 months ago

@Tinyblargon Yes. Sorry, I'll submit a PR later to show it.

Tinyblargon commented 9 months ago

@mleone87 How it was intended to work is:

  1. Clone the VM
  2. Get the VM config
  3. Delete and add disks
  4. Create the cloud-init config
  5. Start the VM

I think I've figured out what is happening. When the removal of the cloud-init disk becomes a pending change, you can't add another disk as technically the old cloud-init disk is still there. To fix this, we would first have to delete it, then add disks.

So step 3 should become delete then add.

When my template doesn't have a cloud-init disk or it's there as ide3, it is created/remains as ide3.

cosminmocan commented 9 months ago

Just stumbled upon this issue, and boy is it annoying. First I had an issue where my disk kept getting unmounted when trying to mount it using virtio instead of scsi0, and now this is the next challenge :)). What I have noticed is that the examples directory also has a file for the cloud-init example that is 3 years old, with the old disk definition. As for this issue, I am currently using the plugin compiled from the master branch; is this bug already solved on another tag/branch, or should we wait a bit?

Huge thank you for everyone involved in the creation and maintenance of this awesome provider!

Tinyblargon commented 9 months ago

@mleone87 Okay, so setting ciuser fixed it for me; this is definitely an issue with the mapping between Terraform and proxmox-go-api. https://github.com/Telmate/terraform-provider-proxmox/blob/47df793ca73c899fd599975e023de081ca86fac4/proxmox/resource_vm_qemu.go#L2336-L2344

On another note, https://github.com/Telmate/proxmox-api-go/issues/299 fixes an issue you run into when you set ciuser. We try to add a new cloud-init disk while the old one technically still exists; since there may only be one cloud-init disk, updating fails.

mleone87 commented 9 months ago

@Tinyblargon In my template I have the ciuser, but I usually do not set the cloudinit_cdrom_storage that makes that block effective. So should it be set as a default and only explicitly modified if needed?

Also, on line 2339 you set the cdrom ID to ide3, but per the Proxmox API docs it should be ide2, at least for the first cdrom.

Tinyblargon commented 9 months ago

@mleone87 Well, since cloudinit_cdrom_storage depends on the names of the available storages, it's best to leave this empty by default. The easiest solution would be to add to the docs that cloudinit_cdrom_storage should be specified along with a cloud-init setting in order to add the cloud-init disk.
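
For illustration, that pairing would look roughly like this inside a resource block (the storage name and ciuser value are placeholders, not a confirmed recommendation):

  # names the storage backing the cloud-init CD-ROM
  cloudinit_cdrom_storage = "local-lvm"
  # at least one cloud-init setting, e.g. ciuser, so the disk actually gets added
  ciuser                  = "debian"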

mleone87 commented 9 months ago

@Tinyblargon There is still something that does not compute well around the cloud-init drive, at least for me.

Tinyblargon commented 9 months ago

@mleone87 we can just deprecate the old way of configuring the cloud-init and cd-rom disk and merge them into the new disks config.

The reason I kept the old way was mostly for backwards compatibility and ease of use.

I reserved ide2 and ide3 because, before, the disks would be assigned somewhat randomly, since it was an array that got loosely converted to the Proxmox disk structure.

I'm okay with either way of configuring it.

electropolis commented 9 months ago

I see I'm struggling with the same problem, but I mentioned it in a different issue: https://github.com/Telmate/terraform-provider-proxmox/issues/922

I have a template with ide2 as cloud-init, but during the setup with Terraform it disappeared. Everything was fine when the 2.9.3 provider was used with Proxmox v7.1. Suddenly everything crashed when Proxmox v8.1.3 came along with the latest provider. So what is the solution here, guys? I'm confused, because all the tutorials for creating a template don't seem to work with the Terraform provider.

When creating the template I set the cloud-init drive to ide3, and when the creation process starts you can see the cloud-init disk below:

Screenshot 2024-02-06 at 00 14 12 3

Next, the disk is attached alongside the cloud-init drive. (From the template it has that default base-vmid-0 disk.)

Screenshot 2024-02-06 at 00 14 32

And then, after the cloud-init config in Terraform is ready and prepared, it is not attached. I saw the cloud-init drive crossed out for removal, and then it disappeared in the next iteration of requests; the result looks like this:

Screenshot 2024-02-06 at 00 15 05

ide2 with an empty (none) CD-ROM.

devZer0 commented 9 months ago

this worked for me https://github.com/Telmate/terraform-provider-proxmox/issues/704#issuecomment-1936723112

electropolis commented 9 months ago

this worked for me #704 (comment)

I will check that, but there is a scenario about not attaching cloud-init to the template at all, and I think I already tried that and it didn't work.

devZer0 commented 9 months ago

I have not attached cloud-init to the template at all. It just works after adding this param.

electropolis commented 9 months ago

I have not attached cloud-init to the template at all. It just works after adding this param.

You mean when cloning the template and setting the cloud-init drive?

devZer0 commented 9 months ago

I don't need to set a cloud-init drive, I only have

os_type  = "cloud-init"
cloudinit_cdrom_storage = "local-zfs"

and a cloud-init IDE drive magically appears without doing anything further.

electropolis commented 9 months ago

I don't need to set a cloud-init drive, I only have

os_type  = "cloud-init"
cloudinit_cdrom_storage = "local-zfs"

and a cloud-init IDE drive magically appears without doing anything further.

Yeah, but do you also set up cloud-init in the template? I'm talking about the template preparation procedure described at https://pve.proxmox.com/wiki/Cloud-Init_Support#_preparing_cloud_init_templates, where there is a chapter "Add Cloud-Init CD-ROM drive":

The next step is to configure a CD-ROM drive, which will be used to pass the Cloud-Init data to the VM.

qm set 9000 --ide2 local-lvm:cloudinit

devZer0 commented 9 months ago

I have no cloud-init drive in the template, nor do I add a cdrom/cloud-init drive in the Terraform configuration. It simply/magically works without one, as the cloud-init drive seems to be added by Terraform.

electropolis commented 9 months ago

[...] nor do I add a cdrom/cloud-init drive in the Terraform configuration.

How so? You do add it here:

I don't need to set a cloud-init drive, I only have

os_type  = "cloud-init"
cloudinit_cdrom_storage = "local-zfs"

That's cloud-init added in Terraform.

electropolis commented 9 months ago

@devZer0

I created a template without a cloud-init image.

image

I commented out the code in Ansible responsible for creating the image:

    # - name: Settings for Cloudinit
    #   tags: cloudinit
    #   block:

    #     - name: Set fact for cloudinit disk to check if exists
    #       ansible.builtin.set_fact:
    #         cloudinit_image: "{{ cloudinit_image | default([]) + [item] }}"
    #       loop: "{{ qm_config.results }}"
    #       when: not item.stdout is search('vm-{{ item.item.template_id}}-cloudinit')

    #     - name: CloudInit results
    #       ansible.builtin.debug:
    #         var: cloudinit_image
    #         verbosity: 2

    #     - name: Add cloud-init image as CDROM
    #       ansible.builtin.command: "qm set {{ item.item.template_id }} --ide3 local-lvm:cloudinit"
    #       loop: "{{ cloudinit_image }}"
    #       when: cloudinit_image is defined

When running Terraform it gets stuck on "Still creating...", and in Proxmox I observe this:

image

No cloud-init drive, only a CD-ROM on ide2.

vasekhodina commented 9 months ago

In my experiments when I add this to my terraform code:

  ciuser   = "ciuser_name"
  cipassword = "<some_password>"
  cloudinit_cdrom_storage = "local-lvm"

AND

I use a VM template without a cloudinit drive for cloning.

Then I get a VM that has an empty CD drive on IDE2 and a cloud-init drive on IDE3. Of course this setup fails to boot, as cloud-init drives are expected to be on IDE2. When I manually delete both IDE drives and create a cloud-init drive on IDE2, it works.

No idea if there is a way to force this provider to create the cloud-init drive on IDE2.

electropolis commented 9 months ago

In my experiments when I add this to my terraform code:

  ciuser   = "ciuser_name"
  cipassword = "<some_password>"
  cloudinit_cdrom_storage = "local-lvm"

AND

I use a VM template without a cloudinit drive for cloning.

Then I get a VM that has an empty CD drive on IDE2 and a cloud-init drive on IDE3. Of course this setup fails to boot, as cloud-init drives are expected to be on IDE2. When I manually delete both IDE drives and create a cloud-init drive on IDE2, it works.

No idea if there is a way to force this provider to create the cloud-init drive on IDE2.

So definitely an issue. I'm using cicustom because I prefer to configure cloud-init on the VM with specific settings, instead of using the limited ci<parameter> options, which don't offer much else to set up. And that setup also doesn't work.
I think there is a huge mess of information, and now there are at least 3-4 issues describing this error. There are even some people who say that it works; I don't know how, although they also have IDE2 with an empty CD-ROM.

vasekhodina commented 9 months ago

So definitely an issue. I'm using cicustom because I prefer to configure cloud-init on the VM with specific settings, instead of using the limited ci<parameter> options, which don't offer much else to set up. And that setup also doesn't work. I think there is a huge mess of information, and now there are at least 3-4 issues describing this error. There are even some people who say that it works; I don't know how, although they also have IDE2 with an empty CD-ROM.

I agree it's a mess now. I think those people who say it works didn't test whether the machine boots; just seeing that the cloud-init drive is there was enough for them.

cwilliams001 commented 9 months ago

@vasekhodina The machines do in fact boot. If you follow the steps for the workaround correctly, you should be able to have a working template:

Create a template with cloud init.

After the template is made delete the cloud init drive from the template in the hardware settings of the web UI.

Regenerate the image for your template.

Clone your template

Start VM

If this doesn't work, there is a mismatch somewhere in your Terraform code, probably between the expected disk type and the one that is actually booting.

If you jump into the Discord, there are others who have working configs and are more than happy to help. I agree with you that this is an issue, but I disagree with your implication that people are just saying it works so they can write about it here.

electropolis commented 9 months ago

@vasekhodina The machines do in fact boot. If you follow the steps for the workaround correctly, you should be able to have a working template:

Create a template with cloud init.

After the template is made delete the cloud init drive from the template in the hardware settings of the web UI.

Regenerate the image for your template.

Clone your template

Start VM

If this doesn't work, there is a mismatch somewhere in your Terraform code, probably between the expected disk type and the one that is actually booting.

If you jump into the Discord, there are others who have working configs and are more than happy to help. I agree with you that this is an issue, but I disagree with your implication that people are just saying it works so they can write about it here.

What's the difference between having a template with cloud-init from which you remove the cloud-init drive, and having a template generated without cloud-init from the start?

And where is the Discord group? Just remember that telling those for whom the code isn't working that they have some issue/mismatch in their Terraform code sounds a bit silly, as this code was working with Proxmox v7 with no problems. So basically it's not the fault of the Terraform code, but something that changed in Proxmox v8 and needs to be adjusted for in the code.

After the template is made delete the cloud init drive from the template in the hardware settings of the web UI. Regenerate the image for your template.

And this I don't understand. What needs to be regenerated when you remove the cloud-init drive from a template that had cloud-init added during the process of preparing that image? There is no option to regenerate anything; you only regenerate modifications of the Cloud-Init options.

image

But as you can see it's blurred out, because removing the cloud-init drive from Hardware ends up like this, so there is nothing to regenerate. Maybe you can clarify what you mean, or describe it more clearly?

When I removed the cloud-init drive and ran Terraform, I only got an empty CD-ROM image.

And the Terraform run just keeps creating the server:

local_file.cloud_init_network-config_file[0]: Creating...
local_file.cloud_init_user_data_file[0]: Creating...
local_file.cloud_init_user_data_file[0]: Creation complete after 0s [id=3f117050086fe38745d5c895f3f44e599d3743ab]
local_file.cloud_init_network-config_file[0]: Creation complete after 0s [id=584397ad1c649775dfef64cff2484d781972e34c]
null_resource.cloud_init_network-config_files[0]: Creating...
null_resource.cloud_init_config_files[0]: Creating...
null_resource.cloud_init_network-config_files[0]: Provisioning with 'file'...
null_resource.cloud_init_config_files[0]: Provisioning with 'file'...
null_resource.cloud_init_config_files[0]: Creation complete after 0s [id=3296930188545092453]
null_resource.cloud_init_network-config_files[0]: Creation complete after 0s [id=7900105167386830163]
proxmox_vm_qemu.cloudinit["srv-app-1"]: Creating...
proxmox_vm_qemu.cloudinit["srv-app-1"]: Still creating... [10s elapsed]
proxmox_vm_qemu.cloudinit["srv-app-1"]: Still creating... [20s elapsed]
proxmox_vm_qemu.cloudinit["srv-app-1"]: Still creating... [30s elapsed]
proxmox_vm_qemu.cloudinit["srv-app-1"]: Still creating... [40s elapsed]
proxmox_vm_qemu.cloudinit["srv-app-1"]: Still creating... [50s elapsed]
proxmox_vm_qemu.cloudinit["srv-app-1"]: Still creating... [1m0s elapsed]
proxmox_vm_qemu.cloudinit["srv-app-1"]: Still creating... [1m10s elapsed]
proxmox_vm_qemu.cloudinit["srv-app-1"]: Still creating... [1m20s elapsed]
proxmox_vm_qemu.cloudinit["srv-app-1"]: Still creating... [1m30s elapsed]
proxmox_vm_qemu.cloudinit["srv-app-1"]: Still creating... [1m40s elapsed]
proxmox_vm_qemu.cloudinit["srv-app-1"]: Still creating... [1m50s elapsed]
proxmox_vm_qemu.cloudinit["srv-app-1"]: Still creating... [2m0s elapsed]
proxmox_vm_qemu.cloudinit["srv-app-1"]: Still creating... [2m10s elapsed]
proxmox_vm_qemu.cloudinit["srv-app-1"]: Still creating... [2m20s elapsed]
vasekhodina commented 9 months ago

@cwilliams001 Interesting. I didn't mean to say others just want to write here that it works and be done. Only that I think they might not have tested. I'm happy there should be a workaround.

Do you have an invite link to the Discord server?

cwilliams001 commented 9 months ago

@vasekhodina @electropolis https://discord.gg/TJc4dMvF6k

@electropolis I misspoke, I apologize. I regenerate the cloud-init image for nothing but good measure before I detach the cloud-init drive from the template. You are also correct that this is an issue with PVE version 8 and above, because of changes they have made to the API. The community, though, as you'll see in the Discord, has really revived this project in the last few weeks and is working to fix bugs like these. It is by no means a perfect solution and I am a sample size of one, but there are others who have been able to use the provider successfully.

My current working Terraform file:

resource "proxmox_vm_qemu" "kube-master" {
  count                   = var.server_count
  name                    = "kube-master-0${count.index + 1}"
  desc                    = "kube-master${count.index + 1}"
  vmid                    = var.server_vmid_begin + count.index + 1
  target_node             = var.proxmox_host
  onboot                  = true
  clone                   = var.template_name
  cloudinit_cdrom_storage = var.proxmox_storage
  full_clone              = true
  agent                   = 1
  os_type                 = "cloud-init"
  cores                   = var.server_cores
  sockets                 = 1
  cpu                     = "host"
  memory                  = var.server_memory
  scsihw                  = "virtio-scsi-pci"
  bootdisk                = "scsi0"
  disks {
    scsi {
      scsi0 {
        disk {
          size       = var.server_disk_size
          storage    = var.proxmox_storage
          emulatessd = true
        }
      }
    }
  }
  network {
    bridge = var.proxmox_network_device
    model  = "virtio"
  }

  ipconfig0 = "ip=10.70.7.10${count.index + 1}/16,gw=10.70.0.1"
  sshkeys   = <<EOF
    ${var.pub_ssh_key}
    EOF

  lifecycle {
    ignore_changes = [
      network,
      ciuser,
      qemu_os
    ]
  }

}
resource "proxmox_vm_qemu" "kube-worker" {
  count                   = var.client_count
  name                    = "kube-worker-0${count.index + 1}"
  desc                    = "kube-worker${count.index + 1}"
  vmid                    = var.client_vmid_begin + count.index + 1
  target_node             = var.proxmox_host
  onboot                  = true
  full_clone              = true
  clone                   = var.template_name
  cloudinit_cdrom_storage = var.proxmox_storage
  agent                   = 1
  os_type                 = "cloud-init"
  cores                   = var.client_cores
  sockets                 = 1
  cpu                     = "host"
  memory                  = var.client_memory
  scsihw                  = "virtio-scsi-pci"
  bootdisk                = "scsi0"
  disks {
    scsi {
      scsi0 {
        disk {
          size       = var.client_disk_size
          storage    = var.proxmox_storage
          emulatessd = true
        }
      }
    }
  }

  network {
    bridge = var.proxmox_network_device
    model  = "virtio"
  }

  ipconfig0 = "ip=10.70.8.10${count.index + 1}/16,gw=10.70.0.1"
  sshkeys   = <<EOF
    ${var.pub_ssh_key}
    EOF

  lifecycle {
    ignore_changes = [
      network,
      ciuser,
      qemu_os
    ]
  }
}

resource "local_file" "ansible_inventory" {
  content = templatefile("inventory.tmpl",
    {
      servers = tomap({
        for idx in range(var.server_count) :
        "kube-master-${format("%02d", idx + 1)}" => split("/", split("=", split(",", proxmox_vm_qemu.kube-master[idx].ipconfig0)[0])[1])[0]
      })
      clients = tomap({
        for idx in range(var.client_count) :
        "kube-worker-${format("%02d", idx + 1)}" => split("/", split("=", split(",", proxmox_vm_qemu.kube-worker[idx].ipconfig0)[0])[1])[0]
      })
    }
  )
  filename = "../inventory"
}

vm template

electropolis commented 9 months ago

@cwilliams001 From a logical perspective, creating a template without cloud-init, or creating a template with cloud-init and then removing it, doesn't make sense to me at all. But I did it, and the result is the same: no cloud-init drive, only an ide2 CD-ROM. All of you who wrote here are using ciuser, cipassword and so on. I'm not using those parameters, as they are not efficient: they require additional post-provisioning setup with Ansible to configure those machines. I prefer to control this from cloud-init, preparing different image scenarios that allow injecting network config, user creation, SSH keys and so on; and in case some of those settings differ between images or image versions, I can create an adjustment, change a few things, and keep an exception for a specific image. Ansible, on the other hand, is also a configuration tool, but for adding specific server functions that aren't common. For example: I want those servers prepared for the next configuration stage as Kubernetes nodes with k3s, so I have a k3s playbook for that. I want PowerDNS on a server, sure, I have a playbook for that. But to get those playbooks up and running I need a user, SSH keys, the network set up and so on, and those static pieces are provided by cloud-init along with many other settings that are always the same, and I don't want them to be set with Ansible as an additional step. Ansible here is just to fill those servers with the specific tasks that give them a purpose.

So basically you guys didn't test with cicustom using snippets, which requires SSH to the Proxmox host to put those snippets into /var/lib/vz/snippets. I can't debug what is going on.
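
For reference, a rough sketch of the cicustom-with-snippets setup being described (the storage IDs, snippet file name, and the snippet contents are assumptions; the snippet has to be uploaded to /var/lib/vz/snippets on the node beforehand):

resource "proxmox_vm_qemu" "snippet_example" {
  name        = "snippet-example"
  target_node = var.proxmox_host
  clone       = var.template_name
  os_type     = "cloud-init"
  # storage backing the cloud-init CD-ROM that carries the generated config
  cloudinit_cdrom_storage = "local-lvm"
  # point the cloud-init user data at a snippet already present on the node
  cicustom = "user=local:snippets/user-data.yml"
}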

Btw, count = var.client_count: using count is error-prone. Each resource should have a unique key in the state file. Relying on a list index isn't worth it, as it changes when instances are constantly created and deleted, because items in a list do not have static index numbers.
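
A short sketch of the for_each alternative this refers to (the variable and VM names are made up): each instance is keyed by a stable string instead of a shifting list index, so removing one VM does not renumber the others in state.

variable "workers" {
  type    = set(string)
  default = ["kube-worker-01", "kube-worker-02"]
}

resource "proxmox_vm_qemu" "worker" {
  # keyed by name, so state addresses stay stable when a VM is removed
  for_each    = var.workers
  name        = each.key
  target_node = var.proxmox_host
  clone       = var.template_name
  # ... remaining arguments as in the configs above
}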

vasekhodina commented 9 months ago

Ok, so after I redefined the boot order, the machine started. Thanks to @cwilliams001 for encouraging me to check my config again. I actually found the solution before getting to the Discord.

So here's a summary of what to do, in case you're from the future and you ran into the same issue as I did.

  1. Create a template with cloud init.

  2. After the template is made delete the cloud init drive from the template in the hardware settings of the web UI.

  3. Regenerate the image for your template.

  4. Clone your template, etc. (We want to do this via Terraform.)

Basically you want to have a template with the cloud-init image ready, but without the cloud-init drive. If the cloud-init drive were present it would just get deleted, not created, and you'd end up with an empty IDE2 drive.

Next, you want to make sure you have the following values in your Terraform code as arguments for the "proxmox_vm_qemu" resource.

# These 3 vars are there so that cloud-init drive gets created
  ciuser                  = "ciuser_name"
  cipassword              = "<some_password>"
  cloudinit_cdrom_storage = "local-lvm"
# The following makes sure that when the VM gets created it knows how to boot
  boot                    = "order=scsi0;ide3"

What happens is that when the drive is missing after cloning, it will get added by Terraform as an IDE3 CD drive alongside the empty IDE2 drive. That's why we need to define the boot order to contain the IDE3 drive; otherwise IDE2 might get used by default and the boot process might fail.

Of course, if your boot disk is connected in any way other than scsi, you need to change the first entry in the boot order. My boot disk is connected via scsi.

electropolis commented 9 months ago

Ok @vasekhodina, you point out some important and crucial things that none of the other commenters described.

Basically you want to have a template with the cloud-init image ready, but without the cloud-init drive. If the cloud-init drive were present it would just get deleted, not created, and you'd end up with an empty IDE2 drive.

So it means you can't have any cloud-init drive on ide2, but it doesn't mean you can't have one in the template at all, unless it's configured on IDE2. Having cloud-init on IDE2 should work. I have tested both cases, WITH and WITHOUT, and neither of them worked. But you mention a different case further on.

# These 3 vars are there so that cloud-init drive gets created
  ciuser                  = "ciuser_name"
  cipassword              = "<some_password>"
  cloudinit_cdrom_storage = "local-lvm"

Nah, I don't agree. Although cloudinit_cdrom_storage = "local-lvm" is crucial, and some people have it and some don't. I already asked about that here in some issues, and the author said that cloudinit_cdrom_storage should always be set. But those options are for static cloud-init settings, when you want them to be set from the GUI here:

image

This is described at https://pve.proxmox.com/wiki/Cloud-Init_Support#_deploying_cloud_init_templates, but my config doesn't rely on those settings, as they are very limited. Instead I rely on https://pve.proxmox.com/wiki/Cloud-Init_Support#_custom_cloud_init_configuration, and that is why I'm using cicustom: https://github.com/sonic-networks/terraform/blob/master/proxmox/sonic/main.tf#L32

It's more flexible and it worked, in PVE 7. But as I said, you mention an interesting thing: boot = "order=scsi0;ide3". This is only about booting, so that cloud-init gets booted, yes? But what about cloud-init that should use a snippet? :/

I tried that too and it doesn't work. Nobody is using snippets. And I also realise that the cloud-init drive isn't attached at all. Can't say why.

vasekhodina commented 9 months ago

@electropolis Looks like the way you use Proxmox is a lot different from mine. This is around my 7th day of using Proxmox and Terraform, so I'm not an expert and I'm glad I made this stuff work. So I can't help you, I'm sorry. Hope you have luck on the Discord channel.

electropolis commented 9 months ago

@electropolis Looks like the way you use Proxmox is a lot different from mine. This is around my 7th day of using Proxmox and Terraform, so I'm not an expert and I'm glad I made this stuff work. So I can't help you, I'm sorry. Hope you have luck on the Discord channel.

How can I use Proxmox in a different way? The goal is the same. But I want to have cicustom with real cloud-init, like it is done at the cloud providers. That's an option provided by Proxmox and it can be used. On Discord there is the same solution without cicustom, and some people say it's not working.

hestiahacker commented 8 months ago

I've confirmed that this still affects 3.0.1-rc1.

I've also confirmed that what @vasekhodina said about setting the boot order does work around this issue. I don't entirely understand why, but it causes the cloud-init drive to be attached to ide3 (which is available) instead of ide0 (which is occupied).

What @electropolis said about cloudinit_cdrom_storage being required is correct. This was mentioned in #935 and I submitted a merge request to get the documentation updated: https://github.com/Telmate/terraform-provider-proxmox/pull/939

To @electropolis's question about whether cloud-init works in this provider without cicustom, I can attest that it does. Below are the relevant settings that I use to set a static IP address and specify nameservers on my machines. If your Terraform looks similar and you're still having trouble, please post a minimal Terraform config that has the issue and I'll use that to try to reproduce it on my side.

  # Cloud Init Settings
  cloudinit_cdrom_storage = var.storage_backend
  # Reference: https://pve.proxmox.com/wiki/Cloud-Init_Support
  ipconfig0 = "ip=${var.ip_address}/${var.cidr},gw=${var.gateway}"
  nameserver = var.nameservers
  sshkeys = var.sshkeys
  boot = "order=virtio0;ide3"

While I'm currently testing with 3.0.1-rc1 of the provider, I had cloud init working with 2.9.11 as well (without using cicustom).

electropolis commented 8 months ago

@hestiahacker But it doesn't work with cicustom, which means there is still a bug.

hestiahacker commented 8 months ago

Yeah, you're right. I thought you were trying to use cicustom to work around this issue, but now that I've re-read the thread more closely I see that's not the case. Also, even if you were just trying to work around something the fact that cicustom doesn't work is still a problem, so I'm not sure what I was thinking. :facepalm:

I went through the other 5 open tickets that mention cicustom and it seemed to only be mentioned in passing or in examples. I didn't see a ticket just saying "cicustom doesn't work" which seems to be the case here. If you'd be willing to make a new ticket about this and tag me in a comment right away so I don't miss it, I'll put this in my queue of things to look into.

I tried using it something like a year ago and couldn't get it to work, but I thought it was just because I didn't know what I was doing. I'd probably also use cicustom if I better understood what it can do, and the more people using it, the more likely we are to find and fix these issues quickly.

electropolis commented 8 months ago

Yeah, you're right. I thought you were trying to use cicustom to work around this issue, but now that I've re-read the thread more closely I see that's not the case. Also, even if you were just trying to work around something the fact that cicustom doesn't work is still a problem, so I'm not sure what I was thinking. 🤦

I went through the other 5 open tickets that mention cicustom and it seemed to only be mentioned in passing or in examples. I didn't see a ticket just saying "cicustom doesn't work" which seems to be the case here. If you'd be willing to make a new ticket about this and tag me in a comment right away so I don't miss it, I'll put this in my queue of things to look into.

I tried using it something like a year ago and couldn't get it to work, but I thought it was just because I didn't know what I was doing. I'd probably also use cicustom if I better understood what it can do, and the more people using it, the more likely we are to find and fix these issues quickly.

I could create a new issue, but it would still mean copying everything again, along with the description. I can do that, but later today in the evening.

hestiahacker commented 8 months ago

You can just make the new issue short and sweet and then link to this ticket. All I need is a minimal terraform file that demonstrates the issue and the expected/actual behavior.

I just posted a minimal example that would make an ideal starting place for me. Just change the Cloud Init Settings and I should be able to test it quickly and easily. Just to set expectations, it'll probably be next Monday before I get to it. Mondays are my "work on the proxmox provider" days.

Tinyblargon commented 8 months ago

This should be resolved in the latest build; #959 has an example.

adamgass commented 7 months ago

@hestiahacker I have been having issues getting cicustom to work, similar to what others have mentioned, and a simple workaround I have found is to specify a ciuser or nameserver within the TF resource block. Having something like ciuser = "default" lets the cloud-init drive mount to the VM, and cloud-init is off and running. I think it just needs some kind of input parameter for the drive to function properly. It's almost as if, when a VM is cloned, there is some logic during the mount process that checks for input, and if nothing is there it discards the drive.
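
A minimal sketch of that workaround in a resource block (values are placeholders; that any single cloud-init parameter is sufficient is based on the observation above, not confirmed behavior):

  cloudinit_cdrom_storage = "local-lvm"
  cicustom                = "user=local:snippets/user-data.yml"
  # any cloud-init input parameter, e.g. ciuser, appears to be enough
  # for the provider to create and keep the cloud-init drive
  ciuser                  = "default"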

electropolis commented 7 months ago

@hestiahacker I have been having issues getting cicustom to work, similar to what others have mentioned, and a simple workaround I have found is to specify a ciuser or nameserver within the TF resource block. Having something like ciuser = "default" lets the cloud-init drive mount to the VM, and cloud-init is off and running. I think it just needs some kind of input parameter for the drive to function properly. It's almost as if, when a VM is cloned, there is some logic during the mount process that checks for input, and if nothing is there it discards the drive.

If you want to use cicustom, Tinyblargon provided the link above your comment. Just follow it. There is nothing to check, try, and so on. The solution above describes how to compile the new provider with the changed code that omits the cdrom, so it doesn't disappear.