clincha-org / clincha

Configuration and monitoring of clinch-home infrastructure
https://clinch-home.com

terraform-apply - disk already exists #90

Closed: clincha closed this issue 1 year ago

clincha commented 1 year ago

After running apply > destroy > apply, the following message appears in Proxmox.

create full clone of drive ide0 (Hot:vm-101-cloudinit)
trying to acquire cfs lock 'storage-Hot' ...
trying to acquire cfs lock 'storage-Hot' ...
rbd: create error: 2023-06-05T09:32:53.096+0100 7f4f07aac340 -1 librbd: rbd image vm-103-cloudinit already exists(17) File exists
TASK ERROR: clone failed: rbd create 'vm-103-cloudinit' error: rbd: create error: 2023-06-05T09:32:53.096+0100 7f4f07aac340 -1 librbd: rbd image vm-103-cloudinit already exists(17) File exists

Going to try destroying the disks during terraform destroy.

clincha commented 1 year ago

I think I got around the issue above by including the disk in the Terraform code. However, I've now hit this error, which must come from the new disks interfering:

│ Error: scsi0 - cloud-init drive is already attached at 'ide0'
│ 
│   with module.bri-master-1.proxmox_vm_qemu.rhel8,
│   on modules/rhel8/main.tf line 1, in resource "proxmox_vm_qemu" "rhel8":
│    1: resource "proxmox_vm_qemu" "rhel8" {
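For reference, the disk I added to the Terraform code is roughly this shape (a minimal sketch based on the Telmate provider's proxmox_vm_qemu disk block; the storage name comes from the logs above, the other arguments are illustrative):

resource "proxmox_vm_qemu" "rhel8" {
  # ...existing VM settings...

  disk {
    type    = "scsi"   # attaches as scsi0
    storage = "Hot"    # same RBD pool as the cloud-init volume in the error above
    size    = "32G"
  }
}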
clincha commented 1 year ago

Checking the Proxmox host logs to see if there's anything useful there.

less /var/log/syslog

Jun  6 19:08:52 bri-s-01 pvedaemon[53144]: <terraform@pam!terraform> update VM 103: -agent 1 -bios seabios -cores 4 -cpu host -description Kubernetes master node in Bristol (node 1) -hotplug network,disk,cpu,memory,usb -ipconfig0 ip=192.168.1.20/24,gw=192.168.1.1 -kvm 1 -memory 4096 -name bri-master-1 -net0 virtio=56:9A:25:90:13:B0,bridge=vmbr0 -numa 1 -onboot 1 -scsi0 Hot:vm-103-cloudinit,size=32G -scsihw virtio-scsi-pci -sockets 1 -tablet 1 -tags base,kubernetes_worker,kubernetes_master
Jun  6 19:08:52 bri-s-01 pvedaemon[53144]: <terraform@pam!terraform> starting task UPID:bri-s-01:0000D3A9:048D1A92:647F7634:qmconfig:103:terraform@pam!terraform:
Jun  6 19:08:52 bri-s-01 pvedaemon[54185]: VM 103 creating disks failed
Jun  6 19:08:52 bri-s-01 pvedaemon[54185]: scsi0 - cloud-init drive is already attached at 'ide0'
Jun  6 19:08:52 bri-s-01 pvedaemon[53144]: <terraform@pam!terraform> end task UPID:bri-s-01:0000D3A9:048D1A92:647F7634:qmconfig:103:terraform@pam!terraform: scsi0 - cloud-init drive is already attached at 'ide0'
clincha commented 1 year ago

A successful configuration looks like this in the logs:

Jun  6 18:43:09 bri-s-01 pvedaemon[3350754]: <terraform@pam!terraform> update VM 104: -agent 1 -bios seabios -cores 4 -cpu host -description Kubernetes worker node in Bristol (node 1) -hotplug network,disk,cpu,memory,usb -ipconfig0 ip=192.168.1.21/24,gw=192.168.1.1 -kvm 1 -memory 8192 -name bri-kubeworker-1 -net0 virtio=96:E2:4B:95:52:F2,bridge=vmbr0 -numa 1 -onboot 1 -scsi0 Hot:vm-104-disk-0,size=32G -scsihw virtio-scsi-pci -sockets 1 -tablet 1 -tags base,kubernetes_worker

However, an unsuccessful configuration looks like this:

Jun  6 19:18:47 bri-s-01 pvedaemon[43296]: <terraform@pam!terraform> update VM 104: -agent 1 -bios seabios -cores 4 -cpu host -description Kubernetes master node in Bristol (node 1) -hotplug network,disk,cpu,memory,usb -ipconfig0 ip=192.168.1.20/24,gw=192.168.1.1 -kvm 1 -memory 8192 -name bri-master-1 -net0 virtio=62:9C:DD:DF:43:92,bridge=vmbr0 -numa 1 -onboot 1 -scsi0 Hot:vm-104-cloudinit,size=32G -scsihw virtio-scsi-pci -sockets 1 -tablet 1 -tags base,kubernetes_worker,kubernetes_master

Often followed by these errors:

Jun  6 19:18:47 bri-s-01 pvedaemon[59246]: VM 104 creating disks failed
Jun  6 19:18:47 bri-s-01 pvedaemon[59246]: scsi0 - cloud-init drive is already attached at 'ide0'

I believe the important difference is that failed configurations try to set the scsi0 flag to the cloud-init volume (-scsi0 Hot:vm-104-cloudinit,size=32G), while successful configurations set it to a regular disk volume (-scsi0 Hot:vm-104-disk-0,size=32G). I think there is a clash between the cloud-init drive being mounted and the disk that I've been specifying. I've read through the Terraform provider documentation and the Proxmox documentation.

I'm going to try setting the file, media and volume parameters one at a time and see if any of them help.
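Roughly what that experiment looks like inside the disk block (a sketch; the commented values are illustrative and would be uncommented one at a time):

disk {
  type    = "scsi"
  storage = "Hot"
  size    = "32G"
  # candidates to trial one at a time (illustrative values):
  # file   = "vm-103-disk-0"
  # media  = "disk"
  # volume = "Hot:vm-103-disk-0"
}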

clincha commented 1 year ago

The terraform-destroy is working really well now, even when the apply has an error.

clincha commented 1 year ago

I deleted the cloud-init drive on the underlying template. Everything got past the configuration stage, but then there was no cloud-init configuration I could apply to the VM. The GUI looked like this:

[screenshot of the Proxmox GUI omitted]
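For context, the cloud-init settings live on the VM resource itself, roughly like this (a sketch; ipconfig0 matches the logs above, ciuser and sshkeys are illustrative). With no cloud-init drive on the template there is nowhere for these values to be written:

resource "proxmox_vm_qemu" "rhel8" {
  # ...other VM settings...
  ipconfig0 = "ip=192.168.1.20/24,gw=192.168.1.1"
  ciuser    = "terraform"          # illustrative
  sshkeys   = var.ssh_public_key   # illustrative, hypothetical variable
}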

clincha commented 1 year ago

I found this issue on GitHub. They have the same problem as this one, I think, but no resolution :(

clincha commented 1 year ago

It seems to be working after the changes to the Packer template. I'm not sure which specific change fixed it.

I did change the SCSI controller, which is probably what fixed it:

variable "scsi_controller" {
  type        = string
  default     = "virtio-scsi-single"
  description = "The SCSI controller model to emulate"
}
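For reference, this is roughly how that variable feeds into the Packer builder (a sketch; the source block name and other arguments are assumptions):

source "proxmox-iso" "rhel8" {
  # ...other template settings...
  scsi_controller = var.scsi_controller   # now "virtio-scsi-single" instead of the plugin default
}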