Telmate / terraform-provider-proxmox

Terraform provider plugin for proxmox

Feedback: `disk` schema #986

Closed: Tinyblargon closed this issue 6 days ago

Tinyblargon commented 4 months ago

Over the past month we have received quite a few comments that the new `disks` schema doesn't work well when creating a module. The `disks` schema was created because the old `disk` schema didn't reflect how Proxmox actually handles disks and would show weird Terraform diffs. I have been thinking about bringing back the `disk` schema to better support modules. To better tend to the community's needs, I'm requesting feedback on this new `disk` schema.

The new `disk` schema can be found in: https://github.com/Telmate/terraform-provider-proxmox/tree/new-disk

Tinyblargon commented 4 months ago

https://github.com/Telmate/terraform-provider-proxmox/issues/907 https://github.com/Telmate/terraform-provider-proxmox/issues/969

myplixa commented 4 months ago

First of all, thank you @Tinyblargon and the rest of the team for listening to a community that often uses this provider as part of universal modules.

I will try to build this code and test it next week. I'm a little confused as to what the disk layout looks like now:

  disk {
    slot    = "virtio0"
    size    = 10
    storage = "local-zfs"
    type    = ????
  }

Can you provide an example?

P.S. It would also be nice to add the ability to use the "import_from" parameter, which lets you create a new VM and import a finished disk image from external or internal storage instead of cloning a template. A pull request already proposed such an implementation; I couldn't find its number. In another provider, "bpg", it's implemented like this:

  disk {
    datastore_id = local.datastore_id
    file_id      = proxmox_virtual_environment_download_file.latest_debian_12_bookworm_qcow2_img.id
    interface    = "scsi0"
    discard      = "on"
    cache        = "writeback"
    ssd          = true
  }

Tinyblargon commented 3 months ago

@myplixa In your case it would be `type = "disk"`. Because all the settings for disk and cdrom are in a single item, we have to specify which kind of device the item is creating.
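
So your example would look something like this (untested sketch, using the values from your snippet):

  disk {
    slot    = "virtio0"
    size    = 10
    storage = "local-zfs"
    type    = "disk"
  }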

myplixa commented 3 months ago

@Tinyblargon Thank you very much for answering my question. Unfortunately, I am very busy with work tasks right now and do not have time to build and test the implementation you proposed. I will try to start on it in the next couple of weeks, give feedback on the current schema, and provide examples of working with dynamic disks in modules.

mathieuHa commented 3 months ago

Hi @Tinyblargon, thanks for your work on this provider and the new `disk` proposal.
In v2.x I was also using a module on top of this provider to set default variables and keep configurations lean.

I believe this new schema would be much easier to work with for a Terraform/OpenTofu user writing modules. From what I understand, disks are now configurable at a single level of nesting:

"disk": {
                Type:          schema.TypeList,
                Optional:      true,
                ConflictsWith: []string{"disks"},
                Elem: &schema.Resource{
                    Schema: map[string]*schema.Schema{
                        "asyncio":              schema_DiskAsyncIO(),
                        "backup":               schema_DiskBackup(),
                        "cache":                schema_DiskCache(),
                        "discard":              {Type: schema.TypeBool, Optional: true},
                        "disk_file":            schema_PassthroughFile(schema.Schema{Optional: true}),
                        "emulatessd":           {Type: schema.TypeBool, Optional: true},
                        "format":               schema_DiskFormat(),
                        "id":                   schema_DiskId(),
                        "iops_r_burst":         schema_DiskBandwidthIopsBurst(),
                        "iops_r_burst_length":  schema_DiskBandwidthIopsBurstLength(),
                        "iops_r_concurrent":    schema_DiskBandwidthIopsConcurrent(),
                        "iops_wr_burst":        schema_DiskBandwidthIopsBurst(),
                        "iops_wr_burst_length": schema_DiskBandwidthIopsBurstLength(),
                        "iops_wr_concurrent":   schema_DiskBandwidthIopsConcurrent(),
                        "iothread":             {Type: schema.TypeBool, Optional: true},
                        "iso":                  schema_IsoPath(schema.Schema{}),
                        "linked_disk_id":       schema_LinkedDiskId(),
                        "mbps_r_burst":         schema_DiskBandwidthMBpsBurst(),
                        "mbps_r_concurrent":    schema_DiskBandwidthMBpsConcurrent(),
                        "mbps_wr_burst":        schema_DiskBandwidthMBpsBurst(),
                        "mbps_wr_concurrent":   schema_DiskBandwidthMBpsConcurrent(),
                        "passthrough":          {Type: schema.TypeBool, Default: false, Optional: true},
                        "readonly":             {Type: schema.TypeBool, Optional: true},
                        "replicate":            {Type: schema.TypeBool, Optional: true},
                        "serial":               schema_DiskSerial(),
                        "size":                 schema_DiskSize(schema.Schema{Computed: true, Optional: true}),
                        "slot": {
                            Type:     schema.TypeString,
                            Required: true,
                            ValidateDiagFunc: func(i interface{}, k cty.Path) diag.Diagnostics {
                                v, ok := i.(string)
                                if !ok {
                                    return diag.Errorf(errorString, k)
                                }
                                switch v {
                                case "ide0", "ide1", "ide2",
                                    "sata0", "sata1", "sata2", "sata3", "sata4", "sata5",
                                    "scsi0", "scsi1", "scsi2", "scsi3", "scsi4", "scsi5", "scsi6", "scsi7", "scsi8", "scsi9", "scsi10", "scsi11", "scsi12", "scsi13", "scsi14", "scsi15", "scsi16", "scsi17", "scsi18", "scsi19", "scsi20", "scsi21", "scsi22", "scsi23", "scsi24", "scsi25", "scsi26", "scsi27", "scsi28", "scsi29", "scsi30",
                                    "virtio0", "virtio1", "virtio2", "virtio3", "virtio4", "virtio5", "virtio6", "virtio7", "virtio8", "virtio9", "virtio10", "virtio11", "virtio12", "virtio13", "virtio14", "virtio15":
                                    return nil
                                }
                                return diag.Errorf("slot must be one of 'ide0', 'ide1', 'ide2', 'sata0', 'sata1', 'sata2', 'sata3', 'sata4', 'sata5', 'scsi0', 'scsi1', 'scsi2', 'scsi3', 'scsi4', 'scsi5', 'scsi6', 'scsi7', 'scsi8', 'scsi9', 'scsi10', 'scsi11', 'scsi12', 'scsi13', 'scsi14', 'scsi15', 'scsi16', 'scsi17', 'scsi18', 'scsi19', 'scsi20', 'scsi21', 'scsi22', 'scsi23', 'scsi24', 'scsi25', 'scsi26', 'scsi27', 'scsi28', 'scsi29', 'scsi30', 'virtio0', 'virtio1', 'virtio2', 'virtio3', 'virtio4', 'virtio5', 'virtio6', 'virtio7', 'virtio8', 'virtio9', 'virtio10', 'virtio11', 'virtio12', 'virtio13', 'virtio14', 'virtio15'")
                            },
                        },
                        "storage": schema_DiskStorage(schema.Schema{Optional: true}),
                        "type": {
                            Type:     schema.TypeString,
                            Optional: true,
                            Default:  "disk",
                            ValidateDiagFunc: func(i interface{}, k cty.Path) diag.Diagnostics {
                                v, ok := i.(string)
                                if !ok {
                                    return diag.Errorf(errorString, k)
                                }
                                switch v {
                                case "disk", "cdrom":
                                    return nil
                                }
                                return diag.Errorf("type must be one of 'disk' or 'cdrom'")
                            },
                        },
                        "wwn": schema_DiskWWN(),
                    },
                },
            },

We can then choose the disk bus type (virtio, scsi, ide, ...) with the `slot` attribute.

I believe we could write modules almost like in the old version: module/main.tf

  dynamic "disk" {
    for_each = var.disks
    content {
      slot     = "disk.value.slot"
      size     = disk.value.size
      storage  = disk.value.storage
      emulatessd      = disk.value.emulatessd
      iothread = disk.value.iothread
      discard = disk.value.discard
    }
  }
...

module/variables.tf

variable "disks" {
  type = list(object({ 
    slot       = string    
    size       = string
    storage    = string
    emulatessd = bool
    iothread   = bool
    discard    = bool
  }))
  default = [{
    slot       = "scsi0"
    size       = "10G"
    storage    = "local-lvm"
    emulatessd = true
    iothread   = true
    discard    = true
  }]
}
...

vm1/main.tf

disks = [
  {
    slot     = "scsi0"
    iothread = false
    size     = "20G"
    storage  = "ssd1"
  },
  {
    slot     = "scsi1"
    iothread = false
    size     = "180G"
    storage  = "ssd2"
  }
]

I'm not sure how I could really test the version from the new-disk branch, since using either of the following:

terraform {
  required_providers {
    proxmox = {
      source  = "Telmate/proxmox"
      version = "new-disk"
    }
  }
}
terraform {
  required_providers {
    proxmox = {
      source  = "git::https://github.com/Telmate/terraform-provider-proxmox.git?ref=new-disk"
    }
  }
}

returns warnings like:

This string does not use correct version constraint syntax. The "source" attribute must be in the format "[hostname/][namespace/]name"

However, to sum up, I really like the schema proposed for the disks.

gbrackley commented 3 months ago

@Tinyblargon Thank you for putting forward a new design; and thank you to @mathieuHa for decomposing what has been proposed.

I had a look at the branch and reverse engineered what has been proposed/implemented. I didn't find any documentation/information that describes the design/strategy. My best concise guess is:

The new design continues to provide a nested, typed data structure (`disks`) for direct use of the provider, while at the same time providing a duck-typed `disk` array that allows disks to be defined using a common type, which is translated at runtime to the original nested typed data structure. The compatibility `disk` array can be used by module authors.
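
A rough sketch of the two shapes as I understand them (attribute names guessed from the schema above and the existing `disks` docs, not verified against the branch):

# Nested, strongly typed `disks` structure (current 3.x style)
disks {
  scsi {
    scsi0 {
      disk {
        storage = "local-zfs"
        size    = 10
      }
    }
  }
}

# Flat, duck-typed `disk` list proposed in the new-disk branch
disk {
  slot    = "scsi0"
  type    = "disk"
  storage = "local-zfs"
  size    = 10
}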

I haven't compiled and tested the branch, sorry.

IMHO

The compatibility layer would make the provider functionally usable (by me). I could and would live with it, but it seems to be a very small evolution.

The more I reflect on this provider, the more I think it makes sense to break the various resources up into child Terraform resources. Those child resources can be strongly typed, specific to their purpose, easy to document, and easy to understand due to those constraints (like the new nested `disks` structure). This takes the provider from being largely a single resource (e.g. the QEMU VM) to a small collection of resources that achieve an outcome. This might mean a little more boilerplate coding. Each disk could be a first-class Terraform resource, and the disks in the main VM resource could then reference those disks by name (see the hypothetical sketch below).

The main QEMU VM resource is IMO getting too big/complex. I have quite a constrained use case deploying 'cattle' VMs (typically Flatcar). I typically delete VMs before reprovisioning them from scratch. Sometimes I will hack a bit of storage (i.e. a disk) into the VM that is outside of the proxmox lifecycle (I have a single host, and understand this stops the machine being migrated to another host). If we could get this provider and proxmox to the point that disks could have a lifetime that is longer than the VM then that would be great (but outside the scope of this feedback).
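
A purely hypothetical sketch of that split (neither the `proxmox_disk` resource nor the `volume` attribute exists today; the names are made up for illustration):

# Hypothetical: a disk managed independently of the VM lifecycle
resource "proxmox_disk" "data" {
  storage = "local-zfs"
  size    = "100G"
}

resource "proxmox_vm_qemu" "vm" {
  # ... other VM settings ...

  disk {
    slot   = "scsi1"
    # made-up attribute: attach the independently managed disk by reference
    volume = proxmox_disk.data.id
  }
}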

gbrackley commented 3 months ago

Another item I forgot to mention was pulling the mbps_* and iops_* fields out of the disk structure into a child (data?) resource. For the sake of DRY (don't repeat yourself), a single rate limiting resource could then be used on multiple disks. Having such a big list of disk attributes just makes using the resource harder.
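
For illustration only, something like the following; the data source and the `bandwidth` attribute are entirely hypothetical:

# Hypothetical local-only data source holding a reusable bandwidth profile
data "proxmox_disk_bandwidth" "standard" {
  mbps_r_burst  = 300
  mbps_wr_burst = 300
  iops_r_burst  = 5000
  iops_wr_burst = 5000
}

disk {
  slot      = "scsi0"
  storage   = "local-zfs"
  size      = 10
  # made-up attribute referencing the shared profile instead of repeating every field
  bandwidth = data.proxmox_disk_bandwidth.standard.id
}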

Tinyblargon commented 3 months ago

@gbrackley What you're referring to is what Terraform calls a local-only data source. This could be implemented in a non-breaking way with the current schema. We can look into it in the future, since it would require a lot of extra work.

maksimsamt commented 3 months ago

Looks like this could help - https://github.com/Telmate/terraform-provider-proxmox/issues/907#issuecomment-2138076259 , https://github.com/Telmate/terraform-provider-proxmox/issues/907#issuecomment-2162765751

h2odriving commented 2 months ago

@Tinyblargon thanks for the work on the provider! I tested the proposed schema in a small environment (@mathieuHa thanks for decomposing the new schema, that helped a lot) and it handled the disk configuration pretty well.

It would be great if those changes make it into the 3.0.1 release. I'm aware that there is another proposal for how to solve this, mentioned by @maksimsamt, but in my opinion the `disk` schema here would allow cleaner module definitions without having to define everything.

One topic missing from the new schema definition is the cloud-init disk. It would be great if the cloud-init disk were included in the new `disk` schema so it can be defined in a similar way.

The most important part for me is that there is a final disk schema which then hopefully won't change again in the near future. At the moment we are waiting for the final decision on the disk schema before we start refactoring all of our Terraform code that is still running on the old 2.9.14 version.

This means that if the decision is to go with the solution by @maksimsamt, that would be fine as well, but it would mean a lot of refactoring of our Terraform modules for us.

Looking forward to the final decision.

Tinyblargon commented 2 months ago

@h2odriving Currently I'm working on https://github.com/Telmate/proxmox-api-go/pull/339. Once that has concluded, I'm going to start working on the `disk` implementation.

maksimsamt commented 2 months ago

> The most important part for me is that there is a final disk schema which then hopefully won't change again in the near future.

Totally agree with @h2odriving; this is a very important point for deciding whether to use the current version or not.

@Tinyblargon, will the current disk schema (3.0.1-rc3) remain the same, or will there be changes in the next releases?

Tinyblargon commented 2 months ago

@maksimsamt After the disk schema has been implemented, I'm not planning on changing it anymore, as Terraform, the Proxmox SDK, and Proxmox Virtual Environment will now treat disks in the same way. The implementation will slightly differ from what was proposed here due to the cloud-init drive being merged into this schema. There is a feature request for supporting iSCSI passthrough with the disks, but that should be compatible with the current schema. Also, this iSCSI feature is in the backlog and might never be implemented.

Tinyblargon commented 2 months ago

Also, as we do not want a situation like this ever again, most new features will get their own release candidates before being officially released.

maksimsamt commented 2 months ago

@Tinyblargon,

This is a little unclear:

> The implementation will slightly differ from what was proposed here due to the cloud-init drive being merged into this schema.

Tinyblargon commented 2 months ago

@maksimsamt The cloud-init disk will be merged into this schema, which has already happened for the `disks` schema in master. I'll update the draft in the coming days; it will only be a minor change.
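
Roughly, the cloud-init drive would then just be another item in the list, something like this (exact attribute names may still change):

disk {
  slot    = "ide2"
  type    = "cloudinit"
  storage = "local-zfs"
}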

hestiahacker commented 2 months ago

@Tinyblargon I appreciate you taking feedback from the community before making an official release and taking care to make sure we don't change schemas multiple times in official releases.

Back when 3.0.1-rc1 was released, I updated all my modules to use the disks schema and they've been working fine. Normally I wouldn't update to a release candidate, but there are critical fixes that are not in any release, so I had to switch to either a release candidate or the HEAD of the default branch, which would be even more of a moving target.

I think the comments from people saying that they can't update their modules to use the disks schema are now outdated since maksimsamt has demonstrated that it is possible, even with complex, dynamic configurations, and has provided multiple implementations.

It looks like these examples accommodate the comments from people who were having trouble with the new schema. We had one person confirm it works for them, and nobody has said the example modules do not work for them in terraform.

So I hope that the provider doesn't revert back to the old, reportedly unreliable disk schema or change to some new disk schema. In the case of either of those changes, it would require everyone who updated their modules and VMs to update them again. And it's not just me who updated their modules to use disks over the past few months.

People have already made strong arguments for the `disks` schema better matching the actual implementation in QEMU and for it being more flexible than the `disk` schema.

Tinyblargon commented 2 months ago

@hestiahacker The new `disks` schema is here to stay. However, we can have `disk` as well; `disks` and `disk` will be mutually exclusive. `disks` is pretty much done, and `disk` has 60+% of its implementation done. What is a bigger concern for me now is the number of other things that don't behave as they should. For these remaining issues, there are only 3 options:

  1. Postpone the 3.0 release. This might take a year, so that's a bad idea.
  2. Remove features from 3.0 and slowly introduce them back in feature releases.
  3. Only do minimal updates on 3.0 and start with 4.0

I want to deliver a stable Terraform provider, but so many things need to be reworked.

hestiahacker commented 2 months ago

Thanks for the clarification and I'm glad to hear disks is going to stay! :heart:

Yeah, those options are tough. :slightly_frowning_face: Unless option 2 is a lot less work than I imagine it would be, I think option 3 looks like the least bad option.

maksimsamt commented 2 months ago

vote for:

  3. Only do minimal updates on 3.0 and start with 4.0

Tinyblargon commented 2 months ago

@hestiahacker @maksimsamt

It's all of LXC, and about half of QEMU: serial, network adapters, USB, HA, PCIe.

The difficult part is going to be keeping everything backwards compatible with Terraform as things change in the upstream project we use to connect to Proxmox. Current behaviors don't have tests, so this project is going to feel some of the repercussions. It's probably easiest to give every update after 3.0 a release candidate to let the community help ensure compatibility before we break things.

It's probably going to be option 3.

Sorry this post is a bit of a rant.

arsensimonyanpicsart commented 1 month ago

Guys, how are you testing the new-disk branch? Can you please provide a Terraform provider configuration?

terraform {
  required_providers {
    proxmox = {
      source  = "Telmate/proxmox"
      version = "new-disk"
    }
  }
}
terraform {
  required_providers {
    proxmox = {
      source  = "git::https://github.com/Telmate/terraform-provider-proxmox.git?ref=new-disk"
    }
  }
}

I tried this, but it did not work.

trfore commented 1 month ago

@arsensimonyanpicsart, you will need to have Go installed. Note: I decided to call this version 3.0.1-newdisk; this is not the official name.

# clone the repo
git clone https://github.com/Telmate/terraform-provider-proxmox
cd terraform-provider-proxmox
# change the branch
git checkout new-disk

# build the binary
make

Move the binary into Terraform's plugins directory; the following directions are for Unix. If you are using Windows, the plugins path is %APPDATA%\terraform.d\plugins.

# move the binary into the terraform plugins directory
VERSION='3.0.1-newdisk'

mkdir -p ~/.terraform.d/plugins/registry.terraform.io/telmate/proxmox/"${VERSION}"/linux_amd64/

cp bin/terraform-provider-proxmox ~/.terraform.d/plugins/registry.terraform.io/telmate/proxmox/"${VERSION}"/linux_amd64/terraform-provider-proxmox_v"${VERSION}"

Change your Terraform file:

terraform {
  required_providers {
    proxmox = {
      source  = "Telmate/proxmox"
      version = "3.0.1-newdisk"
    }
  }
}

This will create a symlink in ~/.terraform.d/plugin-cache/registry.terraform.io/telmate/proxmox

➜ tree ~/.terraform.d/plugin-cache/registry.terraform.io/telmate/proxmox/
/home/$USER/.terraform.d/plugin-cache/registry.terraform.io/telmate/proxmox/
...
├── 3.0.1-newdisk
│   └── linux_amd64 -> /home/$USER/.terraform.d/plugins/registry.terraform.io/telmate/proxmox/3.0.1-newdisk/linux_amd64
...
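
Alternatively, a Terraform CLI dev override pointing at the locally built binary should also work (untested with this branch; adjust the path to wherever make placed the binary):

# ~/.terraformrc
provider_installation {
  dev_overrides {
    "registry.terraform.io/telmate/proxmox" = "/path/to/terraform-provider-proxmox/bin"
  }

  # install all other providers as normal
  direct {}
}
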
arsensimonyanpicsart commented 1 month ago

Thanks @trfore, I was able to test it.