This key could probably be added here: https://github.com/lxc/terraform-provider-incus/blob/main/internal/storage/resource_storage_pool.go#L337-L372
Isn't that the list of computed pool keys as opposed to inherited keys on the volumes and buckets?
> Isn't that the list of computed pool keys as opposed to inherited keys on the volumes and buckets?
In other words, you first created the pool without Terraform/OpenTofu and then imported it with Terraform/OpenTofu?
How can I reproduce your setup to get the same issue?
incus storage pool create foo zfs volume.zfs.remove_snapshots=true
And then have Terraform create a new storage volume using that foo storage pool.
Does this same behavior happen if the pool is created with a TF resource?
Good question, though not really relevant, as the case I was dealing with was one where Incus is acting as a cloud, so the TF user doesn't have any control over the storage pools; they get an empty project and can only create project-tied resources.
stgraber@castiana:~/demo/test$ tofu apply
OpenTofu used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
OpenTofu will perform the following actions:
# incus_storage_pool.test will be created
+ resource "incus_storage_pool" "test" {
+ config = {
+ "volume.zfs.remove_snapshots" = "true"
}
+ driver = "zfs"
+ name = "test"
}
# incus_volume.test will be created
+ resource "incus_volume" "test" {
+ config = {
+ "size" = "20GiB"
}
+ content_type = "block"
+ location = (known after apply)
+ name = "test"
+ pool = "test"
+ target = (known after apply)
+ type = "custom"
}
Plan: 2 to add, 0 to change, 0 to destroy.
Do you want to perform these actions?
OpenTofu will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
incus_storage_pool.test: Creating...
incus_storage_pool.test: Creation complete after 1s [name=test]
incus_volume.test: Creating...
╷
│ Error: Provider produced inconsistent result after apply
│
│ When applying changes to incus_volume.test, provider "provider[\"registry.opentofu.org/lxc/incus\"]" produced an unexpected new value: .config: new element "zfs.remove_snapshots" has appeared.
│
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.
╵
stgraber@castiana:~/demo/test$ cat main.tf
terraform {
required_providers {
incus = {
source = "lxc/incus"
version = "0.1.1"
}
}
}
resource "incus_storage_pool" "test" {
name = "test"
driver = "zfs"
config = {
"volume.zfs.remove_snapshots" = "true"
}
}
resource "incus_volume" "test" {
name = "test"
pool = incus_storage_pool.test.name
content_type = "block"
config = {
"size" = "20GiB"
}
}
stgraber@castiana:~/demo/test$
So yeah, even if created by TF, it still happens.
Yeah, I suspected that was the case either way. Inherited config keys should be ignored completely since they technically come from another resource. Can they be excluded or identified in the API?
> where Incus is acting as a cloud, so the TF user doesn't have any control over the storage pools; they get an empty project and can only create project-tied resources.
If we ever need to represent unmanaged resources, I think the idiomatic solution is to model Data Sources that can still be represented on the TF side.
The provider can do GetStoragePool, then look at the Config map for anything that's volume.XYZ; for any such key, XYZ should be treated as inherited in the volume or bucket if its value matches that of the pool's volume.XYZ.
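A minimal sketch of that check, assuming the pool and volume config maps have already been fetched through the Incus client (e.g. via GetStoragePool for the pool side); the helper name and its placement are hypothetical, not the provider's actual code:

```go
package storage

import "strings"

// inheritedVolumeKeys returns the volume/bucket config keys whose values look
// inherited from the parent pool: for every pool key "volume.XYZ", the key
// "XYZ" is treated as inherited when the volume carries the same value.
func inheritedVolumeKeys(poolConfig map[string]string, volumeConfig map[string]string) map[string]bool {
	inherited := make(map[string]bool)

	for poolKey, poolValue := range poolConfig {
		// Only "volume."-prefixed pool keys propagate to volumes and buckets.
		volKey, found := strings.CutPrefix(poolKey, "volume.")
		if !found {
			continue
		}

		// Inherited only if the value matches the pool's default; a different
		// value means it was set explicitly on the volume itself.
		if value, ok := volumeConfig[volKey]; ok && value == poolValue {
			inherited[volKey] = true
		}
	}

	return inherited
}
```

Keys in that set could then be dropped from the diff, which would avoid maintaining a hard-coded per-driver list.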
@stgraber Regarding the storage pool volume, this PR fixes the problem that the volume takes into account the inherited configuration value: https://github.com/lxc/terraform-provider-incus/pull/66
Regarding the creation of storage pools, however, I am not sure whether a correction is really necessary here:
Test Case
Create the storage pool without Terraform, e.g.:
$ incus storage create test zfs volume.zfs.remove_snapshots=true
Define the resource in Terraform and import it:
resource "incus_storage_pool" "test" {
name = "test"
project = "default"
driver = "zfs"
}
$ terraform import incus_storage_pool.test default/test
incus_storage_pool.test: Importing from ID "default/test"...
incus_storage_pool.test: Import prepared!
Prepared incus_storage_pool for import
incus_storage_pool.test: Refreshing state... [name=test]
Import successful!
The resources that were imported are shown above. These resources are now in
your Terraform state and will henceforth be managed by Terraform.
$ terraform plan
incus_storage_pool.test: Refreshing state... [name=test]
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the
following symbols:
~ update in-place
Terraform will perform the following actions:
# incus_storage_pool.test will be updated in-place
~ resource "incus_storage_pool" "test" {
~ config = {
- "volume.zfs.remove_snapshots" = "true" -> null
}
name = "test"
# (3 unchanged attributes hidden)
}
Plan: 0 to add, 1 to change, 0 to destroy.
So after creating the pool and importing it, the configuration was imported correctly, as we can see in the state file:
$ cat terraform.tfstate
{
"version": 4,
"terraform_version": "1.8.3",
"serial": 62,
"lineage": "863f8d8e-b4f8-b4cc-3bbe-5a247ae7e02e",
"outputs": {},
"resources": [
{
"mode": "managed",
"type": "incus_storage_pool",
"name": "test",
"provider": "provider[\"registry.terraform.io/lxc/incus\"]",
"instances": [
{
"schema_version": 0,
"attributes": {
"config": {
"volume.zfs.remove_snapshots": "true"
},
"description": "",
"driver": "zfs",
"name": "test",
"project": "default",
"remote": null,
"target": null
},
"sensitive_attributes": []
}
]
}
],
"check_results": null
}
But when we run terraform plan again, the local configuration is different from the remote state. This means that the resource definition is missing the corresponding configuration entry, and Terraform therefore wants to force a state change. To rectify this, you must add the corresponding configuration entry to the resource definition:
resource "incus_storage_pool" "test" {
name = "test"
project = "default"
driver = "zfs"
config = {
"volume.zfs.remove_snapshots" = "true"
}
}
$ terraform plan
incus_storage_pool.test: Refreshing state... [name=test]
No changes. Your infrastructure matches the configuration.
Terraform has compared your real infrastructure against your configuration and found no differences, so no changes are
needed.
So we could argue that this behavior is correct if you want to use the incus_storage_pool resource. If you want to ignore this completely, then we should rather introduce a data source for incus_storage_pool, like we did for incus_profile, that allows you to read the state but not change anything. How do you see this?
The behavior for the storage pool is fine; the problem is really with storage pool keys propagating to volumes. I don't think just piling up "computed" keys is correct here.
One can create a volume and specifically set, say, block.filesystem=ext4 at creation time.
But someone can similarly create a volume without specifying any block.filesystem, and because the pool the volume is created in has volume.block.filesystem set to xfs, the resulting volume will now have Incus set block.filesystem=xfs unless another value is specified.
This is true of ALL volume keys, possibly including arbitrary user ones (not tested), so a storage pool having volume.user.foo=abc should result in a volume being created on that pool getting user.foo=abc.
So if going with marking keys computed, every single storage volume config key would need to be treated as computed, which doesn't seem right.
Your example with block.filesystem is currently handled as a computed key: https://github.com/lxc/terraform-provider-incus/blob/main/internal/storage/resource_storage_volume.go#L382, which we inherited when we forked the provider. Or am I wrong?
Bad example, but we have 20+ other keys which aren't covered by the hardcoded list; as mentioned, it's effectively every single volume config key, which is why treating them all as computed kinda feels wrong.
Just looking at one storage driver (ZFS):
I see, and funnily enough we have already implemented many of those example keys in the computed keys for the storage pool, based on the driver: https://github.com/lxc/terraform-provider-incus/blob/d570e743a8285b22bbe5779844960a7254c89de1/internal/storage/resource_storage_pool.go#L337
So if I understand you correctly, you want to avoid hard-coding these keys and instead do a lookup against Incus first to determine which keys are already set, and ignore them? Don't you see an issue that we might end up in a situation where the storage volume resource's state can no longer be trusted, because we accept whatever we get from Incus instead of "forcing" the user to be very explicit about their intentions?
I don't know how flexible the terraform plugin logic is for that, but ideally, we'd make it ignore zfs.use_refquota being set to true automatically if volume.zfs.use_refquota is set to true on the storage pool, as that happens internally and automatically and so isn't a change that should get reverted by Terraform.
An example would be:
1) Pool has volume.zfs.use_refquota=true set, user asks to create a new volume and only specifies size=50GiB. That should create the volume, which will then have Incus automatically set zfs.use_refquota=true. Running Terraform again should have it consider everything to be clean.
2) Pool doesn't have volume.zfs.use_refquota set at all, user asks to create a new volume and only specifies size=50GiB. That should still all be clean. Then the user manually sets zfs.use_refquota=true on the volume. Terraform should pick that up as a manual change that needs to be reverted, as the pool doesn't have volume.zfs.use_refquota=true set on it and the user didn't ask for the volume to have zfs.use_refquota set.
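A minimal, self-contained sketch of that classification, assuming the provider compares the remote volume config against both the user's planned config and the pool's volume.* defaults during refresh; the classify helper and all names here are hypothetical, not the provider's or PR #68's actual code:

```go
package main

import "fmt"

// classify decides, for every key in the remote (Incus-reported) volume config,
// whether it should be ignored as inherited from the pool or flagged as drift.
// A key is inherited when the user did not set it and its value matches the
// pool's "volume."-prefixed default; anything else the user didn't ask for is drift.
func classify(poolConfig, planned, remote map[string]string) (ignored, drift map[string]string) {
	ignored = map[string]string{}
	drift = map[string]string{}

	for key, value := range remote {
		if _, set := planned[key]; set {
			continue // explicitly managed by the user, handled by the normal diff
		}
		if poolConfig["volume."+key] == value {
			ignored[key] = value // inherited pool default, not a change to revert
		} else {
			drift[key] = value // manual change on the volume, should be reported
		}
	}
	return ignored, drift
}

func main() {
	planned := map[string]string{"size": "50GiB"}

	// Scenario 1: pool sets volume.zfs.use_refquota=true, Incus propagates it.
	pool1 := map[string]string{"volume.zfs.use_refquota": "true"}
	remote1 := map[string]string{"size": "50GiB", "zfs.use_refquota": "true"}
	ignored, drift := classify(pool1, planned, remote1)
	fmt.Println("scenario 1: ignored =", ignored, "drift =", drift) // everything clean

	// Scenario 2: pool sets nothing, the key was set manually on the volume.
	pool2 := map[string]string{}
	remote2 := map[string]string{"size": "50GiB", "zfs.use_refquota": "true"}
	ignored, drift = classify(pool2, planned, remote2)
	fmt.Println("scenario 2: ignored =", ignored, "drift =", drift) // drift to revert
}
```

Whether the terraform plugin logic gives the provider enough flexibility to apply something like this during refresh is exactly the open question above; the sketch only illustrates the intended outcome for the two scenarios.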
@stgraber Please have a look at https://github.com/lxc/terraform-provider-incus/pull/68 and I would be grateful if you could test whether it works as you have described.
I've noticed OpenTofu getting pretty confused about a volume.zfs.remove_snapshots=true config key appearing on a volume that's defined in Terraform. That's because the config key was set on the parent storage pool and so propagated to the new volume. That's normal and we definitely don't want Terraform to try to rectify this by deleting and re-creating the volume as it attempted here :)