Closed by TGM 1 year ago
Thanks for the issue, good point, I forgot to check all inter-resource relationships.
Actually, it's possible to tie a datastore/host to a cluster in two ways, and the way it's implemented for now is problematic.
When you attach a host to a cluster via the `cluster_id` field like you did above, the `hosts` section of the cluster is still empty in the TF file, but when the provider reads the cluster data on the cloud side, it sees that there are some hosts tied to it. It then sees a diff on the cluster resource and believes it has to remove the hosts from the cluster.
We need to remove the direct dependencies between the two resources.
Adding `Computed` behavior (in addition to `Optional`) on the `hosts` section of the `cluster` solves the above case, but it doesn't solve this one; there is still a diff after the initial apply:
```hcl
resource "opennebula_cluster" "cluster" {
  name = "cluster1"
  hosts = [
    opennebula_host.host1.id,
  ]
}

resource "opennebula_host" "host1" {
  name = "host1"
  type = "custom"
  custom {
    virtualization = "dummy"
    information    = "dummy"
  }
}

resource "opennebula_host" "host2" {
  name       = "host2"
  type       = "custom"
  cluster_id = opennebula_cluster.cluster.id
  custom {
    virtualization = "dummy"
    information    = "dummy"
  }
}
```
This could be solved by reading only the configured elements of the `hosts` section of the `cluster` resource, but with this solution we won't be able to import the hosts and datastores of the resource anymore. In addition, this adds problems when updating the `hosts` field of the `cluster` resource.
Some other ideas: removing `cluster_id` from the `host` resource, or removing the `hosts` section from the `cluster` resource.
We had the same problem with `user` and `group` membership in #191, and it was solved in a similar way: adding `cluster_host` and `cluster_datastore` resources to extract the `cluster` resource dependencies, and deprecating the `hosts` and `datastores` sections. However, we still have the diff with the TF shown above.
Another thing that is not consistent, and may be confusing across the provider's resources, is the way we delete a group vs. the way we delete a cluster:
- deleting a `group` will first remove each `users` member
- deleting a `cluster` will fail because it won't remove each `hosts` and `datastores` member

However, this should be tracked in another issue... I'll probably open one or two issues (on `group` and `users`) after that, to get consistent membership management behavior across resources.
The proposed solution does not allow you to associate existing resources to the cluster.
You still have the `cluster_id` field in the datastore and host resources to manage membership.
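For example, a datastore's membership can be declared from the datastore side via `cluster_id` (a sketch; the resource names and attribute values are illustrative):

```hcl
# Illustrative sketch: membership declared only on the member side.
resource "opennebula_datastore" "ds1" {
  name       = "ds1"
  type       = "system"
  cluster_id = opennebula_cluster.cluster.id

  custom {
    transfer = "ssh"
  }
}
```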
If we keep two distinct places to manage cluster membership, we'll still have some diffs when both are used at the same time, as I said, with:
```hcl
resource "opennebula_cluster" "cluster" {
  name = "cluster1"
  hosts = [
    opennebula_host.host1.id,
  ]
}

resource "opennebula_host" "host1" {
  name = "host1"
  type = "custom"
  custom {
    virtualization = "dummy"
    information    = "dummy"
  }
}

resource "opennebula_host" "host2" {
  name       = "host2"
  type       = "custom"
  cluster_id = opennebula_cluster.cluster.id
  custom {
    virtualization = "dummy"
    information    = "dummy"
  }
}
```
To retrieve the list of cluster members, I propose to modify the cluster fields `hosts` and `datastores` from `Optional` to `Computed`; this would allow knowing who is a member of the cluster from the cluster resource (but without being able to modify membership from this place).
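With `hosts` and `datastores` marked `Computed`, membership would be read-only from the cluster side; a sketch of how it could then be consumed (attribute names as in the examples above, the output name is illustrative):

```hcl
# Read-only view of membership once hosts/datastores are Computed:
# the provider populates these lists from OpenNebula, and they can be
# referenced elsewhere, but not set here.
output "cluster_members" {
  value = {
    hosts      = opennebula_cluster.cluster.hosts
    datastores = opennebula_cluster.cluster.datastores
  }
}
```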
We could instead manage membership from the cluster resource, but I did it this way because `group` and `users` had a similar membership problem. I had to make a choice; I knew these arguments, but it's not a big deal to manage membership from the cluster resource, so if everyone thinks it's better, I'm open to doing it like that. If you have another idea, feel free to drop another comment; I'm open to discussing and rewriting this PR.
The proposed solution is good, but we also need a way to associate existing resources. For example, associating the default DS with a new cluster.
Ok, so membership management would be less exclusive to the cluster; PR updated.
Seems legit. Can you publish it as an RC release or something so we can test it in pre-prod?
The `1.1.1-rc1` will be released by the end of the week.
Partially fixed: new datastores are allocated to the default cluster as well.
```hcl
resource "opennebula_cluster" "cluster" {
  name       = "${var.cluster_name}-${var.cluster_env}-${var.cluster_cpu_vendor}"
  hosts      = [for host in opennebula_host.host : host.id]
  datastores = [for datastore in opennebula_datastore.datastore : datastore.id]
  # virtual_networks = [145]
}

resource "opennebula_datastore" "datastore" {
  count = var.cluster_datastores
  name  = "${var.cluster_name}-${var.cluster_env}-${var.cluster_cpu_vendor}-ds${count.index + 1}"
  type  = "system"
  custom {
    transfer = "ssh"
  }
}

resource "opennebula_host" "host" {
  for_each = toset(var.cluster_hosts)
  name     = each.key
  type     = "kvm"
}
```
In the current way of managing cluster membership, which we just modified for the last RC release of the provider:
A new datastore is added to the default cluster due to these points:
To fix this we could:
A few ideas to consider:
Considering your first idea would mean that we use `cluster_id` in the `datastore` to specify a cluster other than the default. It's like returning to the initial problems of this issue and rolling back our last changes.
We could go this way, i.e. manage the dependency from the datastore/host side via the `cluster_id` field (and then deprecate the `hosts` and `datastores` sections of the cluster resource).
Then we need to consider your comment from when I previously proposed this solution, to see how we could solve it another way: the answer could be that you need to import the default resources, and then manage their membership via the `cluster_id` attribute.
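A sketch of that workflow, assuming the default system datastore has ID 0 (as in a stock OpenNebula install; the resource name and attribute values are illustrative):

```hcl
# After importing the existing datastore into state, e.g.:
#   terraform import opennebula_datastore.default_system 0
# its cluster membership is then driven by cluster_id:
resource "opennebula_datastore" "default_system" {
  name       = "system"
  type       = "system"
  cluster_id = opennebula_cluster.cluster.id

  custom {
    transfer = "ssh"
  }
}
```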
It's still possible to roll back these changes, as the provider hasn't been released with the related attribute changes...
Keep the current RC1 changes; they are way better than the previous version. We could go this way in order to solve the problem without any further changes:
And this might be the best idea yet. That way we don't lose our currently gained flexibility, and resources are still managed individually.
Update: I just realized that updating the default cluster won't work without importing the resource...
It's interesting behavior that, when hosts are created, they are associated correctly only with the new cluster. I wonder why...
Related?

```
opennebula_cluster.cluster: Destroying... [id=174]
╷
│ Error: Failed to delete
│
│ cluster (ID: 174): OpenNebula error [ACTION]: [one.cluster.delete] Cannot delete cluster. Cluster 174 is not empty, it contains 2 hosts.
╵
```
> Related?
>
> ```
> opennebula_cluster.cluster: Destroying... [id=174]
> ╷
> │ Error: Failed to delete
> │
> │ cluster (ID: 174): OpenNebula error [ACTION]: [one.cluster.delete] Cannot delete cluster. Cluster 174 is not empty, it contains 2 hosts.
> ╵
> ```
You can't delete a cluster with hosts/datastores inside; it's an OpenNebula constraint, and the provider doesn't try to add behavior to empty the cluster of its members before deleting it.
> It's interesting behavior that, when hosts are created, they are associated correctly only with the new cluster. I wonder why...
As I said in this previous comment, a host is only in one cluster at a time (a datastore can be in several clusters at the same time), so when you add it to a new cluster, OpenNebula automatically removes it from the other. This is not related to the provider.
> Related?
>
> ```
> opennebula_cluster.cluster: Destroying... [id=174]
> ╷
> │ Error: Failed to delete
> │
> │ cluster (ID: 174): OpenNebula error [ACTION]: [one.cluster.delete] Cannot delete cluster. Cluster 174 is not empty, it contains 2 hosts.
> ╵
> ```
>
> You can't delete a cluster with hosts/datastores inside; it's an OpenNebula constraint, and the provider doesn't try to add behavior to empty the cluster of its members before deleting it.
In that case, you can't use `terraform destroy`. It seems to ignore `depends_on` as well. Any ideas?
Unfortunately, it seems that Terraform needs a bit of help with the dependencies here; it tries to delete the cluster first, but this doesn't follow OpenNebula's rules. Terraform is aware of dependencies, so using `depends_on` you'll probably create a dependency cycle.
With the first idea of managing dependencies from the host/datastore side, I think Terraform would try to delete the hosts and datastores first.
So, for now, you need to do it in two applies: first detach the hosts and datastores in one apply, then destroy the cluster in a second apply.
Otherwise, it's a matter of how we want to design the provider: we could add some optional behavior to the cluster resource to remove the dependencies before deleting the cluster in a single step.
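The two-apply workaround could look roughly like this (a sketch; resource names follow the examples earlier in this thread):

```hcl
# Apply 1: keep the cluster but detach all members by emptying the lists.
resource "opennebula_cluster" "cluster" {
  name       = "cluster1"
  hosts      = []
  datastores = []
}

# Apply 2: remove the opennebula_cluster block entirely (or run
# `terraform destroy -target=opennebula_cluster.cluster`) now that
# the cluster is empty and OpenNebula allows its deletion.
```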
- Keep the `cluster_id` attributes in `opennebula_host`, `opennebula_datastore` and `opennebula_virtual_network`
- Deprecate `opennebula_cluster` / `datastores`
- Deprecate `opennebula_cluster` / `hosts`
- Deprecate `opennebula_cluster` / `virtual_network`
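Under that deprecation plan, membership would live entirely on the member resources; a minimal sketch (resource names and attribute values are illustrative):

```hcl
# Sketch: with the cluster-side lists deprecated, only cluster_id remains.
resource "opennebula_cluster" "cluster" {
  name = "cluster1"
}

resource "opennebula_host" "host1" {
  name       = "host1"
  type       = "kvm"
  cluster_id = opennebula_cluster.cluster.id
}

resource "opennebula_datastore" "ds1" {
  name       = "ds1"
  type       = "system"
  cluster_id = opennebula_cluster.cluster.id

  custom {
    transfer = "ssh"
  }
}
```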
OpenNebula version: 6.4.0
terraform-provider-opennebula: 1.1.0
On the second run, the members of the `opennebula_cluster` are removed from the cluster.
Plan:
Variables:
Run 1:
Run 2:
Expected behavior: No change is expected.