rancher / terraform-provider-rancher2

Terraform Rancher2 provider
https://www.terraform.io/docs/providers/rancher2/
Mozilla Public License 2.0

[BUG] Downstream cluster not deploying when using subfolders in vSphere #1415

Open Suschio opened 2 weeks ago

Suschio commented 2 weeks ago

Rancher Server Setup

Information about the Cluster

User Information

Provider Information

Describe the bug

If we use a subfolder structure for vsphere_folder (DC/vm/kubernetes/testkubernetes), the downstream cluster gets stuck in a loop between

"Waiting for all etcd machines to be deleted"
"Waiting for init node"

Looking at the config in the Rancher UI, we see the error "pool1: The provided value for folder was not found in the list of expected values. This can happen with clusters provisioned outside of Rancher or when options for the provider have changed." But the whole configuration is correct, and the folder category has an entry with the correct path.

If we create the same cluster in the Rancher UI and choose the path there, it succeeds without problems. If we create the same cluster with Terraform but with only one folder level (DC/vm/kubernetes), it works too.

The provisioning pod only shows this error: error loading host test-xxx Docker machine "test-xxx" does not exist. Use "docker-machine ls" to list machines. Use "docker-machine create" to add a new one.

To Reproduce

Use the Rancher2 provider with the resources rancher2_machine_config_v2 (vsphere_config with folder = "DC/vm/folder/somesubfolder") and rancher2_cluster_v2.
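For readers who want to reproduce this, the machine config described above might look roughly like the sketch below. It is a minimal, untested fragment: the resource label "repro", the generate_name value, and the datacenter value are placeholders that do not come from this report, and every other vSphere setting (credentials, network, datastore, template, and so on) is omitted.

```hcl
# Minimal reproduction sketch. All names and values are placeholders.
resource "rancher2_machine_config_v2" "repro" {
  generate_name = "test-pool1" # hypothetical name prefix

  vsphere_config {
    datacenter = "DC" # placeholder datacenter name

    # A single folder level ("DC/vm/folder") is reported to work;
    # adding a subfolder like this is what triggers the problem.
    folder = "DC/vm/folder/somesubfolder"
  }
}
```

The rancher2_cluster_v2 resource would then reference this machine config from one of its machine pools.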

Actual Result

The cluster does not deploy and is stuck in a loop between "Waiting for all etcd machines to be deleted" and "Waiting for init node",

with the config error

"pool1: The provided value for folder was not found in the list of expected values. This can happen with clusters provisioned outside of Rancher or when options for the provider have changed."

Expected Result

We expect a working downstream cluster that is deployed with Terraform and can use subfolders in vSphere.

niklas-letz commented 1 week ago

Same problem! Help would be very much appreciated! Similar issue: [BUG] hostsystem error for cluster provisioned with vsphere provider #10460

Suschio commented 1 week ago

After some research we found that if you pass /DC/vm/folder as the folder variable in the rancher2_machine_config_v2 resource, the VM gets created without issues. But if you add a subfolder, like /DC/vm/folder/folder, we get an error in vmwarevspheremachines.rke-machine.cattle.io:

folder: /DC-xxxx/vm/xxxxxx/xxxxxx
  - message: |-
      failed creating server [fleet-default/xxx-test-cluster-master-xxxx-xxxx] of kind (VmwarevsphereMachine) for machine xxxx-test-cluster-master-xxxxx-xxx in infrastructure provider: CreateError: Running pre-create checks...
      (xxxx-test-cluster-master-xxxx-xxxx) Connecting to vSphere for pre-create checks...
      (xxxx-test-cluster-master-xxxx-xxxx) Using datacenter /DC-XX
      (xxxx-test-cluster-master-xxxx-xxxx) Using network /DC-XXX/network/xxxx
      (xxxx-test-cluster-master-xxxx-xxxx) Using ResourcePool /DC-XXX/host/XXXX/Resources
      Error with pre-create check: "folder '/DC-XXXX/vm/DC-XXXX/vm/xxxxx/xxxxxx' not found"
    reason: CreateError
    status: "False"

The folder is passed correctly but then somehow gets "doubled" in the pre-create check. If we declare the folder variable without /DC/vm/, provisioning works, but the UI still shows an error for this cluster: "pool1: The provided value for folder was not found in the list of expected values. This can happen with clusters provisioned outside of Rancher or when options for the provider have changed."
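The workaround described above (passing the folder without the leading /DC/vm/) could be sketched in Terraform as follows. This is only an illustration: the local names, the resource label "workaround", and all values are hypothetical, and it does not address the folder warning that remains in the UI. It relies on Terraform's built-in trimprefix function.

```hcl
locals {
  # Hypothetical values for illustration only.
  datacenter  = "/DC"
  full_folder = "/DC/vm/folder/subfolder"

  # Strip the "<datacenter>/vm/" prefix so only the relative folder path is
  # passed on. For the values above this evaluates to "folder/subfolder".
  relative_folder = trimprefix(local.full_folder, "${local.datacenter}/vm/")
}

resource "rancher2_machine_config_v2" "workaround" {
  generate_name = "test-pool1" # hypothetical name prefix

  vsphere_config {
    datacenter = local.datacenter
    folder     = local.relative_folder
  }
}
```

If the full inventory path is needed elsewhere in the configuration, keeping it in one local and deriving the relative form avoids maintaining two copies of the same path.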