Closed: andersjohansson2021 closed this issue 1 year ago.
Any update on this? I'm seeing the same issue.
│ When expanding the plan for module.cluster.rancher2_cluster_v2.cluster to include new values learned so far
│ during apply, provider "registry.terraform.io/rancher/rancher2" produced an invalid new value for
│ .rke_config[0].machine_pools[0].cloud_credential_secret_name: was cty.StringVal(""), but now
│ cty.StringVal("cattle-global-data:cc-f9hbf").
│
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.
@frouzbeh I had to use an older version, as stated above. That solved the issue for me. But having said that, this bug needs to be addressed in the provider.
@andersjohansson2021 Thank you, yes, I tested with an older version and it works, but as I remember, the older version had another issue with the kube-config generation function, which is fixed in 1.22.2. I hope they fix it soon.
@rawmind0 Would you please take a look at this issue? We would really like to make this work.
+1 here as well
Hello, is there any workaround or fix for this issue? I am stuck on it.
> @frouzbeh I had to use an older version, as stated above. That solved the issue for me. But having said that, this bug needs to be addressed in the provider.

An older version brings other issues, like missing or unsupported arguments, etc.
Hello, this issue is also causing problems on my deploys. Is there a commitment to fix it?
Reproduced the issue on v2.6-head c54b655, cloud provider Linode:
- v1.23.10+rke2r1
- v1.22.13+rke2r1
- rke1 node driver cluster on k8s v1.23.10-rancher1-1
Error: Provider produced inconsistent final plan
│
│ When expanding the plan for rancher2_cluster_v2.rke2-cluster-tf to include new values learned so far during apply, provider "registry.terraform.io/rancher/rancher2" produced an invalid new
│ value for .rke_config[0].machine_pools[0].cloud_credential_secret_name: was cty.StringVal(""), but now cty.StringVal("cattle-global-data:<redacted>").
│
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.
╵
As a workaround I tried this and it worked: create a cloud credential, then grab the cloud credential id using a data block.
Example:
resource "rancher2_cloud_credential" "rancher2_cloud_credential" {
name = var.cloud_credential_name
amazonec2_credential_config {
access_key = var.aws_access_key
secret_key = var.aws_secret_key
default_region = var.aws_region
}
}
data "rancher2_cloud_credential" "rancher2_cloud_credential" {
name = var.cloud_credential_name
}
Then use data.rancher2_cloud_credential.rancher2_cloud_credential.id in the rancher2_cluster_v2 machine configs. Note: it seems this only works if the cloud credential has been created beforehand.
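A hedged sketch of how that data source id might be wired into a machine pool (resource names, the Kubernetes version, and the assumption of a matching rancher2_machine_config_v2.example resource are illustrative, not from the original comment):

```
resource "rancher2_cluster_v2" "example" {
  name               = "example"
  kubernetes_version = "v1.24.4+rke2r1"

  rke_config {
    machine_pools {
      name = "pool1"
      # id read from the data block above; resolvable only because the
      # cloud credential already exists before this apply
      cloud_credential_secret_name = data.rancher2_cloud_credential.rancher2_cloud_credential.id
      control_plane_role           = true
      etcd_role                    = true
      worker_role                  = true
      quantity                     = 1
      machine_config {
        kind = rancher2_machine_config_v2.example.kind
        name = rancher2_machine_config_v2.example.name
      }
    }
  }
}
```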
Still in version 1.24.2 when creating RKE2 downstream clusters on Azure:
Error: Provider produced inconsistent final plan
When expanding the plan for rancher2_cluster_v2.cluster_az to include new values learned so far during apply, provider "registry.terraform.io/rancher/rancher2" produced an invalid new value for .rke_config[0].machine_pools[0].cloud_credential_secret_name: was cty.StringVal(""), but now cty.StringVal("cattle-global-data:cc-ffs8c").
This is a bug in the provider, which should be reported in the provider's own issue tracker.
Error: Provider produced inconsistent final plan
When expanding the plan for rancher2_cluster_v2.cluster_az to include new values learned so far during apply, provider "registry.terraform.io/rancher/rancher2" produced an invalid new value for .rke_config[0].machine_pools[0].name: was cty.StringVal(""), but now cty.StringVal("pool-b94345").
This is a bug in the provider, which should be reported in the provider's own issue tracker.
Looking at #878, I don't believe that it will fix both plan inconsistencies
Still seeing this error on v1.25.0:
│
│ When expanding the plan for rancher2_cluster_v2.utility to include new values learned so far during apply, provider "registry.terraform.io/rancher/rancher2"
│ produced an invalid new value for .rke_config[0].machine_pools[0].cloud_credential_secret_name: was cty.StringVal(""), but now
│ cty.StringVal("cattle-global-data:cc-pmzs7").
│
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.
it usually works the second time
Running terraform apply a second time also consistently works for me. On terraform destroy I also had a dependency issue, but that could be fixed by adding depends_on = [rancher2_cloud_credential.my_cloud_credential] to the rancher2_cluster_v2 resource, as in the sketch below.
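For reference, a minimal sketch of that explicit dependency (resource and credential names are illustrative):

```
resource "rancher2_cluster_v2" "example" {
  name               = "example"
  kubernetes_version = "v1.24.4+rke2r1"

  # make the ordering against the cloud credential explicit,
  # which helped with the destroy-time dependency issue
  depends_on = [rancher2_cloud_credential.my_cloud_credential]

  rke_config {
    # machine_pools omitted for brevity
  }
}
```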
Is there another depends_on-like or sleep-like thing you could do to get the apply working on the first try?
Yes - I can confirm that it works the second time, most likely because the credential is already there from the first try. Thanks for the hint about the dependency!
Facing this as well. Any plan to fix this bug?
As noted here, the only workaround is to create the cloud credential before running terraform apply, either in a separate apply or manually via the UI. Otherwise my automation for creating clusters fails.
Hello @moshiaiz,
I am working on this. Thank you all for your patience.
Terraform rancher2 provider with rancher 2.7 builds are currently blocked for us due to https://github.com/rancher/terraform-provider-rancher2/issues/1052. We need to branch and fix our build before I can reproduce this issue.
From my investigation, there is indeed a bug in the way the provider is processing the value for .rke_config[0].machine_pools[0].cloud_credential_secret_name.
From the terraform docs, this field exists in both the cluster_v2 resource and its machine pool, but offhand I will need to find out why it's present in the machine pool. When connecting to a Rancher instance, only one cloud credential is needed to connect, so it may be a duplicate field.
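For reference, a minimal sketch of where the two fields sit in a cluster_v2 config (names and the version here are illustrative, mirroring the configs later in this thread):

```
resource "rancher2_cluster_v2" "example" {
  name                         = "example"
  kubernetes_version           = "v1.24.4+rke2r1"
  cloud_credential_secret_name = rancher2_cloud_credential.example.id # cluster-level field

  rke_config {
    machine_pools {
      name                         = "pool1"
      cloud_credential_secret_name = rancher2_cloud_credential.example.id # same value again in the machine pool
      control_plane_role           = true
      etcd_role                    = true
      worker_role                  = true
      quantity                     = 1
      machine_config {
        kind = rancher2_machine_config_v2.example.kind
        name = rancher2_machine_config_v2.example.name
      }
    }
  }
}
```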
This old PR is a potential fix https://github.com/rancher/terraform-provider-rancher2/pull/878 and should also fix https://github.com/rancher/terraform-provider-rancher2/issues/915 since rke is being installed on vSphere and this appears to be a bug in the terraform rke config.
The main PR has been merged for https://github.com/rancher/terraform-provider-rancher2/issues/1052 and the TF build is fixed. Testing is unblocked. Trying to reproduce this for an RKE2 cluster on provider version 1.25.0
Reproduced this issue on Amazon EC2 RKE2 cluster with TF provider 1.25.0.
```
terraform {
  required_providers {
    rancher2 = {
      source  = "rancher/rancher2"
      version = "1.25.0"
    }
  }
}

provider "rancher2" {
  api_url   = var.rancher_api_url
  token_key = var.rancher_admin_bearer_token
  insecure  = true
}

# Create amazonec2 cloud credential
resource "rancher2_cloud_credential" "foo" {
  name = "foo"
  amazonec2_credential_config {
    access_key = var.aws_access_key
    secret_key = var.aws_secret_key
  }
}

# Create amazonec2 machine config v2
resource "rancher2_machine_config_v2" "foo" {
  generate_name = "ablender-machine"
  amazonec2_config {
    ami            = var.aws_ami
    region         = var.aws_region
    security_group = [var.aws_security_group_name]
    subnet_id      = var.aws_subnet_id
    vpc_id         = var.aws_vpc_id
    zone           = var.aws_zone_letter
  }
}

# Create a new rancher v2 amazonec2 RKE2 Cluster v2
resource "rancher2_cluster_v2" "ablender-rke2" {
  name                                      = var.rke2_cluster_name
  kubernetes_version                        = "v1.25.6-rancher1-1"
  enable_network_policy                     = false
  default_cluster_role_for_project_members  = "user"
  cloud_credential_secret_name              = rancher2_cloud_credential.foo.id
  rke_config {
    machine_pools {
      name                         = "pool1"
      cloud_credential_secret_name = rancher2_cloud_credential.foo.id
      control_plane_role           = true
      etcd_role                    = true
      worker_role                  = true
      quantity                     = 1
      machine_config {
        kind = rancher2_machine_config_v2.foo.kind
        name = rancher2_machine_config_v2.foo.name
      }
    }
  }
}
```
The error showed up on the first terraform apply. It worked when running terraform apply a second time, as posted above, so that is a valid workaround.
Investigation
After more digging, I've discovered that this error is similar to a very popular error in the Terraform AWS provider, https://github.com/hashicorp/terraform-provider-aws/issues/19583, which has been very active over the past two years and which Hashicorp refuses to acknowledge or fix.
I discovered this error in the TF debug logs
2023-02-10T13:43:54.804-0500 [WARN] Provider "terraform.example.com/local/rancher2" produced an invalid plan for rancher2_cluster_v2.ablender-rke2, but we are tolerating it because it is using the legacy plugin SDK.
The following problems may be the cause of any confusing errors from downstream operations:
- .fleet_namespace: planned value cty.StringVal("fleet-default") for a non-computed attribute
- .rke_config[0].machine_selector_config: attribute representing nested block must not be unknown itself; set nested attribute values to unknown instead
- .rke_config[0].etcd: attribute representing nested block must not be unknown itself; set nested attribute values to unknown instead
- .rke_config[0].machine_pools[0].cloud_credential_secret_name: planned value cty.StringVal("") does not match config value cty.UnknownVal(cty.String)
From poking around, and according to Hashicorp https://discuss.hashicorp.com/t/context-around-the-log-entry-tolerating-it-because-it-is-using-the-legacy-plugin-sdk/1630, most of these warnings are due to an expected SDK compatibility quirk, but the error for cloud_credential_secret_name is causing the apply to fail.
The full error ends with:
2023-02-10T13:26:42.997-0500 [ERROR] vertex "rancher2_cluster_v2.ablender-rke2" error: Provider produced inconsistent final plan
╷
│ Error: Provider produced inconsistent final plan
│
│ When expanding the plan for rancher2_cluster_v2.ablender-rke2 to include new values learned so far during apply,
│ provider "terraform.example.com/local/rancher2" produced an invalid new value for
│ .rke_config[0].machine_pools[0].cloud_credential_secret_name: was cty.StringVal(""), but now
│ cty.StringVal("cattle-global-data:cc-mzjcm").
│
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.
Root cause
Something on the backend in the terraform-plugin-sdk is computing a planned value of "" for cloud_credential_secret_name when it is set to Required and set as a string in the config file. This cannot be fixed in the Terraform provider. It appears to be a bug in the sdk that the provider is using.
Fix
I tried updating the Terraform plugin SDK and that did not work, but setting the machine pool cloud_credential_secret_name as Optional does fix it. This patch allows us to retain parity between Rancher and the Terraform provider and may be the most viable option to fix this issue for the scores of customers who have been running into it every few weeks. I will update my draft PR shortly.
When creating an RKE2 cluster via Terraform on any hosted provider (Amazon EC2, Azure, and the Linode driver so far), Terraform computes a new value for the duplicate field cloud_credential_secret_name in the machine pool and then throws an error on terraform apply pertaining to that value.
This PR has the following fix: change machine_pool.cloud_credential_secret_name to be Optional. This keeps parity with Terraform and fixes the plan bug.

Test steps:
- Rancher v2.7-head
- rancher2_cluster_v2.cloud_credential_secret_name and rancher2_cluster_v2.rke_config.machine_pools.cloud_credential_secret_name set in main.tf (each one set, then both)
- terraform init
- terraform apply
- Verify the first terraform apply no longer errors and the cluster provisions successfully

```
terraform {
  required_providers {
    rancher2 = {
      source  = "rancher/rancher2"
      version = "3.0.0"
    }
  }
}

provider "rancher2" {
  api_url   = var.rancher_api_url
  token_key = var.rancher_admin_bearer_token
  insecure  = true
}

# Create amazonec2 cloud credential
resource "rancher2_cloud_credential" "foo" {
  name = "foo"
  amazonec2_credential_config {
    access_key = var.aws_access_key
    secret_key = var.aws_secret_key
  }
}

# Create amazonec2 machine config v2
resource "rancher2_machine_config_v2" "foo" {
  generate_name = "ablender-machine"
  amazonec2_config {
    ami            = var.aws_ami
    region         = var.aws_region
    security_group = [var.aws_security_group_name]
    subnet_id      = var.aws_subnet_id
    vpc_id         = var.aws_vpc_id
    zone           = var.aws_zone_letter
    root_size      = var.aws_root_size
  }
}

# Create a new rancher v2 amazonec2 RKE2 Cluster v2
resource "rancher2_cluster_v2" "ablender-rke2" {
  name                                      = var.rke2_cluster_name
  cloud_credential_secret_name              = rancher2_cloud_credential.foo.id // test case
  kubernetes_version                        = "v1.25.6+rke2r1"
  enable_network_policy                     = false
  default_cluster_role_for_project_members  = "user"
  rke_config {
    machine_pools {
      name                         = "pool1"
      cloud_credential_secret_name = rancher2_cloud_credential.foo.id // test case
      control_plane_role           = true
      etcd_role                    = true
      worker_role                  = true
      quantity                     = 1
      machine_config {
        kind = rancher2_machine_config_v2.foo.kind
        name = rancher2_machine_config_v2.foo.name
      }
    }
  }
}
```
Terraform rancher2 provider, rke1 prov
Yes.
Blocked -- waiting on Terraform 3.0.0 for Rancher v2.7.x.
Thank you for the investigations so far. I tested 3.0.0-rc1 since I have the same problem, where the first apply fails and the second apply works. In my case I could track it down to machine_global_config being built up with a "known after apply" value.
```
rke_config {
  machine_global_config = yamlencode({
    cni     = "calico"
    profile = "cis-1.6"
    tls-san = [
      module.vip_control_plane.fqdn,
    ]
  })
}
```
If I remove the tls-san value, the problem doesn't happen on the first try. Anything I could test or investigate?
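One idea that might be worth testing (purely a hypothesis, variable name and values illustrative): feed the SAN from a value known at plan time instead of the module output, so machine_global_config contains no "known after apply" value.

```
variable "control_plane_fqdn" {
  type    = string
  default = "cp.example.internal" # illustrative value, known at plan time
}

resource "rancher2_cluster_v2" "example" {
  name               = "example"
  kubernetes_version = "v1.24.4+rke2r1"

  rke_config {
    machine_global_config = yamlencode({
      cni     = "calico"
      profile = "cis-1.6"
      tls-san = [var.control_plane_fqdn] # no unknown value at plan time
    })
  }
}
```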
@sowmyav27 This is ready to test using Terraform rancher2 v3.0.0-rc1. Please set up local testing on the rc version of the provider with this command:
./setup-provider.sh rancher2 3.0.0-rc1
@a-blender shall I open a dedicated issue? But I assume this is a general problem for "known during apply" values.
With Docker on a single-node instance using Rancher v2.7-64c5188a5394f7ef7858ebb6807072ad5abe0e80-head:
Verified with rancher2 provider v3.0.0-rc2 on Rancher v2.7-head.
When terraforming an RKE2 cluster I receive the following:
│ Error: Provider produced inconsistent final plan
│
│ When expanding the plan for rancher2_cluster_v2.test-cluster to include new values learned so far during apply, provider
│ "registry.terraform.io/rancher/rancher2" produced an invalid new value for
│ .rke_config[0].machine_pools[1].cloud_credential_secret_name: was cty.StringVal(""), but now
│ cty.StringVal("cattle-global-data:cc-trrz8").
│
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.
This happens in version 1.22.1, but not in 1.21.0 of the provider. /Anders.
SURE-5412 SURE-4866