Closed HarrisonWAffel closed 1 year ago
Root cause
The root cause of this issue is that logging and cis v1 scan support was removed in rancher 2.7 but not in the TF rancher2 provider, so the provider is trying to reference types that no longer exist.
Design discussion summary
Rancher v2.6 still supports logging/cis v1 scan, so removing support in TF to fix 2.7 would simultaneously break 2.6. This is a tech debt issue. We discussed some alternative options but concluded that we have to branch the TF provider to resolve this and prevent future debt.
For customers who are running both Rancher 2.6 and 2.7 instances, @MbolotSuse pointed out that you will need two separate TF directories, each with its own configuration and state files, to manage each instance, instead of just updating the TF provider version and re-running terraform init.
The latter will cause state file conflicts. This may be confusing and should be added to the TF docs to help customers.
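A minimal sketch of the two-directory layout described above. The directory and file names are hypothetical, the `rancher/rancher2` source address is an assumption (the public registry address, not the one used in the test configuration later in this thread), and the version constraints assume the 2.x and 3.x provider lines from the design plan:

```hcl
# rancher-2.6/versions.tf  (hypothetical directory and file name)
# This directory keeps its own state and manages only the Rancher 2.6 instance.
terraform {
  required_providers {
    rancher2 = {
      source  = "rancher/rancher2" # assumption: public registry address
      version = "~> 2.0"           # stay on the 2.x provider line for Rancher 2.6
    }
  }
}

# rancher-2.7/versions.tf would be the same except for the constraint:
#   version = "~> 3.0"             # 3.x provider line for Rancher 2.7
```

Each directory is initialized separately with terraform init, so neither state file ever sees the other provider line.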
Version schema
Major Version alignment (2.7.x -> 3.x): Pros:
In comparison, Minor Version alignment (provider 2.0.x for Rancher 2.6.10, provider 2.1.0 for Rancher 2.7.2) does not work for 2.1.x (2.7). We will likely be introducing new features (PSA), which would require a minor version increase, but we won't be able to do that with this schema option since the minor version is locked.
Minor Version alignment works for 2.6 but not for 2.7. Major Version alignment will require a compatibility matrix but works for both and has more flexibility for OOB releases.
Design plan
release/v2.6 before Harrison's changes for rancher 2.6 and master for rancher 2.7.
2.0.0 for rancher v2.6.10 on release/v2 and 3.0.0 on master for upcoming rancher v2.7.2. This uses semver and aligns the TF major version with the Rancher minor version while still allowing for OOB flexibility. No need to backport at this time.
Future PRs
Also from @snasovich: I’ve got a soft “OK” from product on maintaining separate release lines for TF provider for minor Rancher release lines. We carried over some more high-level discussions about the whole “support story” for Rancher TF provider to next week, but it should not affect this decision.
TF rancher2 provider 2.0.0 and 3.0.0 releases are targeted to ship together with or after the upcoming Rancher feature releases.
I have branched the TF provider, per this discussion, into release/v2 for rancher 2.6 and still master for rancher 2.7. To fix the TF build, I am doing the following in this PR:
v2.7-head commit
TF rancher2 cluster or any cluster that can run k8s 1.25 and could have logging or cis v1 enabled previously (I chose RKE on EC2 nodes in this case).
Note for QA: This can only be tested when version 1.25.x of the Terraform rancher2 provider is released.
```
terraform {
  required_providers {
    rancher2 = {
      source  = "terraform/rancher2"
      version = "1.25.0"
    }
  }
}

provider "rancher2" {
  api_url   = var.rancher_api_url
  token_key = var.rancher_admin_bearer_token
  insecure  = true
}

data "rancher2_cloud_credential" "rancher2_cloud_credential" {
  name = var.cloud_credential_name
}

resource "rancher2_cluster" "rancher2_cluster" {
  name = var.cluster_name
  rke_config {
    kubernetes_version = "v1.25.5-rancher1-1"
    network {
      plugin = var.network_plugin
    }
  }
}

resource "rancher2_node_template" "rancher2_node_template" {
  name = var.node_template_name
  amazonec2_config {
    access_key     = var.aws_access_key
    secret_key     = var.aws_secret_key
    region         = var.aws_region
    ami            = var.aws_ami
    security_group = [var.aws_security_group_name]
    subnet_id      = var.aws_subnet_id
    vpc_id         = var.aws_vpc_id
    zone           = var.aws_zone_letter
    root_size      = var.aws_root_size
    instance_type  = var.aws_instance_type
  }
}

resource "rancher2_node_pool" "pool1" {
  cluster_id       = rancher2_cluster.rancher2_cluster.id
  name             = "pool1"
  hostname_prefix  = "tf-pool1-"
  node_template_id = rancher2_node_template.rancher2_node_template.id
  quantity         = 1
  control_plane    = false
  etcd             = true
  worker           = false
}

resource "rancher2_node_pool" "pool2" {
  cluster_id       = rancher2_cluster.rancher2_cluster.id
  name             = "pool2"
  hostname_prefix  = "tf-pool2-"
  node_template_id = rancher2_node_template.rancher2_node_template.id
  quantity         = 1
  control_plane    = true
  etcd             = false
  worker           = false
}

resource "rancher2_node_pool" "pool3" {
  cluster_id       = rancher2_cluster.rancher2_cluster.id
  name             = "pool3"
  hostname_prefix  = "tf-pool3-"
  node_template_id = rancher2_node_template.rancher2_node_template.id
  quantity         = 1
  control_plane    = false
  etcd             = false
  worker           = true
}
```
I don't think regressions are likely here. Code that was blocking a TF build has been removed. Some of those types still exist in rancher, but adding them back into the TF provider wouldn't be a regression; it would be reinstating a feature.
Yes.
Blocked -- waiting on Terraform 3.0.0 for Rancher v2.7.x.
Ping for QA: This is ready to test using Terraform rancher2 v3.0.0-rc1. Please set up local testing on the rc version of the provider with this command:
./setup-provider.sh rancher2 3.0.0-rc1
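For reference, a provider requirement pinned to the release candidate might look like the sketch below. The `terraform.local/local/rancher2` source address is an assumption about where `setup-provider.sh` installs the local build, so adjust it if the script reports a different path. Terraform only selects a pre-release version when the constraint names it exactly, which is why the version is pinned rather than given as a range.

```hcl
terraform {
  required_providers {
    rancher2 = {
      # Assumption: local install address used by setup-provider.sh.
      source = "terraform.local/local/rancher2"
      # Pre-release versions must be pinned exactly; a range such as "~> 3.0" skips them.
      version = "3.0.0-rc1"
    }
  }
}
```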
@Sahota1225 @Anna-Blendermann If this is a tech-debt issue, does this need QA Validation?
I'd say probably not at this point. If testing any of the other TF issues with 3.0.0-rc1 fails with build errors related to scan or cis v1 logging, this can be reopened.
Terraform fails to build when using Rancher commits from 2.7, but is able to build when using commits from 2.6. It seems some types were moved around in 2.7, breaking the following structure files:
structure_cluster_logging.go
structure_cluster_scan.go
structure_logging_custom_target_config.go
Builds will fail with the following error
This error seems to be due to the fact that the "github.com/rancher/rancher/pkg/client/generated/management/v3" package no longer contains the following types: CustomTargetConfig, ClusterLogging, ClusterScan, CisScanConfig, and ClusterScanConfig.
The resolution of this issue may be as simple as fixing import paths, but could also be more involved depending on the changes that these types have undergone in 2.7.