terraform-ibm-modules / terraform-ibm-base-ocp-vpc

Provisions a Red Hat OpenShift VPC cluster on IBM Cloud
Apache License 2.0

Problem with the deployment of terraform-ibm-base-ocp-vpc basic example #448

Closed KKooli closed 4 months ago

KKooli commented 4 months ago

Deploying the basic example of this module (terraform-ibm-base-ocp-vpc/examples/basic on the main branch) produces errors.

Affected modules

(https://github.com/terraform-ibm-modules/terraform-ibm-base-ocp-vpc/tree/main/examples/basic)

Terraform CLI and Terraform provider versions

Terraform v1.6.6 on linux_amd64

Terraform output

module.ocp_base.ibm_container_vpc_cluster.cluster[0]: Creation complete after 1h3m44s [id=cpg8c6af0nlda104kmp0]
module.ocp_base.data.ibm_container_cluster_config.cluster_config[0]: Reading...
module.ocp_base.data.ibm_container_vpc_worker_pool.all_pools["default"]: Reading...
module.ocp_base.data.ibm_container_addons.existing_addons: Reading...
module.ocp_base.data.ibm_container_vpc_worker_pool.all_pools["default"]: Read complete after 1s [id=cpg8c6af0nlda104kmp0-4233f38]
module.ocp_base.data.ibm_container_addons.existing_addons: Read complete after 1s [id=cpg8c6af0nlda104kmp0]
module.ocp_base.data.ibm_container_cluster_config.cluster_config[0]: Still reading... [10s elapsed]
module.ocp_base.data.ibm_container_cluster_config.cluster_config[0]: Still reading... [20s elapsed]
module.ocp_base.data.ibm_container_cluster_config.cluster_config[0]: Still reading... [30s elapsed]
module.ocp_base.data.ibm_container_cluster_config.cluster_config[0]: Still reading... [40s elapsed]
╷
│ Error: [ERROR] Error downloading the cluster config [cpg8c6af0nlda104kmp0]: Get "https://c113-e.eu-de.containers.cloud.ibm.com:31049/.well-known/oauth-authorization-server": cannotconnect
│
│   with module.ocp_base.data.ibm_container_cluster_config.cluster_config[0],
│   on .terraform/modules/ocp_base/main.tf line 257, in data "ibm_container_cluster_config" "cluster_config":
│  257: data "ibm_container_cluster_config" "cluster_config" {
│
│ ---
│ id: terraform-82a5043b
│ summary: '[ERROR] Error downloading the cluster config [cpg8c6af0nlda104kmp0]: Get
│   "https://c113-e.eu-de.containers.cloud.ibm.com:31049/.well-known/oauth-authorization-server":
│   cannotconnect'
│ severity: error
│ resource: (Data) ibm_container_cluster_config
│ operation: read
│ component:
│   name: github.com/IBM-Cloud/terraform-provider-ibm
│   version: 1.65.1
│ ---
│
╵

Debug output

trace-log.txt

Expected behavior

Terraform apply exits successfully after creating the cluster

Actual behavior

Error: [ERROR] Error downloading the cluster config [cpg8c6af0nlda104kmp0]: Get "https://c113-e.eu-de.containers.cloud.ibm.com:31049/.well-known/oauth-authorization-server": cannotconnect
│
│   with module.ocp_base.data.ibm_container_cluster_config.cluster_config[0],
│   on .terraform/modules/ocp_base/main.tf line 257, in data "ibm_container_cluster_config" "cluster_config":
│  257: data "ibm_container_cluster_config" "cluster_config" {

Steps to reproduce (including links and screen captures)

Run terraform apply on the basic example provided by the module. The first apply fails with the error above, and any subsequent plan or apply fails the same way.

  1. Run terraform apply
  2. Run terraform plan or apply


lionelmace commented 4 months ago

I tried and it also failed in my account

Error: local-exec provisioner error
with module.ocp_base.null_resource.confirm_network_healthy[0]
on ../../main.tf line 393, in resource "null_resource" "confirm_network_healthy":
  provisioner "local-exec" {
Error running command '../../scripts/confirm_network_healthy.sh': exit status 1. Output: Running script to ensure kube master can communicate with all worker nodes..
../../scripts/confirm_network_healthy.sh: line 5: kubectl: command not found
No calico-node pods found. Retrying in 10s. (Attempt 1 / 10)
../../scripts/confirm_network_healthy.sh: line 5: kubectl: command not found
No calico-node pods found. Retrying in 10s. (Attempt 2 / 10)
../../scripts/confirm_network_healthy.sh: line 5: kubectl: command not found
No calico-node pods found. Retrying in 10s. (Attempt 3 / 10)
../../scripts/confirm_network_healthy.sh: line 5: kubectl: command not found
No calico-node pods found. Retrying in 10s. (Attempt 4 / 10)
../../scripts/confirm_network_healthy.sh: line 5: kubectl: command not found
No calico-node pods found. Retrying in 10s. (Attempt 5 / 10)
../../scripts/confirm_network_healthy.sh: line 5: kubectl: command not found
No calico-node pods found. Retrying in 10s. (Attempt 6 / 10)
../../scripts/confirm_network_healthy.sh: line 5: kubectl: command not found
No calico-node pods found. Retrying in 10s. (Attempt 7 / 10)
../../scripts/confirm_network_healthy.sh: line 5: kubectl: command not found
No calico-node pods found. Retrying in 10s. (Attempt 8 / 10)
../../scripts/confirm_network_healthy.sh: line 5: kubectl: command not found
No calico-node pods found. Retrying in 10s. (Attempt 9 / 10)
../../scripts/confirm_network_healthy.sh: line 5: kubectl: command not found
No calico-node pods found. Retrying in 10s. (Attempt 10 / 10)
No calico-node pods found after 10 attempts. Exiting.

lionelmace commented 4 months ago

@ocofaigh FYI

ocofaigh commented 4 months ago

@KKooli What happens when you try to apply again? Does it fail with the same error?

@lionelmace The error you pasted above occurs because your runtime does not have kubectl installed. I realised that this requirement is missing from the readme, so I created an issue to add it: https://github.com/terraform-ibm-modules/terraform-ibm-base-ocp-vpc/issues/449. Please install kubectl and try again.

lionelmace commented 4 months ago

I launched the Terraform script from HashiCorp Terraform Cloud, so installing kubectl is not an option!

KKooli commented 4 months ago

@ocofaigh yes it fails with the same error

vburckhardt commented 4 months ago

The kubectl dependency can be skipped by setting the input variable verify_worker_network_readiness to false.

https://github.com/terraform-ibm-modules/terraform-ibm-base-ocp-vpc/blob/728b96e48687cf503730a2c5167be5f167001766/variables.tf#L264C11-L264C42
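
For anyone hitting the kubectl error in a runtime where kubectl cannot be installed, the setting mentioned above could look roughly like the sketch below. It is a fragment rather than a complete configuration: the module name ocp_base and the "../.." source path are taken from the log output earlier in this thread, the remaining inputs of the basic example are assumed to stay exactly as they are, and the comment about which resource the flag affects is an assumption based on the error shown above.

```hcl
module "ocp_base" {
  # Module root relative to examples/basic, as the "../../main.tf" paths in the log suggest.
  source = "../.."

  # Keep the example's existing inputs (cluster name, region, VPC, worker pools, and so on)
  # here unchanged; they are omitted from this sketch.

  # Assumption: this turns off the confirm_network_healthy check, which runs
  # scripts/confirm_network_healthy.sh and therefore needs kubectl on the runner.
  verify_worker_network_readiness = false
}
```

This only removes the kubectl dependency. It does not change the "Error downloading the cluster config" failure from the original report, which comes from a different data source.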

We'll need to investigate further to pick the best approach to fully support HashiCorp Terraform Cloud; the modules are not currently tested there.

ocofaigh commented 4 months ago

Also, to clarify: the original error (Error: [ERROR] Error downloading the cluster config [cpg8c6af0nlda104kmp0]: Get "https://c113-e.eu-de.containers.cloud.ibm.com:31049/.well-known/oauth-authorization-server": cannotconnect) seems to be caused by a proxy not allowing this access. @lionelmace is working with consumers to debug.

lionelmace commented 4 months ago

This issue can be closed: the cause has been identified and is not due to this repo.