cloud-native-toolkit / planning

This is the planning repo to manage the cross-project Epics, Issues, Tasks, and Bugs.

Iteration Zero run from Schematics on existing OCP 4.3 fails, endless loop waiting for ArgoCD operator to install #413

Closed bwoolf1 closed 3 years ago

bwoolf1 commented 4 years ago

Describe the bug The Schematics tile in the cloudnative-toolkit repo got stuck in an endless loop waiting for ArgoCD operator to install. The install was onto an existing ROKS OCP 4.3 cluster using Classic Infrastructure.

To Reproduce Steps to reproduce the behavior:

  1. Create an OCP 4.3 cluster using Classic Infrastructure
  2. Follow the cloudnative-toolkit instructions to create a private catalog and install the tile
  3. Follow the instructions to run the tile, which installs the Toolkit into the cluster
  4. Navigate to the running Schematics workspace and view logs
  5. The logs show that after creating the config map and pull secrets, the script gets stuck installing the ArgoCD operator:
    2020/06/17 19:36:26 Terraform apply | module.dev_sre_namespace.null_resource.copy_cloud_configmap: Creating...
    2020/06/17 19:36:26 Terraform apply | module.dev_sre_namespace.null_resource.copy_cloud_configmap: Provisioning with 'local-exec'...
    2020/06/17 19:36:26 Terraform apply | module.dev_sre_namespace.null_resource.copy_cloud_configmap (local-exec): Executing: ["/bin/sh" "-c" ".terraform/modules/dev_sre_namespace/scripts/copy-configmap-to-namespace.sh ibmcloud-config ibm-observe"]
    2020/06/17 19:36:27 Terraform apply | module.dev_sre_namespace.null_resource.copy_cloud_configmap (local-exec): *** Copying ibmcloud-config from default namespace to ibm-observe namespace
    2020/06/17 19:36:28 Terraform apply | module.dev_sre_namespace.null_resource.copy_cloud_configmap (local-exec): Flag --export has been deprecated, This flag is deprecated and will be removed in future.
    2020/06/17 19:36:29 Terraform apply | module.dev_sre_namespace.null_resource.copy_cloud_configmap (local-exec): configmap/ibmcloud-config created
    2020/06/17 19:36:29 Terraform apply | module.dev_sre_namespace.null_resource.copy_cloud_configmap: Creation complete after 2s [id=1091557891990891042]
    2020/06/17 19:36:29 Terraform apply | module.dev_sre_namespace.null_resource.create_pull_secrets: Creating...
    2020/06/17 19:36:29 Terraform apply | module.dev_sre_namespace.null_resource.create_pull_secrets: Provisioning with 'local-exec'...
    2020/06/17 19:36:29 Terraform apply | module.dev_sre_namespace.null_resource.create_pull_secrets (local-exec): Executing: ["/bin/sh" "-c" ".terraform/modules/dev_sre_namespace/scripts/setup-namespace-pull-secrets.sh ibm-observe"]
    2020/06/17 19:36:29 Terraform apply | module.dev_sre_namespace.null_resource.create_pull_secrets (local-exec): *** Copying pull secrets from default namespace to ibm-observe namespace
    2020/06/17 19:36:30 Terraform apply | module.dev_sre_namespace.null_resource.create_pull_secrets (local-exec): Flag --export has been deprecated, This flag is deprecated and will be removed in future.
    2020/06/17 19:36:31 Terraform apply | module.dev_sre_namespace.null_resource.create_pull_secrets (local-exec): secret/all-icr-io created
    2020/06/17 19:36:32 Terraform apply | module.dev_sre_namespace.null_resource.create_pull_secrets (local-exec): *** Adding secrets to serviceaccount/default in ibm-observe namespace
    2020/06/17 19:36:33 Terraform apply | module.dev_sre_namespace.null_resource.create_pull_secrets (local-exec): serviceaccount/default patched
    2020/06/17 19:36:33 Terraform apply | module.dev_sre_namespace.null_resource.create_pull_secrets: Creation complete after 4s [id=8590194061943871832]
    2020/06/17 19:36:33 Terraform apply | module.dev_tools_argocd.null_resource.argocd-subscription: Creating...
    2020/06/17 19:36:33 Terraform apply | module.dev_tools_argocd.null_resource.argocd-subscription: Provisioning with 'local-exec'...
    2020/06/17 19:36:33 Terraform apply | module.dev_tools_argocd.null_resource.argocd-subscription (local-exec): Executing: ["/bin/sh" "-c" ".terraform/modules/dev_tools_argocd/scripts/deploy-subscription.sh ocp4 tools openshift-marketplace"]
    2020/06/17 19:36:33 Terraform apply | module.dev_tools_argocd.null_resource.argocd-subscription (local-exec): .terraform/modules/dev_tools_argocd/scripts/deploy-subscription.sh: 7: .terraform/modules/dev_tools_argocd/scripts/deploy-subscription.sh: [[: not found
    2020/06/17 19:36:33 Terraform apply | module.dev_tools_argocd.null_resource.argocd-subscription (local-exec): .terraform/modules/dev_tools_argocd/scripts/deploy-subscription.sh: 12: .terraform/modules/dev_tools_argocd/scripts/deploy-subscription.sh: [[: not found
    2020/06/17 19:36:33 Terraform apply | module.dev_tools_argocd.null_resource.argocd-subscription (local-exec): .terraform/modules/dev_tools_argocd/scripts/deploy-subscription.sh: 18: .terraform/modules/dev_tools_argocd/scripts/deploy-subscription.sh: [[: not found
    2020/06/17 19:36:33 Terraform apply | module.dev_tools_argocd.null_resource.argocd-subscription (local-exec): Installing argocd operator into tools namespace
    2020/06/17 19:36:34 Terraform apply | module.dev_tools_argocd.null_resource.argocd-subscription (local-exec): operatorgroup.operators.coreos.com/tools-operatorgroup created
    2020/06/17 19:36:34 Terraform apply | module.dev_tools_argocd.null_resource.argocd-subscription (local-exec): subscription.operators.coreos.com/argocd-operator created
    2020/06/17 19:36:36 Terraform apply | module.dev_tools_argocd.null_resource.argocd-subscription (local-exec): Waiting for ArgoCD operator to install
    2020/06/17 19:36:43 Terraform apply | module.dev_tools_argocd.null_resource.argocd-subscription: Still creating... [10s elapsed]
    2020/06/17 19:36:53 Terraform apply | module.dev_tools_argocd.null_resource.argocd-subscription: Still creating... [20s elapsed]
    2020/06/17 19:37:03 Terraform apply | module.dev_tools_argocd.null_resource.argocd-subscription: Still creating... [30s elapsed]
    2020/06/17 19:37:07 Terraform apply | module.dev_tools_argocd.null_resource.argocd-subscription (local-exec): Waiting for ArgoCD operator to install
    2020/06/17 19:37:13 Terraform apply | module.dev_tools_argocd.null_resource.argocd-subscription: Still creating... [40s elapsed]
    2020/06/17 19:37:23 Terraform apply | module.dev_tools_argocd.null_resource.argocd-subscription: Still creating... [50s elapsed]
    2020/06/17 19:37:33 Terraform apply | module.dev_tools_argocd.null_resource.argocd-subscription: Still creating... [1m0s elapsed]
    2020/06/17 19:37:37 Terraform apply | module.dev_tools_argocd.null_resource.argocd-subscription (local-exec): Waiting for ArgoCD operator to install

    The "Waiting for ArgoCD operator to install" message repeats indefinitely until Terraform and the Schematics workspace finally time out (after about an hour?).

It looks like the dev_tools_argocd/scripts/deploy-subscription.sh script failed to parse and run properly.
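
The `[[: not found` errors at script lines 7, 12, and 18 are consistent with the script using the bash-style `[[ ... ]]` test while being run by a `/bin/sh` that does not provide it (the log shows the provisioner invoking `"/bin/sh" "-c"`). The contents of deploy-subscription.sh are not shown in this issue, so the following is only a minimal sketch of the portable form, with a hypothetical helper name and hypothetical values:

```shell
#!/bin/sh
# Hypothetical sketch; the real deploy-subscription.sh is not shown in this issue.
# `[[ ... ]]` is a bash/ksh extension and fails with "[[: not found" in shells
# that only implement POSIX sh. The POSIX `[ ... ]` test works in both.

# value_or_default VALUE DEFAULT
# Prints VALUE if it is non-empty, otherwise prints DEFAULT.
value_or_default() {
  if [ -z "$1" ]; then
    printf '%s\n' "$2"
  else
    printf '%s\n' "$1"
  fi
}
```

For example, `value_or_default "" "ocp4"` prints `ocp4`. Changing the script's tests to `[ ... ]`, or having the provisioner run the script with bash, would both avoid the error, assuming bash is available in the Schematics environment.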

Expected behavior I wanted the Toolkit to install successfully!

I was able to install the Toolkit successfully in the same cluster without the tile, using the traditional iZero approach of cloning the ibm-garage-iteration-zero repo and running ./launch.sh and ./runTerraform.sh. In that run, the argocd-subscription resource was created without having to wait for the operator to install:

module.dev_tools_argocd.null_resource.argocd-subscription: Creating...
module.dev_tools_argocd.null_resource.argocd-subscription: Provisioning with 'local-exec'...
module.dev_tools_argocd.null_resource.argocd-subscription (local-exec): Executing: ["/bin/sh" "-c" ".terraform/modules/dev_tools_argocd/scripts/deploy-subscription.sh ocp4 tools openshift-marketplace"]
module.dev_tools_argocd.null_resource.argocd-subscription (local-exec): Installing argocd operator into tools namespace
module.dev_tools_argocd.null_resource.argocd-subscription (local-exec): operatorgroup.operators.coreos.com/tools-operatorgroup created
module.dev_tools_argocd.null_resource.argocd-subscription (local-exec): subscription.operators.coreos.com/argocd-operator created
module.dev_tools_argocd.null_resource.argocd-subscription: Creation complete after 5s [id=3115673840294636061]

Screenshots n/a

seansund commented 4 years ago

Need to add a timeout waiting for ArgoCD deploy
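
One way to bound the wait, sketched in POSIX sh so that it also runs under `/bin/sh`. The function name, its arguments, and the check command are all hypothetical; this is not the change that went into the module:

```shell
#!/bin/sh
# Hypothetical sketch of a bounded wait loop; not the actual module code.

# wait_with_timeout LIMIT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds. Returns 0 on success, or 1 once
# at least LIMIT_SECONDS of sleeping has elapsed without success.
wait_with_timeout() {
  limit="$1"
  interval="$2"
  shift 2
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$limit" ]; then
      echo "Timed out after ${elapsed}s waiting for: $*" >&2
      return 1
    fi
    echo "Waiting for: $*"
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}
```

A call such as `wait_with_timeout 600 30 operator_is_ready || exit 1` would give up after about ten minutes and fail the Terraform step instead of looping until the workspace times out. Here `operator_is_ready` stands in for whatever check the script uses to detect the installed operator.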

bwoolf1 commented 4 years ago

@seansund @mjperrins: I'm trying the Schematics tile again, this time on OCP 4.3 on VPC. It's stuck waiting for SonarQube to install. Does the tile automatically use the latest Terraform modules?

The tile's log:

 2020/06/30 14:42:35 Terraform apply | module.dev_tools_sonarqube.null_resource.sonarqube_route[0]: Creating...
 2020/06/30 14:42:35 Terraform apply | module.dev_tools_sonarqube.null_resource.sonarqube_route[0]: Provisioning with 'local-exec'...
 2020/06/30 14:42:36 Terraform apply | module.dev_tools_sonarqube.null_resource.sonarqube_route[0] (local-exec): Executing: ["/bin/sh" "-c" ".terraform/modules/dev_tools_sonarqube/scripts/create-route.sh tools sonarqube-sonarqube sonarqube"]
 2020/06/30 14:42:37 Terraform apply | module.dev_tools_sonarqube.null_resource.sonarqube_route[0] (local-exec): route.route.openshift.io/sonarqube created
 2020/06/30 14:42:37 Terraform apply | module.dev_tools_sonarqube.null_resource.sonarqube_route[0]: Creation complete after 1s [id=6249347158388910021]
 2020/06/30 14:42:37 Terraform apply | module.dev_tools_sonarqube.null_resource.wait-for-sonarqube: Creating...
 2020/06/30 14:42:37 Terraform apply | module.dev_tools_sonarqube.null_resource.wait-for-sonarqube: Provisioning with 'local-exec'...
 2020/06/30 14:42:37 Terraform apply | module.dev_tools_sonarqube.null_resource.wait-for-sonarqube (local-exec): Executing: ["/bin/sh" "-c" ".terraform/modules/dev_tools_sonarqube/scripts/wait-for-deployment.sh tools sonarqube-sonarqube"]
 2020/06/30 14:42:38 Terraform apply | module.dev_tools_sonarqube.null_resource.wait-for-sonarqube (local-exec): Waiting for deployment "sonarqube-sonarqube" rollout to finish: 0 out of 1 new replicas have been updated...
 2020/06/30 14:42:47 Terraform apply | module.dev_tools_sonarqube.null_resource.wait-for-sonarqube: Still creating... [10s elapsed]
 2020/06/30 14:42:57 Terraform apply | module.dev_tools_sonarqube.null_resource.wait-for-sonarqube: Still creating... [20s elapsed]
. . .
 2020/06/30 14:57:47 Terraform apply | module.dev_tools_sonarqube.null_resource.wait-for-sonarqube: Still creating... [15m10s elapsed]
 2020/06/30 14:57:57 Terraform apply | module.dev_tools_sonarqube.null_resource.wait-for-sonarqube: Still creating... [15m20s elapsed]

bwoolf1 commented 4 years ago

Looks like it eventually timed out:

 2020/06/30 15:02:17 Terraform apply | module.dev_tools_sonarqube.null_resource.wait-for-sonarqube: Still creating... [19m40s elapsed]
 2020/06/30 15:02:27 Terraform apply | module.dev_tools_sonarqube.null_resource.wait-for-sonarqube: Still creating... [19m50s elapsed]
 2020/06/30 15:02:36 Terraform apply | module.dev_tools_sonarqube.null_resource.wait-for-sonarqube (local-exec): error: deployment "sonarqube-sonarqube" exceeded its progress deadline
 2020/06/30 15:02:37 Terraform apply | module.dev_tools_sonarqube.null_resource.wait-for-sonarqube: Still creating... [20m0s elapsed]
 2020/06/30 15:02:47 Terraform apply | module.dev_tools_sonarqube.null_resource.wait-for-sonarqube: Still creating... [20m10s elapsed]
 2020/06/30 15:02:57 Terraform apply | module.dev_tools_sonarqube.null_resource.wait-for-sonarqube (local-exec):   %!T(MISSING)otal    %!R(MISSING)eceived %!X(MISSING)ferd  Average Speed   Time    Time     Time  Current
 2020/06/30 15:02:57 Terraform apply | module.dev_tools_sonarqube.null_resource.wait-for-sonarqube (local-exec):                                  Dload  Upload   Total   Spent    Left  Speed
 2020/06/30 15:02:57 Terraform apply | module.dev_tools_sonarqube.null_resource.wait-for-sonarqube (local-exec):   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
 2020/06/30 15:02:57 Terraform apply | module.dev_tools_sonarqube.null_resource.wait-for-sonarqube (local-exec):   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0

seansund commented 4 years ago

This has been addressed in v2.8.0 of the argocd module.