jx3-gitops-repositories / jx3-terraform-eks

Jenkins X 3.x Infrastructure Git Template for Terraform and EKS for managing cloud resources
Apache License 2.0
Error during initial install in terraform #18

Closed. michaelerobertsjr closed this issue 3 years ago.

michaelerobertsjr commented 3 years ago

During install on AWS EKS I am receiving the following error:

│ Warning: Helm release "jx-git-operator" was created but has a failed status. Use the `helm` command to investigate the error, correct it, then run Terraform again.
│   with module.eks-jx.module.cluster.helm_release.jx-git-operator[0],
│   on .terraform\modules\eks-jx\modules\cluster\charts.tf line 1, in resource "helm_release" "jx-git-operator":
│    1: resource "helm_release" "jx-git-operator" {
│ Error: timed out waiting for the condition
│   with module.eks-jx.module.cluster.helm_release.jx-git-operator[0],
│   on .terraform\modules\eks-jx\modules\cluster\charts.tf line 1, in resource "helm_release" "jx-git-operator":
│    1: resource "helm_release" "jx-git-operator" {

ankitm123 commented 3 years ago

What are you seeing in the jx-git-operator namespace? Can you check whether the pods are being created? Specifically, there should be a jx-git-operator-xxxx pod. If it is not in the Running state, can you describe the pod and paste the output?

michaelerobertsjr commented 3 years ago
NAMESPACE         NAME                                  READY   STATUS             RESTARTS   AGE
jx-git-operator   jx-git-operator-7bc44fc4c-s769n       0/1     CrashLoopBackOff   6          10m
jx-vault          vault-0                               3/3     Running            0          48m
jx-vault          vault-configurer-558d9948dc-ztwdg     1/1     Running            0          48m
jx-vault          vault-operator-664475f76b-2lxhn       1/1     Running            0          50m
kube-system       aws-node-vwv8p                        1/1     Running            0          49m
kube-system       aws-node-wjkqz                        1/1     Running            0          49m
kube-system       aws-node-zbkx5                        1/1     Running            0          49m
kube-system       coredns-85d5b4454c-kbdnt              1/1     Running            0          53m
kube-system       coredns-85d5b4454c-shsrx              1/1     Running            0          53m
kube-system       kube-proxy-9z64j                      1/1     Running            0          49m
kube-system       kube-proxy-g5qjd                      1/1     Running            0          49m
kube-system       kube-proxy-wq6m9                      1/1     Running            0          49m
kuberhealthy      daemonset-1630527070                  0/1     Completed          0          48m
kuberhealthy      daemonset-1630527966                  0/1     Completed          0          33m
kuberhealthy      daemonset-1630528868                  0/1     Completed          0          18m
kuberhealthy      daemonset-1630529766                  0/1     Completed          0          3m54s
kuberhealthy      deployment-1630527071                 0/1     Completed          0          48m
kuberhealthy      deployment-1630527666                 0/1     Completed          0          38m
kuberhealthy      deployment-1630528267                 0/1     Completed          0          28m
kuberhealthy      deployment-1630528867                 0/1     Completed          0          18m
kuberhealthy      deployment-1630529466                 0/1     Completed          0          8m54s
kuberhealthy      dns-status-internal-1630529346        0/1     Completed          0          10m
kuberhealthy      dns-status-internal-1630529466        0/1     Completed          0          8m54s
kuberhealthy      dns-status-internal-1630529586        0/1     Completed          0          6m54s
kuberhealthy      dns-status-internal-1630529706        0/1     Completed          0          4m54s
kuberhealthy      dns-status-internal-1630529826        0/1     Completed          0          2m54s
kuberhealthy      dns-status-internal-1630529946        0/1     Completed          0          54s
kuberhealthy      jx-pod-status-1630528566              0/1     Completed          0          23m
kuberhealthy      jx-pod-status-1630528867              0/1     Completed          0          18m
kuberhealthy      jx-pod-status-1630529166              0/1     Completed          0          13m
kuberhealthy      jx-pod-status-1630529466              0/1     Completed          0          8m54s
kuberhealthy      jx-pod-status-1630529766              0/1     Completed          0          3m54s
kuberhealthy      jx-secrets-1630529646                 0/1     Completed          0          5m54s
kuberhealthy      jx-secrets-1630529706                 0/1     Completed          0          4m54s
kuberhealthy      jx-secrets-1630529767                 0/1     Completed          0          3m52s
kuberhealthy      jx-secrets-1630529826                 0/1     Completed          0          2m54s
kuberhealthy      jx-secrets-1630529886                 0/1     Completed          0          114s
kuberhealthy      jx-secrets-1630529946                 0/1     Completed          0          54s
kuberhealthy      kuberhealthy-84645cf59c-pd9ch         1/1     Running            0          50m
kuberhealthy      kuberhealthy-84645cf59c-sdrhv         1/1     Running            0          50m
kuberhealthy      network-connection-check-1630527070   0/1     Completed          0          48m
kuberhealthy      network-connection-check-1630528868   0/1     Completed          0          18m
kuberhealthy      pod-restarts-1630528566               0/1     Completed          0          23m
kuberhealthy      pod-restarts-1630528866               0/1     Completed          0          18m
kuberhealthy      pod-restarts-1630529166               0/1     Completed          0          13m
kuberhealthy      pod-restarts-1630529467               0/1     Completed          0          8m52s
kuberhealthy      pod-restarts-1630529766               0/1     Completed          0          3m53s

ankitm123 commented 3 years ago
jx-git-operator   jx-git-operator-7bc44fc4c-s769n       0/1     CrashLoopBackOff   6          10m

Can you describe this pod and paste the output?

    kubectl describe pod jx-git-operator-7bc44fc4c-s769n -n jx-git-operator
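
For a pod in CrashLoopBackOff, the logs of the previous (crashed) container instance often show the actual failure, which `describe` alone may not. A minimal sketch, assuming `kubectl` is already configured for the cluster; the helper name `diagnose_pod` is hypothetical and the `--tail=50` limit is an arbitrary choice:

```shell
# Collect basic diagnostics for a crash-looping pod.
# Usage: diagnose_pod <namespace> <pod-name>
diagnose_pod() {
  ns="$1"
  pod="$2"
  # Pod spec, container state, restart count and recent events.
  kubectl describe pod "$pod" -n "$ns"
  # Logs from the previous container instance, i.e. the one that crashed.
  kubectl logs "$pod" -n "$ns" --previous --tail=50
}
```

For the pod in this thread the call would be `diagnose_pod jx-git-operator jx-git-operator-7bc44fc4c-s769n`.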

ankitm123 commented 3 years ago

Also, can you try version 1.15.41 (set it here: https://github.com/jx3-gitops-repositories/jx3-terraform-eks/blob/main/main.tf#L8) and see whether you get the same error? I will update the documentation to ask end users to use the latest version published here: https://github.com/jenkins-x/terraform-aws-eks-jx/releases. You don't have to run terraform destroy; just run terraform init and then terraform apply.
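
The re-apply steps above can be sketched as a short shell helper. It assumes the version in main.tf has already been edited, that the commands run from the repository root, and that the module line in main.tf contains the string `terraform-aws-eks-jx` (an assumption based on the releases URL, not confirmed here). The helper name `reapply_module` is hypothetical.

```shell
# Sketch: re-apply after changing the module version in main.tf.
# Assumption: the module source line in main.tf contains "terraform-aws-eks-jx".
reapply_module() {
  # Print the module line so the pinned version can be checked by eye;
  # stop if main.tf is missing or does not reference the module.
  grep -n 'terraform-aws-eks-jx' main.tf || return 1
  # Fetch the module at the new version, then apply. No destroy is needed.
  terraform init && terraform apply
}
```

Printing the module line first is a cheap guard against applying with the old version still in place.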

michaelerobertsjr commented 3 years ago

Here is the output from the pod. I was able to get it to complete with version 1.15.41.

Name:         jx-git-operator-7bc44fc4c-s769n
Namespace:    jx-git-operator
Priority:     0
Node:         ip-10-0-2-188.us-west-2.compute.internal/
Start Time:   Wed, 01 Sep 2021 13:49:49 -0700
Labels:       app=jx-git-operator
Annotations:  kubernetes.io/psp: eks.privileged
Status:       Running
Controlled By:  ReplicaSet/jx-git-operator-7bc44fc4c
    Container ID:  docker://9d380b0b2dde3542013ba4bdd2697c54ced866d68672105d43f9c8d092383e24
    Image:         ghcr.io/jenkins-x/jx-git-operator:0.0.194
    Image ID:      docker-pullable://ghcr.io/jenkins-x/jx-git-operator@sha256:d4185fb2bf98595dd220ece334f0b8160d39fd9c6ec736cfb8cb652a1cf57fb5
    Port:          <none>
    Host Port:     <none>
      echo 'no custom git initialisation scripts'; jx-git-operator
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 01 Sep 2021 15:07:06 -0700
      Finished:     Wed, 01 Sep 2021 15:07:06 -0700
    Ready:          False
    Restart Count:  20
    Limits:
      cpu:     100m
      memory:  256Mi
    Requests:
      cpu:     80m
      memory:  128Mi
    Environment Variables from:
      jx-boot-job-env-vars  Secret  Optional: true
    Environment:
      NO_RESOURCE_APPLY:  true
      POLL_DURATION:      20s
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-kklwr (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kube-api-access-kklwr:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                  From     Message
  ----     ------   ----                 ----     -------
  Warning  BackOff  96s (x368 over 81m)  kubelet  Back-off restarting failed container

ankitm123 commented 3 years ago

I was able to get it to complete with version 1.15.41

Then we can close this issue?

michaelerobertsjr commented 3 years ago

Yes. I believe I used the wrong version because it was the one set in the repository, and I was not aware it needed to be updated.