gravitational / gravity

Kubernetes application deployments for restricted, regulated, or remote environments
Apache License 2.0

Site Post Install Hook Failing to get healthcheck #353

Closed gevans22 closed 5 years ago

gevans22 commented 5 years ago

Install repeatedly fails and hangs on "site-app-post-install" job.

Version: 5.5.3
Environment: 3 RHEL 7.6 EC2 VMs on AWS

Install:

Thu Mar 28 15:06:56 UTC Connecting to cluster
Thu Mar 28 15:06:57 UTC Auto-loaded kernel module: br_netfilter
Thu Mar 28 15:06:57 UTC Auto-loaded kernel module: iptable_nat
Thu Mar 28 15:06:57 UTC Auto-loaded kernel module: iptable_filter
Thu Mar 28 15:06:57 UTC Auto-loaded kernel module: ebtables
Thu Mar 28 15:06:57 UTC Auto-loaded kernel module: overlay
Thu Mar 28 15:06:57 UTC Auto-set kernel parameter: net.ipv4.ip_forward=1
Thu Mar 28 15:06:57 UTC Auto-set kernel parameter: net.bridge.bridge-nf-call-iptables=1
Thu Mar 28 15:06:57 UTC Auto-set kernel parameter: fs.may_detach_mounts=1
Thu Mar 28 15:06:57 UTC Connected to installer at https://IP:61009
Thu Mar 28 15:06:57 UTC Operation has been created
Thu Mar 28 15:07:44 UTC All servers are up
Thu Mar 28 15:07:45 UTC Configure packages for all nodes
Thu Mar 28 15:07:53 UTC Bootstrap all nodes
Thu Mar 28 15:07:54 UTC Bootstrap master node VMIP.eu-central-1.compute.internal
Thu Mar 28 15:07:59 UTC Pull packages on master node VMIP.eu-central-1.compute.internal
Thu Mar 28 15:09:18 UTC Install system software on master nodes
Thu Mar 28 15:09:19 UTC Install system software on master node VMIP.eu-central-1.compute.internal
Thu Mar 28 15:09:21 UTC Install system package teleport:3.0.5 on master node VMIP.eu-central-1.compute.internal
Thu Mar 28 15:09:23 UTC Install system package planet:5.5.14-11305 on master node VMIP.eu-central-1.compute.internal
Thu Mar 28 15:09:27 UTC Install system package planet:5.5.14-11305 on master node VMIP.eu-central-1.compute.internal
Thu Mar 28 15:10:02 UTC Wait for kubernetes to become available
Thu Mar 28 15:10:22 UTC Bootstrap Kubernetes roles and PSPs
Thu Mar 28 15:10:25 UTC Configure CoreDNS
Thu Mar 28 15:10:27 UTC Populate Docker registry on master node VMIP.eu-central-1.compute.internal
Thu Mar 28 15:11:37 UTC Wait for cluster to pass health checks
Thu Mar 28 15:12:19 UTC Install system application dns-app:0.3.0
Thu Mar 28 15:12:35 UTC Install system application logging-app:5.0.2
Thu Mar 28 15:12:41 UTC Install system application monitoring-app:5.5.0
Thu Mar 28 15:12:51 UTC Install system application tiller-app:5.5.1
Thu Mar 28 15:13:04 UTC Install system application site:5.5.3
Thu Mar 28 15:20:35 UTC Operation failure: rpc error: code = Unknown desc = exit status 255
Failed to join the cluster

---
Agent process will keep running so you can re-run certain steps.
Once no longer needed, this process can be shutdown using Ctrl-C.

/var/log/gravity-install.log

clusterrole.rbac.authorization.k8s.io/gravity-site created
clusterrolebinding.rbac.authorization.k8s.io/gravity-site created
daemonset.extensions/gravity-site created
service/gravity-site created
role.rbac.authorization.k8s.io/gravity-site created
rolebinding.rbac.authorization.k8s.io/gravity-site created
Pod "gravity-install-8c711b-v5zv6" in namespace "kube-system", has changed state from "Running" to "Succeeded".
Container "gravity-install" changed status from "running" to "terminated, exit code 0".

Thu Mar 28 15:13:37 UTC [INFO] [IP.eu-central-1.compute.internal] Executing postInstall hook for site:5.5.3.
Created Pod "site-app-post-install-4dbe50-mvwgf" in namespace "kube-system".

Container "post-install-hook" created, current state is "waiting, reason PodInitializing".

Pod "site-app-post-install-4dbe50-mvwgf" in namespace "kube-system", has changed state from "Pending" to "Running".
Container "post-install-hook" changed status from "waiting, reason PodInitializing" to "running".

[ERROR]: failed connecting to https://gravity-site.kube-system.svc.cluster.local:3009/healthz
Container "post-install-hook" changed status from "running" to "terminated, exit code 255".

Container "post-install-hook" restarted, current state is "running".

[ERROR]: failed connecting to https://gravity-site.kube-system.svc.cluster.local:3009/healthz
Container "post-install-hook" changed status from "running" to "terminated, exit code 255".

Container "post-install-hook" changed status from "terminated, exit code 255" to "waiting, reason CrashLoopBackOff".

Container "post-install-hook" restarted, current state is "running".

[ERROR]: failed connecting to https://gravity-site.kube-system.svc.cluster.local:3009/healthz
Container "post-install-hook" changed status from "running" to "terminated, exit code 255".

Container "post-install-hook" changed status from "terminated, exit code 255" to "waiting, reason CrashLoopBackOff".

Container "post-install-hook" restarted, current state is "running".

[ERROR]: failed connecting to https://gravity-site.kube-system.svc.cluster.local:3009/healthz
Container "post-install-hook" changed status from "running" to "terminated, exit code 255".

Container "post-install-hook" changed status from "terminated, exit code 255" to "waiting, reason CrashLoopBackOff".

Container "post-install-hook" restarted, current state is "running".

[ERROR]: failed connecting to https://gravity-site.kube-system.svc.cluster.local:3009/healthz
Container "post-install-hook" changed status from "running" to "terminated, exit code 255".

Container "post-install-hook" changed status from "terminated, exit code 255" to "waiting, reason CrashLoopBackOff".

Container "post-install-hook" restarted, current state is "running".

[ERROR]: failed connecting to https://gravity-site.kube-system.svc.cluster.local:3009/healthz
Container "post-install-hook" changed status from "running" to "terminated, exit code 255".

Container "post-install-hook" changed status from "terminated, exit code 255" to "waiting, reason CrashLoopBackOff".

Container "post-install-hook" restarted, current state is "running".

Thu Mar 28 15:20:34 UTC [ERROR] [IP.eu-central-1.compute.internal] Phase execution failed: Job has reached the specified backoff limit, gravitational.io/site:5.5.3 postInstall hook failed.
--

Occurs with both UI and command line installer. I can access the healthz endpoint in gravity shell on all 3 machines.

Also, from inside the gravity-site container (via exec -it), gravity site shell returns "site is up and running".
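Since the hook log only says "failed connecting", it may help to separate a name-resolution failure from a refused or timed-out TCP connection. Below is a minimal Python sketch, assuming it is run from a pod inside the cluster. The helper names classify and check are hypothetical, and this is not what the post-install hook itself runs; it only opens a TCP connection to port 3009 and does not perform the HTTPS request.

```python
import socket


def classify(exc):
    """Map a connection exception to a coarse failure stage."""
    if isinstance(exc, socket.gaierror):
        return "dns"        # the name could not be resolved
    if isinstance(exc, socket.timeout):
        return "timeout"    # resolved, but no answer within the timeout
    if isinstance(exc, ConnectionRefusedError):
        return "refused"    # resolved and reachable, nothing listening
    return "other"


def check(host, port=3009, timeout=3.0, connect=socket.create_connection):
    """Try a TCP connection and report "ok" or the failure stage.

    `connect` is injectable so the logic can be exercised without a network.
    """
    try:
        with connect((host, port), timeout=timeout):
            return "ok"
    except OSError as exc:
        return classify(exc)


if __name__ == "__main__":
    # Service name taken from the hook's error message above.
    print(check("gravity-site.kube-system.svc.cluster.local"))
```

A result of "dns" would point at cluster DNS rather than at the gravity-site service itself.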

Running install as a bare metal install.


Installing with 5.3.5 vs. 5.5.3 produces two different DNS lookup results for the host VM IP in the gravity-site container.

5.5.3

/ $ cat /etc/resolv.conf
nameserver 127.0.0.2
nameserver 172.31.0.2
search eu-central-1.compute.internal
options ndots:2 timeout:1 attempts:2

/ $ nslookup 172.31.XX.XX
Server:    127.0.0.2
Address 1: 127.0.0.2

Name:      172.31.XX.XX
Address 1: 172.31.XX.XX 172-31-XX-XX.gravity-site.kube-system.svc.cluster.local

5.3.5

/ $ cat /etc/resolv.conf
nameserver 127.0.0.2
nameserver 172.31.0.2
search eu-central-1.compute.internal
options ndots:2 timeout:1 attempts:2

/ $ nslookup 172.31.XX.XX
Server:    127.0.0.2
Address 1: 127.0.0.2

Name:      172.31.XX.XX
Address 1: 172.31.XX.XX ip-172-31-XX-XX.eu-central-1.compute.internal

I replicated the job container and exec'd into it. From there, cluster DNS names cannot be resolved:

/ # nslookup gravity-site.kube-system.svc.cluster.local
Server:    10.100.14.135
Address 1: 10.100.14.135

nslookup: can't resolve 'gravity-site.kube-system.svc.cluster.local'
/ # 
/ # nslookup kubernetes.default
Server:    10.100.14.135
Address 1: 10.100.14.135

nslookup: can't resolve 'kubernetes.default'

knisbet commented 5 years ago

When asking for troubleshooting assistance, please use the Gravity community site (https://community.gravitational.com). We'd generally like to reserve the GitHub issue tracker for bug tracking.

To try to help you troubleshoot: the only time I've seen this failure is when the upstream DNS server is unavailable (in this case 172.31.0.2). Because the resolv.conf on your host contains a search domain (search eu-central-1.compute.internal), when looking up gravity-site.kube-system.svc.cluster.local the first DNS query will be for gravity-site.kube-system.svc.cluster.local.eu-central-1.compute.internal, which will time out if the upstream DNS server is unreachable, before the second attempt to resolve the name without the search domain.
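The search-domain expansion can be sketched in a few lines of Python. This is a simplified model of the ordering documented in resolv.conf(5), not Gravity or CoreDNS code. The function name candidate_queries is made up for illustration, and the ndots value of 5 is an assumption (the Kubernetes default for pods), since the job pod's resolv.conf is not shown in this thread.

```python
def candidate_queries(name, search, ndots):
    """Return the fully qualified names a stub resolver would try, in order.

    Simplified model of resolv.conf(5):
      - a name with a trailing dot is rooted and tried only as-is;
      - a name with at least `ndots` dots is tried as-is first,
        then with each search domain appended;
      - otherwise the search domains are tried first, then the name as-is.
    """
    if name.endswith("."):
        return [name]
    absolute = name + "."
    expanded = [f"{name}.{domain}." for domain in search]
    if name.count(".") >= ndots:
        return [absolute] + expanded
    return expanded + [absolute]


if __name__ == "__main__":
    search = ["eu-central-1.compute.internal"]  # from the host resolv.conf
    # ndots=5 is an assumed pod setting, not taken from this thread.
    for query in candidate_queries(
        "gravity-site.kube-system.svc.cluster.local", search, ndots=5
    ):
        print(query)
```

Under this model the ordering depends on ndots: a lower threshold puts the name as-is first, so the pod's actual /etc/resolv.conf is worth checking before drawing conclusions.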

It's also possible that CoreDNS has failed or the service is unavailable. To test this, try to resolve kubernetes.default.svc.cluster.local. from within a pod (note the trailing . to root the query and avoid invoking the search domain). If that fails, I would look into why the CoreDNS pods aren't running, or why the service network isn't routing to them.
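If the pod image has Python but no nslookup, the rooted lookup can be approximated as follows. This is a rough sketch using only the standard library; the helper names rooted and probe are hypothetical, and the resolver argument exists only so the logic can be exercised without a cluster.

```python
import socket
import time


def rooted(name):
    """Append the trailing dot that roots a DNS name, if it is missing."""
    return name if name.endswith(".") else name + "."


def probe(name, resolver=socket.getaddrinfo):
    """Resolve `name` as a rooted query and report addresses, error, and timing."""
    fqdn = rooted(name)
    start = time.monotonic()
    try:
        infos = resolver(fqdn, None)
        # Each entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the IP address.
        addresses = sorted({info[4][0] for info in infos})
        error = None
    except socket.gaierror as exc:
        addresses, error = [], str(exc)
    return {
        "name": fqdn,
        "addresses": addresses,
        "error": error,
        "seconds": time.monotonic() - start,
    }


if __name__ == "__main__":
    print(probe("kubernetes.default.svc.cluster.local"))
```

A long "seconds" value together with an error hints at a timeout rather than an immediate negative answer.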

dunefro commented 4 years ago

Encountered the same problem in a non-airgapped environment. We are trying to work out scenarios with multiple Helm charts, without any post-install hook. Still not sure where to look for the issue.

Logs of gravity-install

Wed Dec 18 05:05:57 UTC [INFO] Operation has been created
Wed Dec 18 05:13:17 UTC [INFO] Waiting for the provisioned nodes to come up
Wed Dec 18 05:13:17 UTC [INFO] All servers are up
Wed Dec 18 05:14:07 UTC [INFO] Executing phase: /configure.
Wed Dec 18 05:14:07 UTC [INFO] Configuring cluster packages.
Wed Dec 18 05:14:12 UTC [INFO] [aws08] Executing phase: /bootstrap/aws08.
Wed Dec 18 05:14:12 UTC [INFO] [aws08] Configuring system directories.
Wed Dec 18 05:14:12 UTC [INFO] [aws08] Creating system directory /var/lib/gravity/local/packages/blobs.
Wed Dec 18 05:14:12 UTC [INFO] [aws08] Creating system directory /var/lib/gravity/local/packages/unpacked.
Wed Dec 18 05:14:12 UTC [INFO] [aws08] Creating system directory /var/lib/gravity/local/packages/tmp.
Wed Dec 18 05:14:12 UTC [INFO] [aws08] Creating system directory /var/lib/gravity/teleport/auth.
Wed Dec 18 05:14:12 UTC [INFO] [aws08] Creating system directory /var/lib/gravity/teleport/node.
Wed Dec 18 05:14:12 UTC [INFO] [aws08] Creating system directory /var/lib/gravity/planet/state.
Wed Dec 18 05:14:12 UTC [INFO] [aws08] Creating system directory /var/lib/gravity/planet/etcd.
Wed Dec 18 05:14:13 UTC [INFO] [aws08] Creating system directory /var/lib/gravity/planet/registry.
Wed Dec 18 05:14:13 UTC [INFO] [aws08] Creating system directory /var/lib/gravity/planet/docker.
Wed Dec 18 05:14:13 UTC [INFO] [aws08] Creating system directory /var/lib/gravity/planet/share/hooks.
Wed Dec 18 05:14:13 UTC [INFO] [aws08] Creating system directory /var/lib/gravity/planet/log/journal.
Wed Dec 18 05:14:13 UTC [INFO] [aws08] Creating system directory /var/lib/gravity/site/teleport.
Wed Dec 18 05:14:13 UTC [INFO] [aws08] Creating system directory /var/lib/gravity/site/packages/unpacked.
Wed Dec 18 05:14:13 UTC [INFO] [aws08] Creating system directory /var/lib/gravity/site/packages/blobs.
Wed Dec 18 05:14:13 UTC [INFO] [aws08] Creating system directory /var/lib/gravity/site/packages/tmp.
Wed Dec 18 05:14:13 UTC [INFO] [aws08] Creating system directory /var/lib/gravity/secrets.
Wed Dec 18 05:14:13 UTC [INFO] [aws08] Creating system directory /var/lib/gravity/backup.
Wed Dec 18 05:14:14 UTC [INFO] [aws08] Setting ownership on system directory /var/lib/gravity to 1000:1000.
Wed Dec 18 05:14:14 UTC [INFO] [aws08] Setting ownership on system directory /var/lib/gravity/local to 1000:1000.
Wed Dec 18 05:14:14 UTC [INFO] [aws08] Setting ownership on system directory /var/lib/gravity/teleport to 1000:1000.
Wed Dec 18 05:14:14 UTC [INFO] [aws08] Setting ownership on system directory /var/lib/gravity/planet to 1000:1000.
Wed Dec 18 05:14:14 UTC [INFO] [aws08] Setting ownership on system directory /var/lib/gravity/site to 1000:1000.
Wed Dec 18 05:14:14 UTC [INFO] [aws08] Setting ownership on system directory /var/lib/gravity/secrets to 1000:1000.
Wed Dec 18 05:14:14 UTC [INFO] [aws08] Setting ownership on system directory /var/lib/gravity/backup to 1000:1000.
Wed Dec 18 05:14:14 UTC [INFO] [aws08] Setting mode on system directory /var/lib/gravity to -rwxr-xr-x.
Wed Dec 18 05:14:14 UTC [INFO] [aws08] Setting mode on system directory /var/lib/gravity/local to -rwxr-xr-x.
Wed Dec 18 05:14:15 UTC [INFO] [aws08] Configuring application-specific volumes.
Wed Dec 18 05:14:15 UTC [INFO] [aws08] Created agent user adminagent@Aspace.
Wed Dec 18 05:14:15 UTC [INFO] [aws08] Created DNS configuration: 127.0.0.2:53.
Wed Dec 18 05:14:16 UTC [INFO] [aws08] Executing phase: /pull/aws08.
Wed Dec 18 05:14:16 UTC [INFO] [aws08] Pulling user application.
Wed Dec 18 05:14:17 UTC [INFO] [aws08] Pulling package gravitational.io/gravity:6.2.5.
Wed Dec 18 05:14:17 UTC [INFO] [aws08] Pulling package gravitational.io/web-assets:6.2.5.
Wed Dec 18 05:14:18 UTC [INFO] [aws08] Pulling package gravitational.io/teleport:3.2.13.
Wed Dec 18 05:14:18 UTC [INFO] [aws08] Pulling package gravitational.io/planet:6.2.4-11603.
Wed Dec 18 05:14:21 UTC [INFO] [aws08] Pulling application gravitational.io/rbac-app:6.2.5.
Wed Dec 18 05:14:22 UTC [INFO] [aws08] Pulling application gravitational.io/dns-app:0.3.0.
Wed Dec 18 05:14:22 UTC [INFO] [aws08] Pulling application gravitational.io/bandwagon:6.0.1.
Wed Dec 18 05:14:23 UTC [INFO] [aws08] Pulling application gravitational.io/logging-app:6.0.2.
Wed Dec 18 05:14:24 UTC [INFO] [aws08] Pulling application gravitational.io/monitoring-app:6.0.4.
Wed Dec 18 05:14:26 UTC [INFO] [aws08] Pulling application gravitational.io/tiller-app:6.2.0.
Wed Dec 18 05:14:27 UTC [INFO] [aws08] Pulling application gravitational.io/site:6.2.5.
Wed Dec 18 05:14:29 UTC [INFO] [aws08] Pulling application gravitational.io/kubernetes:6.2.5.
Wed Dec 18 05:14:29 UTC [INFO] [aws08] Pulling application gravitational.io/sygmoid-installer:2.2.0.
Wed Dec 18 05:14:32 UTC [INFO] [aws08] Pulling configured packages.
Wed Dec 18 05:14:32 UTC [INFO] [aws08] Unpacking secrets into /var/lib/gravity/secrets.
Wed Dec 18 05:14:34 UTC [INFO] [aws08] Marking installed package: Aspace/planet-172.31.40.31-secrets:6.2.4-11603.
Wed Dec 18 05:14:34 UTC [INFO] [aws08] Marking installed package: Aspace/planet-config-172314031Aspace:6.2.4-11603.
Wed Dec 18 05:14:34 UTC [INFO] [aws08] Marking installed package: Aspace/teleport-node-config-172314031Aspace:3.2.13.
Wed Dec 18 05:14:34 UTC [INFO] [aws08] Marking installed package: gravitational.io/gravity:6.2.5.
Wed Dec 18 05:14:34 UTC [INFO] [aws08] Marking installed package: gravitational.io/teleport:3.2.13.
Wed Dec 18 05:14:34 UTC [INFO] [aws08] Marking runtime package: gravitational.io/planet:6.2.4-11603.
Wed Dec 18 05:14:34 UTC [INFO] [aws08] Unpacking pulled packages.
Wed Dec 18 05:14:34 UTC [INFO] [aws08] Unpacking package gravitational.io/planet:6.2.4-11603.
Wed Dec 18 05:14:54 UTC [INFO] [aws08] Unpacking package Aspace/cert-authority:0.0.1.
Wed Dec 18 05:14:55 UTC [INFO] [aws08] Unpacking package Aspace/planet-172.31.40.31-secrets:6.2.4-11603.
Wed Dec 18 05:14:55 UTC [INFO] [aws08] Unpacking package Aspace/planet-config-172314031Aspace:6.2.4-11603.
Wed Dec 18 05:14:55 UTC [INFO] [aws08] Unpacking package Aspace/teleport-master-config-172314031Aspace:3.2.13.
Wed Dec 18 05:14:55 UTC [INFO] [aws08] Unpacking package Aspace/teleport-node-config-172314031Aspace:3.2.13.
Wed Dec 18 05:14:55 UTC [INFO] [aws08] Unpacking package gravitational.io/teleport:3.2.13.
Wed Dec 18 05:14:56 UTC [INFO] [aws08] Unpacking package gravitational.io/web-assets:6.2.5.
Wed Dec 18 05:14:58 UTC [INFO] [aws08] Executing phase: /masters/aws08/teleport.
Wed Dec 18 05:14:58 UTC [INFO] [aws08] Installing system service teleport:3.2.13
Wed Dec 18 05:15:01 UTC [INFO] [aws08] Executing phase: /masters/aws08/planet.
Wed Dec 18 05:15:01 UTC [INFO] [aws08] Installing system service planet:6.2.4-11603
Wed Dec 18 05:15:22 UTC [INFO] [aws08] Executing phase: /wait.
Wed Dec 18 05:15:23 UTC [INFO] [aws08] Waiting for Kubernetes API to start: Get https://leader.telekube.local:6443/api/v1/componentstatuses/scheduler: cannot resolve non-cluster local address
Wed Dec 18 05:15:24 UTC [INFO] [aws08] Waiting for Kubernetes API to start: Get https://leader.telekube.local:6443/api/v1/componentstatuses/scheduler: cannot resolve non-cluster local address
Wed Dec 18 05:15:25 UTC [INFO] [aws08] Waiting for Kubernetes API to start: Get https://leader.telekube.local:6443/api/v1/componentstatuses/scheduler: cannot resolve non-cluster local address
Wed Dec 18 05:15:26 UTC [INFO] [aws08] Waiting for Kubernetes API to start: Get https://leader.telekube.local:6443/api/v1/componentstatuses/scheduler: cannot resolve non-cluster local address
Wed Dec 18 05:15:27 UTC [INFO] [aws08] Waiting for Kubernetes API to start: Get https://leader.telekube.local:6443/api/v1/componentstatuses/scheduler: cannot resolve non-cluster local address
Wed Dec 18 05:15:28 UTC [INFO] [aws08] Waiting for Kubernetes API to start: Get https://leader.telekube.local:6443/api/v1/componentstatuses/scheduler: cannot resolve non-cluster local address
Wed Dec 18 05:15:29 UTC [INFO] [aws08] Waiting for Kubernetes API to start: Get https://leader.telekube.local:6443/api/v1/componentstatuses/scheduler: cannot resolve non-cluster local address
Wed Dec 18 05:15:30 UTC [INFO] [aws08] Waiting for Kubernetes API to start: Get https://leader.telekube.local:6443/api/v1/componentstatuses/scheduler: cannot resolve non-cluster local address
Wed Dec 18 05:15:31 UTC [INFO] [aws08] Waiting for Kubernetes API to start: Get https://leader.telekube.local:6443/api/v1/componentstatuses/scheduler: cannot resolve non-cluster local address
Wed Dec 18 05:15:32 UTC [INFO] [aws08] Waiting for Kubernetes API to start: Get https://leader.telekube.local:6443/api/v1/componentstatuses/scheduler: cannot resolve non-cluster local address
Wed Dec 18 05:15:39 UTC [DEBUG] [aws08] Kubernetes API is available.
Wed Dec 18 05:15:39 UTC [DEBUG] [aws08] Kube-system namespace is available.
Wed Dec 18 05:15:39 UTC [INFO] [aws08] Kubernetes API is available.
Wed Dec 18 05:15:40 UTC [INFO] [aws08] Executing phase: /rbac.
Wed Dec 18 05:15:42 UTC [INFO] [aws08] Created Kubernetes RBAC resources.
Wed Dec 18 05:15:43 UTC [INFO] Executing phase: /coredns.
Wed Dec 18 05:15:43 UTC [INFO] Configuring CoreDNS.
Wed Dec 18 05:15:44 UTC [INFO] Executing phase: /system-resources.
Wed Dec 18 05:15:45 UTC [INFO] Configuring system Kubernetes resources.
Wed Dec 18 05:15:45 UTC [INFO] Created cluster-info config map.
Wed Dec 18 05:15:46 UTC [INFO] [aws08] Executing phase: /export/aws08.
Wed Dec 18 05:15:46 UTC [INFO] [aws08] Exporting application rbac-app:6.2.5 to local registry.
Wed Dec 18 05:15:48 UTC [INFO] [aws08] Exporting application dns-app:0.3.0 to local registry.
Wed Dec 18 05:15:50 UTC [INFO] [aws08] Exporting application bandwagon:6.0.1 to local registry.
Wed Dec 18 05:15:53 UTC [INFO] [aws08] Exporting application logging-app:6.0.2 to local registry.
Wed Dec 18 05:16:00 UTC [INFO] [aws08] Exporting application monitoring-app:6.0.4 to local registry.
Wed Dec 18 05:16:11 UTC [INFO] [aws08] Exporting application tiller-app:6.2.0 to local registry.
Wed Dec 18 05:16:13 UTC [INFO] [aws08] Exporting application site:6.2.5 to local registry.
Wed Dec 18 05:16:22 UTC [INFO] [aws08] Exporting application sygmoid-installer:2.2.0 to local registry.
Wed Dec 18 05:16:31 UTC [INFO] [aws08] Application gravitational.io/sygmoid-installer:2.2.0 exported.
Wed Dec 18 05:16:32 UTC [INFO] [aws08] Executing phase: /health.
Wed Dec 18 05:16:32 UTC [INFO] [aws08] Waiting for the planet to start.
Wed Dec 18 05:16:32 UTC [INFO] [aws08] Planet is running.
Wed Dec 18 05:16:33 UTC [INFO] [aws08] Executing phase: /runtime/dns-app.
Wed Dec 18 05:16:33 UTC [INFO] [aws08] Executing install hook for dns-app:0.3.0.
Created Pod "dns-app-install-a5461a-frgvq" in namespace "kube-system".

Container "hooks" created, current state is "waiting, reason PodInitializing".

Pod "dns-app-install-a5461a-frgvq" in namespace "kube-system", has changed state from "Pending" to "Running".
Container "hooks" changed status from "waiting, reason PodInitializing" to "running".

+ echo Assuming changeset from the environment: dns-030
Assuming changeset from the environment: dns-030
Creating new resources
+ [ install = update ]
+ [ install = rollback ]
+ [ install = install ]
+ echo Creating new resources
+ rig upsert -f /var/lib/gravity/resources/dns.yaml
changeset dns-030 updated 
+ echo Freezing
+ rig freeze
Freezing
changeset dns-030 frozen, no further modifications are allowed
Pod "dns-app-install-a5461a-frgvq" in namespace "kube-system", has changed state from "Running" to "Succeeded".
Container "hooks" changed status from "running" to "terminated, exit code 0".

<unknown> has completed, 15 seconds elapsed.
Wed Dec 18 05:16:49 UTC [DEBUG] [aws08] Application gravitational.io/dns-app:0.3.0 does not have postInstall hook.
Wed Dec 18 05:16:50 UTC [INFO] [aws08] Executing phase: /runtime/logging-app.
Wed Dec 18 05:16:50 UTC [INFO] [aws08] Executing install hook for logging-app:6.0.2.
Created Pod "logging-app-install-0bd917-lcbzg" in namespace "kube-system".

Container "hook" created, current state is "waiting, reason PodInitializing".

Pod "logging-app-install-0bd917-lcbzg" in namespace "kube-system", has changed state from "Pending" to "Running".
Container "hook" changed status from "waiting, reason PodInitializing" to "running".

service/log-collector created
configmap/log-forwarders created
configmap/log-collector created
deployment.extensions/log-collector created
service/lr-aggregator created
configmap/lr-aggregator created
deployment.extensions/lr-aggregator created
configmap/lr-collector created
daemonset.extensions/lr-collector created
configmap/lr-forwarder created
deployment.extensions/lr-forwarder created
Pod "logging-app-install-0bd917-lcbzg" in namespace "kube-system", has changed state from "Running" to "Succeeded".
Container "hook" changed status from "running" to "terminated, exit code 0".

<unknown> has completed, 7 seconds elapsed.
Wed Dec 18 05:16:57 UTC [DEBUG] [aws08] Application gravitational.io/logging-app:6.0.2 does not have postInstall hook.
Wed Dec 18 05:16:58 UTC [INFO] [aws08] Executing phase: /runtime/monitoring-app.
Wed Dec 18 05:16:58 UTC [INFO] [aws08] Executing install hook for monitoring-app:6.0.4.
Created Pod "monitoring-app-install-2436f0-cgx6z" in namespace "kube-system".

Container "hook" created, current state is "waiting, reason PodInitializing".

Pod "monitoring-app-install-2436f0-cgx6z" in namespace "kube-system", has changed state from "Pending" to "Running".
Container "hook" changed status from "waiting, reason PodInitializing" to "running".

namespace/monitoring created
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com created
serviceaccount/monitoring created
serviceaccount/monitoring-updater created
clusterrole.rbac.authorization.k8s.io/monitoring:metrics created
clusterrolebinding.rbac.authorization.k8s.io/monitoring:metrics created
role.rbac.authorization.k8s.io/monitoring created
rolebinding.rbac.authorization.k8s.io/monitoring created
role.rbac.authorization.k8s.io/monitoring:updater created
rolebinding.rbac.authorization.k8s.io/monitoring-updater created
configmap/grafana-cfg created
deployment.apps/grafana created
service/grafana created
servicemonitor.monitoring.coreos.com/grafana created
configmap/grafana-dashboard-k8s-cluster-rsrc-use created
configmap/grafana-dashboard-k8s-resources-cluster created
configmap/grafana-dashboard-k8s-resources-namespace created
configmap/grafana-dashboard-k8s-resources-pod created
configmap/grafana-dashboard-k8s-resources-workload created
configmap/grafana-dashboard-k8s-resources-workloads-namespace created
configmap/grafana-dashboard-nodes created
configmap/grafana-dashboard-pods created
configmap/grafana-dashboards created
secret/grafana created
secret/grafana-datasources created
deployment.apps/watcher created
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
deployment.apps/prometheus-operator created
service/prometheus-operator created
serviceaccount/prometheus-operator created
servicemonitor.monitoring.coreos.com/prometheus-operator created
secret/prometheus-additional-scrape-configs created
alertmanager.monitoring.coreos.com/main created
role.rbac.authorization.k8s.io/alertmanager-main created
rolebinding.rbac.authorization.k8s.io/alertmanager-main created
secret/alertmanager-main created
service/alertmanager-main created
serviceaccount/alertmanager-main created
servicemonitor.monitoring.coreos.com/alertmanager created
clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
deployment.apps/kube-state-metrics created
role.rbac.authorization.k8s.io/kube-state-metrics created
rolebinding.rbac.authorization.k8s.io/kube-state-metrics created
service/kube-state-metrics created
serviceaccount/kube-state-metrics created
servicemonitor.monitoring.coreos.com/kube-state-metrics created
clusterrole.rbac.authorization.k8s.io/node-exporter created
clusterrolebinding.rbac.authorization.k8s.io/node-exporter created
daemonset.apps/node-exporter created
service/node-exporter created
serviceaccount/node-exporter created
servicemonitor.monitoring.coreos.com/node-exporter created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
clusterrole.rbac.authorization.k8s.io/prometheus-adapter created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-adapter created
clusterrolebinding.rbac.authorization.k8s.io/resource-metrics:system:auth-delegator created
clusterrole.rbac.authorization.k8s.io/resource-metrics-server-resources created
configmap/adapter-config created
deployment.apps/prometheus-adapter created
rolebinding.rbac.authorization.k8s.io/resource-metrics-auth-reader created
service/prometheus-adapter created
serviceaccount/prometheus-adapter created
clusterrole.rbac.authorization.k8s.io/prometheus-k8s created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-k8s created
prometheus.monitoring.coreos.com/k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s-config created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s-config created
role.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s created
prometheusrule.monitoring.coreos.com/prometheus-k8s-rules created
service/prometheus-k8s created
serviceaccount/prometheus-k8s created
servicemonitor.monitoring.coreos.com/prometheus created
servicemonitor.monitoring.coreos.com/kube-apiserver created
servicemonitor.monitoring.coreos.com/coredns created
servicemonitor.monitoring.coreos.com/kubelet created
Pod "monitoring-app-install-2436f0-cgx6z" in namespace "kube-system", has changed state from "Running" to "Succeeded".
Container "hook" changed status from "running" to "terminated, exit code 0".

<unknown> has completed, 25 seconds elapsed.
Wed Dec 18 05:17:23 UTC [DEBUG] [aws08] Application gravitational.io/monitoring-app:6.0.4 does not have postInstall hook.
Wed Dec 18 05:17:24 UTC [INFO] [aws08] Executing phase: /runtime/tiller-app.
Wed Dec 18 05:17:24 UTC [INFO] [aws08] Executing install hook for tiller-app:6.2.0.
Created Pod "tiller-app-bootstrap-8ce0ae-rtzkp" in namespace "kube-system".

Container "hook" created, current state is "waiting, reason PodInitializing".

Pod "tiller-app-bootstrap-8ce0ae-rtzkp" in namespace "kube-system", has changed state from "Pending" to "Running".
Container "hook" changed status from "waiting, reason PodInitializing" to "running".

serviceaccount/tiller created
clusterrolebinding.rbac.authorization.k8s.io/tiller created
deployment.apps/tiller-deploy created
Pod "tiller-app-bootstrap-8ce0ae-rtzkp" in namespace "kube-system", has changed state from "Running" to "Succeeded".
Container "hook" changed status from "running" to "terminated, exit code 0".

Wed Dec 18 05:17:44 UTC [DEBUG] [aws08] Application gravitational.io/tiller-app:6.2.0 does not have postInstall hook.
Wed Dec 18 05:17:45 UTC [INFO] [aws08] Executing phase: /runtime/site.
Wed Dec 18 05:17:45 UTC [INFO] [aws08] Executing install hook for site:6.2.5.
Created Pod "gravity-install-fcef2f-m7jcj" in namespace "kube-system".

Container "gravity-install" created, current state is "waiting, reason PodInitializing".

Pod "gravity-install-fcef2f-m7jcj" in namespace "kube-system", has changed state from "Pending" to "Running".
Container "gravity-install" changed status from "waiting, reason PodInitializing" to "running".

2019-12-18T05:18:01Z DEBU             got search paths: [/var/lib/gravity/resources/config] processconfig/config.go:57
2019-12-18T05:18:01Z DEBU             look up configs in /var/lib/gravity/resources/config processconfig/config.go:59
2019-12-18T05:18:01Z DEBU             using ETCD backend processconfig/config.go:237
2019-12-18T05:18:01Z INFO [PROCESS]   Process ID: 172.31.40.31. mode:site process/process.go:285
2019-12-18T05:18:01Z INFO [PROCESS]   Teleport config: &service.Config{DataDir:"/var/lib/gravity/site/teleport", Hostname:"aws08", Token:"", AuthServers:[]utils.NetAddr{utils.NetAddr{Addr:"0.0.0.0:3025", AddrNetwork:"tcp", Path:""}}, Identities:[]*auth.Identity(nil), AdvertiseIP:"", CachePolicy:service.CachePolicy{Enabled:true, TTL:0, NeverExpires:false, RecentTTL:(*time.Duration)(nil)}, SSH:service.SSHConfig{Enabled:true, Addr:utils.NetAddr{Addr:"0.0.0.0:3022", AddrNetwork:"tcp", Path:""}, Namespace:"", Shell:"/bin/bash", Limiter:limiter.LimiterConfig{Rates:[]limiter.Rate(nil), MaxConnections:1000, MaxNumberOfUsers:250, Clock:timetools.TimeProvider(nil)}, Labels:map[string]string(nil), CmdLabels:services.CommandLabels(nil), PermitUserEnvironment:false, PAM:(*pam.Config)(0xc0007cd020), PublicAddrs:[]utils.NetAddr(nil)}, Auth:service.AuthConfig{Enabled:true, EnableProxyProtocol:true, SSHAddr:utils.NetAddr{Addr:"0.0.0.0:3025", AddrNetwork:"tcp", Path:""}, Authorities:[]services.CertAuthority(nil), Roles:[]services.Role(nil), ClusterName:(*services.ClusterNameV2)(0xc0007f5f10), StaticTokens:(*services.StaticTokensV2)(0xc000acb280), StorageConfig:backend.Config{Type:"dir", Params:backend.Params{"path":"/var/lib/gravity/site/teleport"}}, Limiter:limiter.LimiterConfig{Rates:[]limiter.Rate(nil), MaxConnections:1000, MaxNumberOfUsers:250, Clock:timetools.TimeProvider(nil)}, NoAudit:false, Preference:(*services.AuthPreferenceV2)(0xc0007f77c0), ClusterConfig:(*services.ClusterConfigV3)(0xc0003c6b40), LicenseFile:"/var/lib/teleport/license.pem", PublicAddrs:[]utils.NetAddr(nil)}, Keygen:sshca.Authority(nil), Proxy:service.ProxyConfig{Enabled:true, DisableTLS:false, DisableWebInterface:true, DisableWebService:true, DisableReverseTunnel:false, ReverseTunnelListenAddr:utils.NetAddr{Addr:"0.0.0.0:3024", AddrNetwork:"tcp", Path:""}, EnableProxyProtocol:true, WebAddr:utils.NetAddr{Addr:"0.0.0.0:3080", AddrNetwork:"tcp", Path:""}, SSHAddr:utils.NetAddr{Addr:"0.0.0.0:3023", 
AddrNetwork:"tcp", Path:""}, TLSKey:"", TLSCert:"", Limiter:limiter.LimiterConfig{Rates:[]limiter.Rate(nil), MaxConnections:1000, MaxNumberOfUsers:250, Clock:timetools.TimeProvider(nil)}, PublicAddrs:[]utils.NetAddr(nil), SSHPublicAddrs:[]utils.NetAddr(nil), Kube:service.KubeProxyConfig{Enabled:false, ListenAddr:utils.NetAddr{Addr:"0.0.0.0:3026", AddrNetwork:"tcp", Path:""}, APIAddr:utils.NetAddr{Addr:"", AddrNetwork:"", Path:""}, ClusterOverride:"", CACert:[]uint8(nil), PublicAddrs:[]utils.NetAddr(nil), KubeconfigPath:""}}, HostUUID:"", Console:(*io.PipeWriter)(0xc000010ef8), ReverseTunnels:[]services.ReverseTunnel(nil), OIDCConnectors:[]services.OIDCConnector(nil), PIDFile:"", Trust:(*usersservice.UsersService)(0xc0008f9b00), Presence:(*keyval.electingBackend)(0xc0008f9ad0), Provisioner:(*usersservice.UsersService)(0xc0008f9b00), Identity:(*usersservice.UsersService)(0xc0008f9b00), Access:(*usersservice.UsersService)(0xc0008f9b00), ClusterConfiguration:(*usersservice.UsersService)(0xc0008f9b00), CipherSuites:[]uint16{0xcca8, 0xcca9, 0xc02f, 0xc02b, 0xc030, 0xc02c, 0x9c, 0x9d}, Ciphers:[]string{"aes128-gcm@openssh.com", "aes128-ctr", "aes192-ctr", "aes256-ctr"}, KEXAlgorithms:[]string{"curve25519-sha256@libssh.org", "ecdh-sha2-nistp256", "ecdh-sha2-nistp384", "ecdh-sha2-nistp521"}, MACAlgorithms:[]string{"hmac-sha2-256-etm@openssh.com", "hmac-sha2-256"}, DiagnosticAddr:utils.NetAddr{Addr:"", AddrNetwork:"", Path:""}, Debug:false, UploadEventsC:(chan *events.UploadEvent)(nil), FileDescriptors:[]service.FileDescriptor(nil), PollingPeriod:10000000000, ClientTimeout:0, ShutdownTimeout:0, CAPin:"", Clock:clockwork.Clock(nil)}. mode:site process/process.go:302
2019-12-18T05:18:01Z INFO [PROCESS]   Gravity config: processconfig.Config{Hostname:"leader.telekube.local", Mode:"site", Profile:processconfig.ProfileConfig{HTTPEndpoint:"", OutputDir:""}, Devmode:false, ClusterName:"", WebAssetsDir:"", DataDir:"/var/lib/gravity/site", HealthAddr:utils.NetAddr{Addr:"0.0.0.0:3010", AddrNetwork:"tcp", Path:""}, BackendType:"etcd", ETCD:keyval.ETCDConfig{Clock:clockwork.Clock(nil), Nodes:[]string{"https://127.0.0.1:2379"}, Key:"/gravity/local", TLSKeyFile:"/var/lib/gravity/secrets/etcd.key", TLSCertFile:"/var/lib/gravity/secrets/etcd.cert", TLSCAFile:"/var/lib/gravity/secrets/root.cert", RetryInterval:0}, OpsCenter:processconfig.OpsCenterConfig{SeedConfig:(*ops.SeedConfig)(nil)}, Pack:processconfig.PackageServiceConfig{ListenAddr:utils.NetAddr{Addr:"0.0.0.0:3009", AddrNetwork:"tcp", Path:""}, PublicListenAddr:utils.NetAddr{Addr:"0.0.0.0:3007", AddrNetwork:"tcp", Path:""}, AdvertiseAddr:utils.NetAddr{Addr:"172.31.40.31:3009", AddrNetwork:"tcp", Path:""}, PublicAdvertiseAddr:utils.NetAddr{Addr:"", AddrNetwork:"", Path:""}, ReadDir:""}, Charts:processconfig.ChartsConfig{Backend:"local"}, Users:processconfig.Users(nil), InstallLogFiles:[]string(nil), ImportDir:"", ServiceUser:(*systeminfo.User)(0xc000a986c0), InstallToken:""}. mode:site process/process.go:303
2019-12-18T05:18:01Z DEBU [PROCESS]   Init from "/opt/gravity-import". mode:site process/process.go:464
2019-12-18T05:18:01Z INFO [IMPORTER]  Looking at Aspace/cert-authority:0.0.1. dir:/opt/gravity-import process/import.go:89
2019-12-18T05:18:01Z INFO [IMPORTER]  Looking at Aspace/planet-172.31.40.31-secrets:6.2.4-11603. dir:/opt/gravity-import process/import.go:89
2019-12-18T05:18:01Z INFO [IMPORTER]  Looking at Aspace/planet-config-172314031Aspace:6.2.4-11603. dir:/opt/gravity-import process/import.go:89
2019-12-18T05:18:01Z INFO [IMPORTER]  Looking at Aspace/site-export:0.0.1. dir:/opt/gravity-import process/import.go:89
2019-12-18T05:18:01Z INFO [IMPORTER]  Looking at Aspace/teleport-master-config-172314031Aspace:3.2.13. dir:/opt/gravity-import process/import.go:89
2019-12-18T05:18:01Z INFO [IMPORTER]  Looking at Aspace/teleport-node-config-172314031Aspace:3.2.13. dir:/opt/gravity-import process/import.go:89
2019-12-18T05:18:01Z INFO [IMPORTER]  Looking at gravitational.io/bandwagon:6.0.1. dir:/opt/gravity-import process/import.go:89
2019-12-18T05:18:01Z INFO [IMPORTER]  Looking at gravitational.io/dns-app:0.3.0. dir:/opt/gravity-import process/import.go:89
2019-12-18T05:18:01Z INFO [IMPORTER]  Looking at gravitational.io/gravity:6.2.5. dir:/opt/gravity-import process/import.go:89
2019-12-18T05:18:01Z INFO [IMPORTER]  Looking at gravitational.io/kubernetes:6.2.5. dir:/opt/gravity-import process/import.go:89
2019-12-18T05:18:01Z INFO [IMPORTER]  Looking at gravitational.io/logging-app:6.0.2. dir:/opt/gravity-import process/import.go:89
2019-12-18T05:18:01Z INFO [IMPORTER]  Looking at gravitational.io/monitoring-app:6.0.4. dir:/opt/gravity-import process/import.go:89
2019-12-18T05:18:01Z INFO [IMPORTER]  Looking at gravitational.io/planet:6.2.4-11603. dir:/opt/gravity-import process/import.go:89
2019-12-18T05:18:01Z INFO [IMPORTER]  Looking at gravitational.io/rbac-app:6.2.5. dir:/opt/gravity-import process/import.go:89
2019-12-18T05:18:01Z INFO [IMPORTER]  Looking at gravitational.io/site:6.2.5. dir:/opt/gravity-import process/import.go:89
2019-12-18T05:18:01Z INFO [IMPORTER]  Looking at gravitational.io/sygmoid-installer:2.2.0. dir:/opt/gravity-import process/import.go:89
2019-12-18T05:18:01Z INFO [IMPORTER]  Looking at gravitational.io/teleport:3.2.13. dir:/opt/gravity-import process/import.go:89
2019-12-18T05:18:01Z INFO [IMPORTER]  Looking at gravitational.io/tiller-app:6.2.0. dir:/opt/gravity-import process/import.go:89
2019-12-18T05:18:01Z INFO [IMPORTER]  Looking at gravitational.io/web-assets:6.2.5. dir:/opt/gravity-import process/import.go:89
2019-12-18T05:18:01Z DEBU [IMPORTER]  Importing cluster data. dir:/opt/gravity-import process/import.go:200
2019-12-18T05:18:01Z DEBU [IMPORTER]  Importing packages. dir:/opt/gravity-import process/import.go:164
2019-12-18T05:18:01Z DEBU [BLOB]      Got enough success writes for cb5a15b5d6011dd6488d547c3419e92de3e48f1be3183cd1b65f5cc952253c6a [172.31.40.31]. addr:https://172.31.40.31:3009 id:172.31.40.31 cluster/cluster.go:305
2019-12-18T05:18:01Z DEBU [IMPORTER]  Imported Aspace/cert-authority:0.0.1 in 85.111898ms. dir:/opt/gravity-import process/import.go:188
2019-12-18T05:18:01Z DEBU [BLOB]      Got enough success writes for 03fb56007e9d649f4b9edaec1f45bb8d0d6a2a6d4a55ead5ee9a0e42ef4b572c [172.31.40.31]. addr:https://172.31.40.31:3009 id:172.31.40.31 cluster/cluster.go:305
2019-12-18T05:18:01Z DEBU [IMPORTER]  Imported Aspace/planet-172.31.40.31-secrets:6.2.4-11603 in 61.228124ms. dir:/opt/gravity-import process/import.go:188
2019-12-18T05:18:01Z DEBU [BLOB]      Got enough success writes for 5f08ab7703218ed9538abe43d89de145cc63fed808cbfd9d7f74d75905d11f74 [172.31.40.31]. addr:https://172.31.40.31:3009 id:172.31.40.31 cluster/cluster.go:305
2019-12-18T05:18:01Z DEBU [IMPORTER]  Imported Aspace/planet-config-172314031Aspace:6.2.4-11603 in 98.59177ms. dir:/opt/gravity-import process/import.go:188
2019-12-18T05:18:01Z DEBU [BLOB]      Got enough success writes for a4c40e0d74ca2dbbbc12d1bb2f32b45f76e3839f446217a95e4bdcab4316a8ea [172.31.40.31]. addr:https://172.31.40.31:3009 id:172.31.40.31 cluster/cluster.go:305
2019-12-18T05:18:01Z DEBU [IMPORTER]  Imported Aspace/teleport-master-config-172314031Aspace:3.2.13 in 106.194973ms. dir:/opt/gravity-import process/import.go:188
2019-12-18T05:18:02Z DEBU [BLOB]      Got enough success writes for f2c63d03f61e1b65a3b0cdef1174fbf7e8930d1921ed12dd044ce0525ae5a763 [172.31.40.31]. addr:https://172.31.40.31:3009 id:172.31.40.31 cluster/cluster.go:305
2019-12-18T05:18:02Z DEBU [IMPORTER]  Imported Aspace/teleport-node-config-172314031Aspace:3.2.13 in 41.815775ms. dir:/opt/gravity-import process/import.go:188
2019-12-18T05:18:03Z DEBU [BLOB]      Got enough success writes for a03a63849b666ab7656ba504cd637874f891ccb3967e12a971b83d6cd2036967 [172.31.40.31]. addr:https://172.31.40.31:3009 id:172.31.40.31 cluster/cluster.go:305
2019-12-18T05:18:03Z DEBU [IMPORTER]  Imported gravitational.io/bandwagon:6.0.1 in 1.587558364s. dir:/opt/gravity-import process/import.go:188
2019-12-18T05:18:04Z DEBU [BLOB]      Got enough success writes for cb80d761c2b3c6a26e174d83a6eb8a96871cac4cd66cd5b55108bf5854dc78b6 [172.31.40.31]. addr:https://172.31.40.31:3009 id:172.31.40.31 cluster/cluster.go:305
2019-12-18T05:18:04Z DEBU [IMPORTER]  Imported gravitational.io/dns-app:0.3.0 in 908.807311ms. dir:/opt/gravity-import process/import.go:188
2019-12-18T05:18:06Z DEBU [BLOB]      Got enough success writes for d94a58c2412b6c680311b30a32bc5c1689da58156d716aa5c73b3c798c87aa3a [172.31.40.31]. addr:https://172.31.40.31:3009 id:172.31.40.31 cluster/cluster.go:305
2019-12-18T05:18:06Z DEBU [IMPORTER]  Imported gravitational.io/gravity:6.2.5 in 1.930347883s. dir:/opt/gravity-import process/import.go:188
2019-12-18T05:18:06Z DEBU [BLOB]      Got enough success writes for ce34ee5f7df69a20878be6d370ed8772d8c34c6d7e31421602a43c7e719b4400 [172.31.40.31]. addr:https://172.31.40.31:3009 id:172.31.40.31 cluster/cluster.go:305
2019-12-18T05:18:06Z DEBU [IMPORTER]  Imported gravitational.io/kubernetes:6.2.5 in 90.299437ms. dir:/opt/gravity-import process/import.go:188
2019-12-18T05:18:07Z DEBU [BLOB]      Got enough success writes for 55e31b086464ed17a16e27a63ba83e5cb7a6e9d878e917b071f6795be8da17b8 [172.31.40.31]. addr:https://172.31.40.31:3009 id:172.31.40.31 cluster/cluster.go:305
2019-12-18T05:18:07Z DEBU [IMPORTER]  Imported gravitational.io/logging-app:6.0.2 in 1.403867378s. dir:/opt/gravity-import process/import.go:188
2019-12-18T05:18:10Z DEBU [BLOB]      Got enough success writes for 7eeec679755fa0e9446e381fd4b73e3529b76643b3d58f63ecc868b8801960a1 [172.31.40.31]. addr:https://172.31.40.31:3009 id:172.31.40.31 cluster/cluster.go:305
2019-12-18T05:18:10Z DEBU [IMPORTER]  Imported gravitational.io/monitoring-app:6.0.4 in 2.937624813s. dir:/opt/gravity-import process/import.go:188
2019-12-18T05:18:16Z DEBU [BLOB]      Got enough success writes for 9df345dd7aa74ae48d0c7d2fcce6627b71f3f2b461b96981d8c1f36636a03195 [172.31.40.31]. addr:https://172.31.40.31:3009 id:172.31.40.31 cluster/cluster.go:305
2019-12-18T05:18:16Z DEBU [IMPORTER]  Imported gravitational.io/planet:6.2.4-11603 in 5.207132594s. dir:/opt/gravity-import process/import.go:188
2019-12-18T05:18:16Z DEBU [BLOB]      Got enough success writes for 894391cf76884a72fe6cc71140a508e7a16271a12fb17b261405f44d4d7e3585 [172.31.40.31]. addr:https://172.31.40.31:3009 id:172.31.40.31 cluster/cluster.go:305
2019-12-18T05:18:16Z DEBU [IMPORTER]  Imported gravitational.io/rbac-app:6.2.5 in 68.075505ms. dir:/opt/gravity-import process/import.go:188
2019-12-18T05:18:16Z DEBU [BLOB]      Got enough success writes for 96920199b6450f7f987c939b9ff779035520856b99aa3f3983312e53396e99bb [172.31.40.31]. addr:https://172.31.40.31:3009 id:172.31.40.31 cluster/cluster.go:305
2019-12-18T05:18:16Z DEBU [IMPORTER]  Imported gravitational.io/site:6.2.5 in 435.298334ms. dir:/opt/gravity-import process/import.go:188
2019-12-18T05:18:18Z DEBU [BLOB]      Got enough success writes for 2a3f093d9292850e0fb995bfc1eb54a5e6e5641bf686f46f9aec9ef88a14af66 [172.31.40.31]. addr:https://172.31.40.31:3009 id:172.31.40.31 cluster/cluster.go:305
2019-12-18T05:18:18Z DEBU [IMPORTER]  Imported gravitational.io/sygmoid-installer:2.2.0 in 1.880985352s. dir:/opt/gravity-import process/import.go:188
2019-12-18T05:18:18Z DEBU [BLOB]      Got enough success writes for 32f0f175806cd35de2c50f5a771e7ff4d59d8719fc0b8777697782dd1201d38f [172.31.40.31]. addr:https://172.31.40.31:3009 id:172.31.40.31 cluster/cluster.go:305
2019-12-18T05:18:18Z DEBU [IMPORTER]  Imported gravitational.io/teleport:3.2.13 in 409.65983ms. dir:/opt/gravity-import process/import.go:188
2019-12-18T05:18:19Z DEBU [BLOB]      Got enough success writes for 084390f3ca0d2dd805b91f70eb1ccb844a110cb70b13fa59e553e96235c0d049 [172.31.40.31]. addr:https://172.31.40.31:3009 id:172.31.40.31 cluster/cluster.go:305
2019-12-18T05:18:19Z DEBU [IMPORTER]  Imported gravitational.io/tiller-app:6.2.0 in 243.417898ms. dir:/opt/gravity-import process/import.go:188
2019-12-18T05:18:19Z DEBU [BLOB]      Got enough success writes for 6f60a0a25ea6642186587eab7dd7ead7b15f7c0cb269ed0ebe0113debd0310a3 [172.31.40.31]. addr:https://172.31.40.31:3009 id:172.31.40.31 cluster/cluster.go:305
2019-12-18T05:18:19Z DEBU [IMPORTER]  Imported gravitational.io/web-assets:6.2.5 in 250.241564ms. dir:/opt/gravity-import process/import.go:188
2019-12-18T05:18:19Z DEBU [IMPORTER]  Closing backend. dir:/opt/gravity-import process/import.go:109
2019-12-18T05:18:19Z INFO             2019/12/18 05:18:19 [INFO] generate received request runtime/asm_amd64.s:1337
2019-12-18T05:18:19Z INFO             2019/12/18 05:18:19 [INFO] received CSR runtime/asm_amd64.s:1337
2019-12-18T05:18:19Z INFO             2019/12/18 05:18:19 [INFO] generating key: rsa-2048 runtime/asm_amd64.s:1337
2019-12-18T05:18:19Z INFO             2019/12/18 05:18:19 [INFO] encoded CSR runtime/asm_amd64.s:1337
2019-12-18T05:18:19Z INFO             2019/12/18 05:18:19 [INFO] signed certificate with serial number 262540636618560386158188552329125601178111394437 runtime/asm_amd64.s:1337
2019-12-18T05:18:19Z INFO             2019/12/18 05:18:19 [INFO] generate received request runtime/asm_amd64.s:1337
2019-12-18T05:18:19Z INFO             2019/12/18 05:18:19 [INFO] received CSR runtime/asm_amd64.s:1337
2019-12-18T05:18:19Z INFO             2019/12/18 05:18:19 [INFO] generating key: rsa-2048 runtime/asm_amd64.s:1337
2019-12-18T05:18:19Z INFO             2019/12/18 05:18:19 [INFO] encoded CSR runtime/asm_amd64.s:1337
2019-12-18T05:18:19Z INFO             2019/12/18 05:18:19 [INFO] signed certificate with serial number 289125573576691353172630590606569663296411103074 runtime/asm_amd64.s:1337
2019-12-18T05:18:19Z INFO             2019/12/18 05:18:19 [INFO] generate received request runtime/asm_amd64.s:1337
2019-12-18T05:18:19Z INFO             2019/12/18 05:18:19 [INFO] received CSR runtime/asm_amd64.s:1337
2019-12-18T05:18:19Z INFO             2019/12/18 05:18:19 [INFO] generating key: rsa-2048 runtime/asm_amd64.s:1337
2019-12-18T05:18:19Z INFO             2019/12/18 05:18:19 [INFO] encoded CSR runtime/asm_amd64.s:1337
2019-12-18T05:18:19Z INFO             2019/12/18 05:18:19 [INFO] signed certificate with serial number 22651691747850937588004698011244423357514672252 runtime/asm_amd64.s:1337
2019-12-18T05:18:19Z DEBU [BLOB]      Got enough success writes for 70ee955a415090a9dd8e959804b87cbb9050d2ac2f3018b549503fcc018851d1 [172.31.40.31]. addr:https://172.31.40.31:3009 id:172.31.40.31 cluster/cluster.go:305
2019-12-18T05:18:19Z INFO [PROCESS]   Initialized RPC credentials: gravitational.io/rpcagent-secrets:0.0.1. mode:site process/process.go:489
configmap/gravity-opscenter created
Creating cluster configmap
configmap/gravity-site created
serviceaccount/gravity-site created
role.rbac.authorization.k8s.io/gravity-site created
rolebinding.rbac.authorization.k8s.io/gravity-site created
clusterrole.rbac.authorization.k8s.io/gravity-site created
clusterrolebinding.rbac.authorization.k8s.io/gravity-site created
daemonset.apps/gravity-site created
service/gravity-site created
role.rbac.authorization.k8s.io/gravity-site created
rolebinding.rbac.authorization.k8s.io/gravity-site created
Pod "gravity-install-fcef2f-m7jcj" in namespace "kube-system", has changed state from "Running" to "Succeeded".
Container "gravity-install" changed status from "running" to "terminated, exit code 0".

Wed Dec 18 05:18:27 UTC [INFO] [aws08] Executing postInstall hook for site:6.2.5.
Created Pod "site-app-post-install-71d352-rh4n8" in namespace "kube-system".

Container "post-install-hook" created, current state is "waiting, reason PodInitializing".

Pod "site-app-post-install-71d352-rh4n8" in namespace "kube-system", has changed state from "Pending" to "Running".
Container "post-install-hook" changed status from "waiting, reason PodInitializing" to "running".

[ERROR]: failed connecting to https://gravity-site.kube-system.svc.cluster.local:3009/healthz
Container "post-install-hook" changed status from "running" to "terminated, exit code 255".

Container "post-install-hook" restarted, current state is "running".

[ERROR]: failed connecting to https://gravity-site.kube-system.svc.cluster.local:3009/healthz
Container "post-install-hook" changed status from "running" to "terminated, exit code 255".

Container "post-install-hook" changed status from "terminated, exit code 255" to "waiting, reason CrashLoopBackOff".

Pod "site-app-post-install-71d352-rh4n8" in namespace "kube-system", has changed state from "Running" to "Succeeded".
Container "post-install-hook" restarted, current state is "terminated, exit code 0".

Wed Dec 18 05:18:53 UTC [INFO] [aws08] Executing phase: /runtime/kubernetes.
Wed Dec 18 05:18:53 UTC [DEBUG] [aws08] Application gravitational.io/kubernetes:6.2.5 does not have install hook.
Wed Dec 18 05:18:53 UTC [DEBUG] [aws08] Application gravitational.io/kubernetes:6.2.5 does not have postInstall hook.
Wed Dec 18 05:18:55 UTC [INFO] [aws08] Executing phase: /app/sygmoid-installer.
Wed Dec 18 05:18:55 UTC [INFO] [aws08] Executing install hook for sygmoid-installer:2.2.0.
Created Pod "install-60b3ee-782pn" in namespace "kube-system".
Container "install" created, current state is "waiting, reason PodInitializing".

Pod "install-60b3ee-782pn" in namespace "kube-system", has changed state from "Pending" to "Running".

Container "install" restarted, current state is "waiting, reason RunContainerError".

Container "install" restarted, current state is "waiting, reason RunContainerError".

Container "install" restarted, current state is "waiting, reason RunContainerError".

Container "install" restarted, current state is "waiting, reason RunContainerError".

Container "install" restarted, current state is "waiting, reason RunContainerError".

Container "install" restarted, current state is "waiting, reason RunContainerError".

Wed Dec 18 05:25:00 UTC [ERROR] [aws08] Phase execution failed: Job has reached the specified backoff limit, gravitational.io/sygmoid-installer:2.2.0 install hook failed.

I am able to connect to https://gravity-site.kube-system.svc.cluster.local:3009/healthz from inside a pod:

ubuntu@aws08:~/workspace/gravity$ kubectl run new -it --rm --image=alpine
kubectl run --generator=deployment/apps.v1 is DEPRECATED and will be removed in a future version. Use kubectl run --generator=run-pod/v1 or kubectl create instead.
If you don't see a command prompt, try pressing enter.

/ # nslookup kubernetes.default.svc.cluster.local.
nslookup: can't resolve '(null)': Name does not resolve

Name:      kubernetes.default.svc.cluster.local.
Address 1: 10.100.0.1 kubernetes.default.svc.cluster.local

/ # apk add curl
fetch http://dl-cdn.alpinelinux.org/alpine/v3.10/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.10/community/x86_64/APKINDEX.tar.gz
(1/4) Installing ca-certificates (20190108-r0)
(2/4) Installing nghttp2-libs (1.39.2-r0)
(3/4) Installing libcurl (7.66.0-r0)
(4/4) Installing curl (7.66.0-r0)
Executing busybox-1.30.1-r2.trigger
Executing ca-certificates-20190108-r0.trigger
OK: 7 MiB in 18 packages

/ # curl -k https://gravity-site.kube-system.svc.cluster.local:3009/healthz 
{"info":"service is up and running","status":"ok"}/ # 
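
In the install log above, the post-install hook failed twice against the same /healthz URL before the pod ended up "Succeeded", so the endpoint may simply not have been ready on the first attempts. A minimal sketch of repeating the manual check with retries, assuming it is run from a pod inside the cluster; the `retry` and `check_healthz` helpers are hypothetical names and not part of gravity:

```shell
# retry ATTEMPTS DELAY COMMAND...: run COMMAND until it succeeds or ATTEMPTS is used up.
# Hypothetical helper for illustration only.
retry() {
  retry_attempts=$1
  retry_delay=$2
  shift 2
  retry_i=1
  while [ "$retry_i" -le "$retry_attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $retry_i/$retry_attempts failed" >&2
    if [ "$retry_i" -lt "$retry_attempts" ]; then
      sleep "$retry_delay"
    fi
    retry_i=$((retry_i + 1))
  done
  return 1
}

# Same URL as in the session above. -k skips certificate verification,
# -s silences progress output, -f turns HTTP error codes into a non-zero exit.
check_healthz() {
  curl -ksf --max-time 5 \
    https://gravity-site.kube-system.svc.cluster.local:3009/healthz >/dev/null
}

# Usage, from a shell inside the cluster (30 attempts, 2 seconds apart):
#   retry 30 2 check_healthz && echo "gravity-site is healthy"
```

The attempt count and delay are arbitrary; they only need to cover the time gravity-site takes to start serving.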

Herman-Levin commented 4 years ago

In my case I had IPv6 enabled, and my coredns pod went crazy because of it, which kept the cluster from starting. Disabling IPv6 resolved the issue for me.
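
The comment does not say how IPv6 was disabled. As a sketch only, assuming a Linux node where the standard `net.ipv6.conf.*.disable_ipv6` kernel flags are available, the current state can be checked like this (the `ipv6_state` function is a hypothetical helper):

```shell
# ipv6_state [FILE]: report whether IPv6 is disabled according to the kernel flag file.
# FILE defaults to the flag that covers all interfaces.
ipv6_state() {
  flag_file=${1:-/proc/sys/net/ipv6/conf/all/disable_ipv6}
  if [ ! -r "$flag_file" ]; then
    echo "unknown"
    return 0
  fi
  case "$(cat "$flag_file")" in
    1) echo "disabled" ;;
    0) echo "enabled" ;;
    *) echo "unknown" ;;
  esac
}

ipv6_state

# One common way to disable IPv6 until the next reboot (requires root);
# this is an assumption, not necessarily what was done in this thread:
#   sysctl -w net.ipv6.conf.all.disable_ipv6=1
#   sysctl -w net.ipv6.conf.default.disable_ipv6=1
```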

dunefro commented 4 years ago

@Herman-Levin Turns out that I hadn't made the script executable. It has to be executable because I was running it with the /bin/bash -c option.
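
A small reproduction of that failure mode, assuming the hook runs a script by path through `/bin/bash -c`; `hook.sh` is a hypothetical file name and the script body is a placeholder:

```shell
# Work in a throwaway directory so nothing on the host is touched.
workdir=$(mktemp -d)
printf '#!/bin/bash\necho "hook ran"\n' > "$workdir/hook.sh"

# With -c, bash executes the path as a command, so the file needs the
# executable bit. Without it, bash reports "Permission denied" and exits with 126.
before=0
/bin/bash -c "$workdir/hook.sh" || before=$?
echo "exit status before chmod: $before"

# After chmod +x the same invocation succeeds.
chmod +x "$workdir/hook.sh"
after=0
/bin/bash -c "$workdir/hook.sh" || after=$?
echo "exit status after chmod: $after"

rm -rf "$workdir"
```

By contrast, `/bin/bash hook.sh` (without `-c`) only reads the file as input, so it works without the executable bit.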

prco5 commented 4 years ago

I have the same problem. How did you actually solve it?