openshift / openshift-ansible

Install and config an OpenShift 3.x cluster
https://try.openshift.com
Apache License 2.0

Installation fails on TASK [openshift_service_catalog : Verify that the catalog api server is running] #10653

Closed giraldo925 closed 4 years ago

giraldo925 commented 6 years ago

I am trying to install a 3-node cluster on fresh installations of RHEL 7.6. The installation gets stuck on `FAILED - RETRYING: Verify that the catalog api server is running` and eventually fails. Any help or hints would be much appreciated. See the logs below:

`FAILED - RETRYING: Verify that the catalog api server is running (1 retries left). fatal: [master.rhel.io]: FAILED! => {"attempts": 60, "changed": false, "cmd": ["curl", "-k", "https://apiserver.kube-service-catalog.svc/healthz"], "delta": "0:00:01.019324", "end": "2018-11-09 10:44:10.012405", "msg": "non-zero return code", "rc": 7, "start": "2018-11-09 10:44:08.993081", "stderr": " % Total % Received % Xferd Average Speed Time Time Time Current\n Dload Upload Total Spent Left Speed\n\r 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\r 0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0curl: (7) Failed connect to apiserver.kube-service-catalog.svc:443; Connection refused", "stderr_lines": [" % Total % Received % Xferd Average Speed Time Time Time Current", " Dload Upload Total Spent Left Speed", "", " 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0", " 0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0curl: (7) Failed connect to apiserver.kube-service-catalog.svc:443; Connection refused"], "stdout": "", "stdout_lines": []} ...ignoring

TASK [openshift_service_catalog : Check status in the kube-service-catalog namespace] ***** changed: [master.rhel.io]

TASK [openshift_service_catalog : debug] ** ok: [master.rhel.io] => { "msg": [ "In project kube-service-catalog on server https://master.rhel.io:8443", "", "https://apiserver-kube-service-catalog.router.default.svc.cluster.local (passthrough) to pod port secure (svc/apiserver)", " daemonset/apiserver manages registry.redhat.io/openshift3/ose-service-catalog:v3.11.16", " generation #1 running for 12 minutes - 0/1 pods growing to 1", " pod/apiserver-gfpvz runs registry.redhat.io/openshift3/ose-service-catalog:v3.11.16", "", "svc/controller-manager - 172.30.77.30:443 -> 6443", " daemonset/controller-manager manages registry.redhat.io/openshift3/ose-service-catalog:v3.11.16", " generation #1 running for 11 minutes - 0/1 pods growing to 1", " pod/controller-manager-k4sfz runs registry.redhat.io/openshift3/ose-service-catalog:v3.11.16", "", "Errors:", " pod/apiserver-gfpvz is crash-looping", " pod/controller-manager-k4sfz is crash-looping", "", "2 errors, 2 infos identified, use 'oc status --suggest' to see details." ] }

TASK [openshift_service_catalog : Get pods in the kube-service-catalog namespace] ***** changed: [master.rhel.io]

TASK [openshift_service_catalog : debug] ** ok: [master.rhel.io] => { "msg": [ "NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE", "apiserver-gfpvz 0/1 CrashLoopBackOff 6 12m 10.128.0.55 master.rhel.io ", "controller-manager-k4sfz 0/1 CrashLoopBackOff 7 11m 10.128.0.56 master.rhel.io " ] }

TASK [openshift_service_catalog : Get events in the kube-service-catalog namespace] *** changed: [master.rhel.io]

TASK [openshift_service_catalog : debug] ** ok: [master.rhel.io] => { "msg": [ "LAST SEEN FIRST SEEN COUNT NAME KIND SUBOBJECT TYPE REASON SOURCE MESSAGE", "12m 12m 1 apiserver.15657e7e6110531a DaemonSet Normal SuccessfulCreate daemonset-controller Created pod: apiserver-gfpvz", "12m 12m 1 apiserver-gfpvz.15657e7f07573a40 Pod spec.containers{apiserver} Normal Pulling kubelet, master.rhel.io pulling image \"registry.redhat.io/openshift3/ose-service-catalog:v3.11.16\"", "12m 12m 1 apiserver-gfpvz.15657e7f6020eace Pod spec.containers{apiserver} Normal Pulled kubelet, master.rhel.io Successfully pulled image \"registry.redhat.io/openshift3/ose-service-catalog:v3.11.16\"", "11m 11m 1 controller-manager.15657e8048a774fa DaemonSet Normal SuccessfulCreate daemonset-controller Created pod: controller-manager-k4sfz", "11m 11m 3 controller-manager-k4sfz.15657e8055ec7a6d Pod Warning FailedMount kubelet, master.rhel.io MountVolume.SetUp failed for volume \"service-catalog-ssl\" : secrets \"controllermanager-ssl\" not found", "11m 11m 1 service-catalog-controller-manager.15657e81ba50acd3 ConfigMap Normal LeaderElection service-catalog-controller-manager controller-manager-k4sfz-external-service-catalog-controller became leader", "11m 11m 1 service-catalog-controller-manager.15657e82061f9e7a ConfigMap Normal LeaderElection service-catalog-controller-manager controller-manager-k4sfz-external-service-catalog-controller became leader", "11m 11m 1 service-catalog-controller-manager.15657e8564428855 ConfigMap Normal LeaderElection service-catalog-controller-manager controller-manager-k4sfz-external-service-catalog-controller became leader", "11m 11m 1 service-catalog-controller-manager.15657e8b3859dbb5 ConfigMap Normal LeaderElection service-catalog-controller-manager controller-manager-k4sfz-external-service-catalog-controller became leader", "10m 10m 1 service-catalog-controller-manager.15657e94faa5664c ConfigMap Normal LeaderElection service-catalog-controller-manager 
controller-manager-k4sfz-external-service-catalog-controller became leader", "10m 11m 5 controller-manager-k4sfz.15657e81a8c4dcbd Pod spec.containers{controller-manager} Normal Pulled kubelet, master.rhel.io Container image \"registry.redhat.io/openshift3/ose-service-catalog:v3.11.16\" already present on machine", "10m 11m 5 controller-manager-k4sfz.15657e81aade086a Pod spec.containers{controller-manager} Normal Created kubelet, master.rhel.io Created container", "10m 11m 5 controller-manager-k4sfz.15657e81b57d0f1a Pod spec.containers{controller-manager} Normal Started kubelet, master.rhel.io Started container", "9m 11m 4 apiserver-gfpvz.15657e8233e9a0ff Pod spec.containers{apiserver} Normal Pulled kubelet, master.rhel.io Container image \"registry.redhat.io/openshift3/ose-service-catalog:v3.11.16\" already present on machine", "9m 12m 5 apiserver-gfpvz.15657e7f6deb1968 Pod spec.containers{apiserver} Normal Started kubelet, master.rhel.io Started container", "9m 12m 5 apiserver-gfpvz.15657e7f628e9035 Pod spec.containers{apiserver} Normal Created kubelet, master.rhel.io Created container", "9m 9m 1 service-catalog-controller-manager.15657ea7d9f5445b ConfigMap Normal LeaderElection service-catalog-controller-manager controller-manager-k4sfz-external-service-catalog-controller became leader", "6m 6m 1 service-catalog-controller-manager.15657ecd90595381 ConfigMap Normal LeaderElection service-catalog-controller-manager controller-manager-k4sfz-external-service-catalog-controller became leader", "1m 11m 40 apiserver-gfpvz.15657e851658b4f1 Pod spec.containers{apiserver} Warning BackOff kubelet, master.rhel.io Back-off restarting failed container", "1m 11m 47 controller-manager-k4sfz.15657e8230aba33c Pod spec.containers{controller-manager} Warning BackOff kubelet, master.rhel.io Back-off restarting failed container", "1m 1m 1 service-catalog-controller-manager.15657f1508c57123 ConfigMap Normal LeaderElection service-catalog-controller-manager 
controller-manager-k4sfz-external-service-catalog-controller became leader" ] }

TASK [openshift_service_catalog : Get pod logs] *** changed: [master.rhel.io]

TASK [openshift_service_catalog : debug] ** ok: [master.rhel.io] => { "msg": [ "I1109 15:39:12.758703 1 feature_gate.go:194] feature gates: map[OriginatingIdentity:true]", "I1109 15:39:12.758924 1 feature_gate.go:194] feature gates: map[OriginatingIdentity:true NamespacedServiceBroker:true]", "I1109 15:39:12.758971 1 hyperkube.go:192] Service Catalog version v3.11.16;Upstream:v0.1.31 (built 2018-09-26T12:57:39Z)", "W1109 15:39:13.313697 1 util.go:112] OpenAPI spec will not be served", "I1109 15:39:13.316855 1 util.go:182] Admission control plugin names: [NamespaceLifecycle MutatingAdmissionWebhook ValidatingAdmissionWebhook ServicePlanChangeValidator BrokerAuthSarCheck DefaultServicePlan ServiceBindingsLifecycle]", "I1109 15:39:13.317541 1 plugins.go:158] Loaded 6 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,MutatingAdmissionWebhook,ServicePlanChangeValidator,BrokerAuthSarCheck,DefaultServicePlan,ServiceBindingsLifecycle.", "I1109 15:39:13.317561 1 plugins.go:161] Loaded 1 validating admission controller(s) successfully in the following order: ValidatingAdmissionWebhook.", "I1109 15:39:13.320192 1 storage_factory.go:285] storing {servicecatalog.k8s.io clusterservicebrokers} in servicecatalog.k8s.io/v1beta1, reading as servicecatalog.k8s.io/internal from storagebackend.Config{Type:\"\", Prefix:\"/registry\", ServerList:[]string{\"https://master.rhel.io:2379\"}, KeyFile:\"/etc/origin/master/master.etcd-client.key\", CertFile:\"/etc/origin/master/master.etcd-client.crt\", CAFile:\"/etc/origin/master/master.etcd-ca.crt\", Quorum:true, Paging:true, DeserializationCacheSize:0, Codec:runtime.Codec(nil), Transformer:value.Transformer(nil), CompactionInterval:300000000000, CountMetricPollPeriod:60000000000}", "I1109 15:39:13.320252 1 storage_factory.go:285] storing {servicecatalog.k8s.io clusterserviceclasses} in servicecatalog.k8s.io/v1beta1, reading as servicecatalog.k8s.io/__internal from storagebackend.Config{Type:\"\", 
Prefix:\"/registry\", ServerList:[]string{\"https://master.rhel.io:2379\"}, KeyFile:\"/etc/origin/master/master.etcd-client.key\", CertFile:\"/etc/origin/master/master.etcd-client.crt\", CAFile:\"/etc/origin/master/master.etcd-ca.crt\", Quorum:true, Paging:true, DeserializationCacheSize:0, Codec:runtime.Codec(nil), Transformer:value.Transformer(nil), CompactionInterval:300000000000, CountMetricPollPeriod:60000000000}", "I1109 15:39:13.320305 1 storage_factory.go:285] storing {servicecatalog.k8s.io clusterserviceplans} in servicecatalog.k8s.io/v1beta1, reading as servicecatalog.k8s.io/internal from storagebackend.Config{Type:\"\", Prefix:\"/registry\", ServerList:[]string{\"https://master.rhel.io:2379\"}, KeyFile:\"/etc/origin/master/master.etcd-client.key\", CertFile:\"/etc/origin/master/master.etcd-client.crt\", CAFile:\"/etc/origin/master/master.etcd-ca.crt\", Quorum:true, Paging:true, DeserializationCacheSize:0, Codec:runtime.Codec(nil), Transformer:value.Transformer(nil), CompactionInterval:300000000000, CountMetricPollPeriod:60000000000}", "I1109 15:39:13.320350 1 storage_factory.go:285] storing {servicecatalog.k8s.io serviceinstances} in servicecatalog.k8s.io/v1beta1, reading as servicecatalog.k8s.io/__internal from storagebackend.Config{Type:\"\", Prefix:\"/registry\", ServerList:[]string{\"https://master.rhel.io:2379\"}, KeyFile:\"/etc/origin/master/master.etcd-client.key\", CertFile:\"/etc/origin/master/master.etcd-client.crt\", CAFile:\"/etc/origin/master/master.etcd-ca.crt\", Quorum:true, Paging:true, DeserializationCacheSize:0, Codec:runtime.Codec(nil), Transformer:value.Transformer(nil), CompactionInterval:300000000000, CountMetricPollPeriod:60000000000}", "I1109 15:39:13.320460 1 storage_factory.go:285] storing {servicecatalog.k8s.io servicebindings} in servicecatalog.k8s.io/v1beta1, reading as servicecatalog.k8s.io/__internal from storagebackend.Config{Type:\"\", Prefix:\"/registry\", ServerList:[]string{\"https://master.rhel.io:2379\"}, 
KeyFile:\"/etc/origin/master/master.etcd-client.key\", CertFile:\"/etc/origin/master/master.etcd-client.crt\", CAFile:\"/etc/origin/master/master.etcd-ca.crt\", Quorum:true, Paging:true, DeserializationCacheSize:0, Codec:runtime.Codec(nil), Transformer:value.Transformer(nil), CompactionInterval:300000000000, CountMetricPollPeriod:60000000000}", "F1109 15:39:23.322488 1 storage_decorator.go:57] Unable to create storage backend: config (&{ /registry [https://master.rhel.io:2379] /etc/origin/master/master.etcd-client.key /etc/origin/master/master.etcd-client.crt /etc/origin/master/master.etcd-ca.crt true true 0 {0xc420354880 0xc420354900} 5m0s 1m0s}), err (dial tcp 178.21.130.60:2379: getsockopt: connection refused)" ] }`
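The final `F1109` line in the pod log shows the catalog apiserver dialing `178.21.130.60:2379` for `https://master.rhel.io:2379` and getting "connection refused". One thing that may be worth checking is which address `master.rhel.io` resolves to on the master and whether the etcd port accepts connections there. The following is only a minimal sketch of such a check; the helper names (`resolve_ipv4`, `tcp_port_open`, `check_endpoint`) are hypothetical and not part of openshift-ansible, and it uses only the Python standard library.

```python
import socket


def resolve_ipv4(host):
    """Return the sorted IPv4 addresses that `host` resolves to on this machine.

    Returns an empty list if the name cannot be resolved.
    """
    try:
        infos = socket.getaddrinfo(host, None, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


def tcp_port_open(address, port, timeout=3.0):
    """Return True if a TCP connection to (address, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable networks.
        return False


def check_endpoint(host, port):
    """Map each resolved IPv4 address of `host` to whether `port` accepts TCP connections."""
    return {addr: tcp_port_open(addr, port) for addr in resolve_ipv4(host)}
```

For example, running `check_endpoint("master.rhel.io", 2379)` on the master (host and port taken from the log above) returns a dictionary such as `{address: True or False}`. If the resolved address is not one of the node's own interfaces, or the port is reported closed, that would be consistent with the "connection refused" error, though it does not by itself identify the root cause.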

openshift-bot commented 4 years ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-bot commented 4 years ago

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

openshift-bot commented 4 years ago

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen. Mark the issue as fresh by commenting /remove-lifecycle rotten. Exclude this issue from closing again by commenting /lifecycle frozen.

/close

openshift-ci-robot commented 4 years ago

@openshift-bot: Closing this issue.

In response to [this](https://github.com/openshift/openshift-ansible/issues/10653#issuecomment-664727046):

> Rotten issues close after 30d of inactivity.
>
> Reopen the issue by commenting `/reopen`.
> Mark the issue as fresh by commenting `/remove-lifecycle rotten`.
> Exclude this issue from closing again by commenting `/lifecycle frozen`.
>
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.