Closed Moep90 closed 6 years ago
@Moep90 We need to understand why the master failed to start. Take a look at the output of systemctl status origin-master.service and/or journalctl -lu origin-master.service on the master.
I have seen this kind of problem if openshift_cloudprovider_kind doesn't get set.
I've got the same problem as above. Below are my journalctl output and my inventory:
May 21 23:23:08 localhost.localdomain systemd[1]: origin-master.service: main process exited, code=exited, status=255/n/a
May 21 23:23:08 localhost.localdomain systemd[1]: Failed to start Origin Master Service.
May 21 23:23:08 localhost.localdomain systemd[1]: Unit origin-master.service entered failed state.
May 21 23:23:08 localhost.localdomain systemd[1]: origin-master.service failed.
May 21 23:23:13 localhost.localdomain systemd[1]: origin-master.service holdoff time over, scheduling restart.
May 21 23:23:13 localhost.localdomain systemd[1]: Starting Origin Master Service...
May 21 23:23:13 localhost.localdomain origin-master[3420]: W0521 23:23:13.878542 3420 start_master.go:291] Warning: assetConfig.loggingPublicURL: Invalid value: "": required to view aggregated container logs in the console, maste
May 21 23:23:13 localhost.localdomain origin-master[3420]: W0521 23:23:13.878722 3420 start_master.go:291] Warning: assetConfig.metricsPublicURL: Invalid value: "": required to view cluster metrics in the console, master start wi
May 21 23:23:13 localhost.localdomain origin-master[3420]: W0521 23:23:13.878739 3420 start_master.go:291] Warning: auditConfig.auditFilePath: Required value: audit can now be logged to a separate file, master start will continue
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.889494 3420 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.890936 3420 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.892251 3420 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.893476 3420 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.894731 3420 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.896221 3420 admission.go:107] Admission plugin ProjectRequestLimit is not enabled. It will not be started.
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.896243 3420 admission.go:107] Admission plugin openshift.io/RestrictSubjectBindings is not enabled. It will not be started.
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.896254 3420 admission.go:107] Admission plugin PodNodeConstraints is not enabled. It will not be started.
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.896290 3420 admission.go:107] Admission plugin RunOnceDuration is not enabled. It will not be started.
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.896303 3420 admission.go:107] Admission plugin PodNodeConstraints is not enabled. It will not be started.
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.896320 3420 admission.go:107] Admission plugin ClusterResourceOverride is not enabled. It will not be started.
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.898864 3420 admission.go:107] Admission plugin ImagePolicyWebhook is not enabled. It will not be started.
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.898949 3420 admission.go:107] Admission plugin AlwaysPullImages is not enabled. It will not be started.
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.898960 3420 admission.go:107] Admission plugin LimitPodHardAntiAffinityTopology is not enabled. It will not be started.
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.905521 3420 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.906530 3420 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.907709 3420 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.908924 3420 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.909395 3420 plugins.go:94] No cloud provider specified.
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.913317 3420 master_config.go:367] Using the lease endpoint reconciler
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.914291 3420 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.915303 3420 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.915391 3420 start_master.go:410] Starting master on 0.0.0.0:8443 (v1.5.0)
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.915410 3420 start_master.go:411] Public master address is https://openshift.app:8443
May 21 23:23:13 localhost.localdomain origin-master[3420]: I0521 23:23:13.915433 3420 start_master.go:415] Using images from "openshift/origin-:v1.5.0"
May 21 23:23:13 localhost.localdomain origin-master[3420]: E0521 23:23:13.920892 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.User: client: etcd cluster is unavailable or misco
May 21 23:23:13 localhost.localdomain origin-master[3420]: E0521 23:23:13.920976 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.OAuthAccessToken: client: etcd cluster is unavaila
May 21 23:23:13 localhost.localdomain origin-master[3420]: E0521 23:23:13.921070 3420 reflector.go:199] github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/resourcequota/resource_access.go:83: Failed to list
May 21 23:23:13 localhost.localdomain origin-master[3420]: E0521 23:23:13.921152 3420 reflector.go:199] github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/storageclass/default/admission.go:75: Failed to lis
May 21 23:23:13 localhost.localdomain origin-master[3420]: E0521 23:23:13.926106 3420 reflector.go:199] github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/serviceaccount/admission.go:119: Failed to list ap
May 21 23:23:13 localhost.localdomain origin-master[3420]: E0521 23:23:13.926199 3420 reflector.go:199] github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/serviceaccount/admission.go:103: Failed to list ap
May 21 23:23:13 localhost.localdomain origin-master[3420]: E0521 23:23:13.926278 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.Group: client: etcd cluster is unavailable or misc
May 21 23:23:13 localhost.localdomain origin-master[3420]: E0521 23:23:13.926375 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.PolicyBinding: client: etcd cluster is unavailable
May 21 23:23:13 localhost.localdomain origin-master[3420]: E0521 23:23:13.926451 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.Policy: client: etcd cluster is unavailable or mis
May 21 23:23:13 localhost.localdomain origin-master[3420]: E0521 23:23:13.926531 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.ClusterPolicyBinding: client: etcd cluster is unav
May 21 23:23:13 localhost.localdomain origin-master[3420]: E0521 23:23:13.926636 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.ClusterPolicy: client: etcd cluster is unavailable
May 21 23:23:14 localhost.localdomain origin-master[3420]: E0521 23:23:14.923085 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.OAuthAccessToken: client: etcd cluster is unavaila
May 21 23:23:14 localhost.localdomain origin-master[3420]: E0521 23:23:14.923159 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.User: client: etcd cluster is unavailable or misco
May 21 23:23:14 localhost.localdomain origin-master[3420]: E0521 23:23:14.923230 3420 reflector.go:199] github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/resourcequota/resource_access.go:83: Failed to list
May 21 23:23:14 localhost.localdomain origin-master[3420]: E0521 23:23:14.927354 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.Policy: client: etcd cluster is unavailable or mis
May 21 23:23:14 localhost.localdomain origin-master[3420]: E0521 23:23:14.927416 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.PolicyBinding: client: etcd cluster is unavailable
May 21 23:23:14 localhost.localdomain origin-master[3420]: E0521 23:23:14.927468 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.Group: client: etcd cluster is unavailable or misc
May 21 23:23:16 localhost.localdomain origin-master[3420]: E0521 23:23:16.933851 3420 reflector.go:199] github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/serviceaccount/admission.go:119: Failed to list ap
May 21 23:23:16 localhost.localdomain origin-master[3420]: E0521 23:23:16.934687 3420 reflector.go:199] github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/serviceaccount/admission.go:103: Failed to list ap
May 21 23:23:17 localhost.localdomain origin-master[3420]: E0521 23:23:17.933029 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.Group: client: etcd cluster is unavailable or misc
May 21 23:23:17 localhost.localdomain origin-master[3420]: E0521 23:23:17.933141 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.PolicyBinding: client: etcd cluster is unavailable
May 21 23:23:17 localhost.localdomain origin-master[3420]: E0521 23:23:17.933203 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.Policy: client: etcd cluster is unavailable or mis
May 21 23:23:17 localhost.localdomain origin-master[3420]: E0521 23:23:17.933279 3420 reflector.go:199] github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/resourcequota/resource_access.go:83: Failed to list
May 21 23:23:17 localhost.localdomain origin-master[3420]: E0521 23:23:17.933348 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.User: client: etcd cluster is unavailable or misco
May 21 23:23:17 localhost.localdomain origin-master[3420]: E0521 23:23:17.933405 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.OAuthAccessToken: client: etcd cluster is unavaila
May 21 23:23:17 localhost.localdomain origin-master[3420]: E0521 23:23:17.933457 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.ClusterPolicyBinding: client: etcd cluster is unav
May 21 23:23:17 localhost.localdomain origin-master[3420]: E0521 23:23:17.937432 3420 reflector.go:199] github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/serviceaccount/admission.go:119: Failed to list ap
May 21 23:23:17 localhost.localdomain origin-master[3420]: E0521 23:23:17.937517 3420 reflector.go:199] github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/storageclass/default/admission.go:75: Failed to lis
May 21 23:23:17 localhost.localdomain origin-master[3420]: E0521 23:23:17.937582 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.ClusterPolicy: client: etcd cluster is unavailable
May 21 23:23:17 localhost.localdomain origin-master[3420]: E0521 23:23:17.937661 3420 reflector.go:199] github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/serviceaccount/admission.go:103: Failed to list ap
May 21 23:23:18 localhost.localdomain origin-master[3420]: E0521 23:23:18.936138 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.OAuthAccessToken: client: etcd cluster is unavaila
May 21 23:23:18 localhost.localdomain origin-master[3420]: E0521 23:23:18.936223 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.User: client: etcd cluster is unavailable or misco
May 21 23:23:18 localhost.localdomain origin-master[3420]: E0521 23:23:18.936293 3420 reflector.go:199] github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/resourcequota/resource_access.go:83: Failed to list
May 21 23:23:18 localhost.localdomain origin-master[3420]: E0521 23:23:18.936347 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.Policy: client: etcd cluster is unavailable or mis
May 21 23:23:18 localhost.localdomain origin-master[3420]: E0521 23:23:18.936405 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.PolicyBinding: client: etcd cluster is unavailable
May 21 23:23:18 localhost.localdomain origin-master[3420]: E0521 23:23:18.936455 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.Group: client: etcd cluster is unavailable or misc
May 21 23:23:18 localhost.localdomain origin-master[3420]: E0521 23:23:18.936505 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.ClusterPolicyBinding: client: etcd cluster is unav
May 21 23:23:18 localhost.localdomain origin-master[3420]: E0521 23:23:18.939525 3420 reflector.go:199] github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/serviceaccount/admission.go:119: Failed to list ap
May 21 23:23:18 localhost.localdomain origin-master[3420]: E0521 23:23:18.939590 3420 cacher.go:260] unexpected ListAndWatch error: pkg/storage/cacher.go:201: Failed to list api.ClusterPolicy: client: etcd cluster is unavailable
May 21 23:23:18 localhost.localdomain origin-master[3420]: E0521 23:23:18.940850 3420 reflector.go:199] github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/storageclass/default/admission.go:75: Failed to lis
May 21 23:23:18 localhost.localdomain origin-master[3420]: E0521 23:23:18.940942 3420 reflector.go:199] github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/serviceaccount/admission.go:103: Failed to list ap
May 21 23:23:19 localhost.localdomain origin-master[3420]: F0521 23:23:19.408706 3420 start_master.go:112] could not reach etcd: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 10.0.2.15:2379: getsockopt:
May 21 23:23:19 localhost.localdomain systemd[1]: origin-master.service: main process exited, code=exited, status=255/n/a
May 21 23:23:19 localhost.localdomain systemd[1]: Failed to start Origin Master Service.
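In a dump this long, the entry that actually ends the process is easy to miss. The origin-master lines use glog-style prefixes (I, W, E or F, then MMDD, a time, a PID and file:line]), so a small filter can pull out just the F (fatal) entries. This is only a rough sketch: the regex is my own reading of the prefix in the lines above, and the names `GLOG_PREFIX`, `Entry` and `fatal_entries` are made up for illustration.

```python
import re
from typing import Iterable, List, NamedTuple

# Assumed glog-style prefix, as it appears in the journal lines above:
#   F0521 23:23:19.408706 3420 start_master.go:112] message...
GLOG_PREFIX = re.compile(
    r"(?<!\w)(?P<sev>[IWEF])(?P<date>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+(?P<pid>\d+) "
    r"(?P<src>[\w./-]+:\d+)\] (?P<msg>.*)"
)


class Entry(NamedTuple):
    severity: str  # one of I, W, E, F
    source: str    # file:line that logged the message
    message: str


def fatal_entries(lines: Iterable[str]) -> List[Entry]:
    """Return only the fatal (F) glog entries found in journal lines."""
    found = []
    for line in lines:
        match = GLOG_PREFIX.search(line)
        # systemd's own lines have no glog prefix and are skipped.
        if match and match.group("sev") == "F":
            found.append(
                Entry(match.group("sev"), match.group("src"), match.group("msg"))
            )
    return found
```

Fed the journal output above (one entry per line), this should leave only the start_master.go:112 "could not reach etcd" entry, which points at the etcd endpoint on port 2379 rather than at the master itself.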
and here is my inventory file
[OSEv3:children]
masters
nodes
etcd

[OSEv3:vars]
ansible_ssh_user=vagrant
ansible_become=yes
openshift_deployment_type=origin
openshift_public_hostname=openshift.app
openshift_master_default_subdomain=console.openshift.app

[masters]
openshift.app

[etcd]
192.168.1.5

[nodes]
192.168.1.6
192.168.1.7
Not sure if anyone else has faced this issue again or found a resolution. I am hitting it intermittently: sometimes the Ansible scripts work fine, and other times they fail. The symptom is that the origin-master Docker container fails, and when I look at its log I see this:
2017-05-25T01:07:55.865806000Z E0525 01:07:55.864022 1 reflector.go:203] github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/limitranger/admission.go:154: Failed to list *api.LimitRange: Get https://master.org.com:8443/api/v1/limitranges?resourceVersion=0: dial tcp 10.200.1.68:8443: getsockopt: connection refused
2017-05-25T01:07:55.866014000Z E0525 01:07:55.865541 1 cacher.go:254] unexpected ListAndWatch error: pkg/storage/cacher.go:194: Failed to list *api.User: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 10.200.1.68:2379: getsockopt: connection refused
2017-05-25T01:07:55.979721000Z F0525 01:07:55.979072 1 start_master.go:108] could not reach etcd: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 10.200.1.68:2379: getsockopt: connection refused
But I guess the underlying problem is with etcd, as the log for etcd is completely blank, which indicates some sort of problem. So I completely removed the etcd container, and now it starts showing some activity. But the etcd container is still bound to 127.0.0.1, whereas the OpenShift container tries to reach etcd at the IP address of the host (single master + etcd and 2 nodes topology).
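One way to confirm the bind-address mismatch described above is to try a plain TCP connection to etcd's client port (2379) on both the loopback address and the host IP the master dials. The following is a minimal sketch, not part of the installer: `can_connect` and `compare_endpoints` are hypothetical helper names, and 10.200.1.68 is simply the address taken from the log above.

```python
import socket
from typing import Dict, Iterable


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unreachable hosts.
        return False


def compare_endpoints(hosts: Iterable[str], port: int = 2379) -> Dict[str, bool]:
    """Check the same port on several addresses and report each result."""
    return {host: can_connect(host, port) for host in hosts}


if __name__ == "__main__":
    # Run on the master/etcd host. The addresses are examples from this thread.
    for host, ok in compare_endpoints(["127.0.0.1", "10.200.1.68"]).items():
        print(f"{host}:2379 reachable: {ok}")
```

If 127.0.0.1 is reachable while the host IP is not, that would be consistent with etcd listening only on loopback while the master is configured to use the host address. This only tests the TCP connection, not TLS or etcd health.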
@vishal-biyani Thank you for sharing your solution. I had the same problem with 3.6.0; after removing etcd, the problem is gone. Regards
Description
Single Master - Multi Node